openml / openml-data

For tracking issues related to OpenML datasets

Non-existent column in ignore_attribute in DSD #41

Open mwever opened 3 years ago

mwever commented 3 years ago

The dataset okcupid-stem (ID 42734) lists a column in its oml:ignore_attribute metadata that does not exist in the dataset. Specifically, it says to ignore the column "last_login", but as can be seen at https://www.openml.org/d/42734, there is no such column.
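For anyone who wants to check other datasets for the same problem, here is a minimal sketch. The helper name `missing_ignore_attributes` and the column names `col_a` and `col_b` are made up for illustration, and the handling of a comma-separated string is an assumption about how the field may be stored. The commented-out part assumes openml-python exposes `ignore_attribute` and `features` on the dataset object.

```python
from typing import Iterable, List, Optional, Union


def missing_ignore_attributes(
    ignore_attribute: Optional[Union[str, Iterable[str]]],
    column_names: Iterable[str],
) -> List[str]:
    """Return the ignore_attribute entries that name no existing column."""
    if ignore_attribute is None:
        return []
    if isinstance(ignore_attribute, str):
        # Assumption: a plain string holds comma-separated column names.
        ignore_attribute = ignore_attribute.split(",")
    existing = set(column_names)
    missing = []
    for name in ignore_attribute:
        name = name.strip()
        if name and name not in existing:
            missing.append(name)
    return missing


# Placeholder column names, not the real columns of dataset 42734.
print(missing_ignore_attributes(["last_login"], ["col_a", "col_b"]))

# Possible use with openml-python (needs network access, not run here):
# import openml
# dataset = openml.datasets.get_dataset(42734)
# columns = [feature.name for feature in dataset.features.values()]
# print(missing_ignore_attributes(dataset.ignore_attribute, columns))
```

An empty result would mean every ignored column exists in the dataset.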

PGijsbers commented 3 years ago

Looks like it's a leftover from the previous iteration. We currently can't remove ignore_attribute fields through the API if there's already a task coupled to the dataset.

@joaquinvanschoren should we consider allowing changes like these to a dataset if none of its tasks have any recorded runs? Someone could have local runs for such a task (potentially to be uploaded later), but I think that is an edge case we should not support, in favor of avoiding the large number of duplicate datasets we would otherwise create (I recently had another dataset with a similar issue).
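To make the proposed rule concrete, here is a rough sketch, not the server's actual logic. The function name `can_edit_dataset`, the task IDs, and the run counts are all hypothetical. It only sees runs recorded on the server, so local runs that have not been uploaded are invisible to it, which is exactly the edge case described above.

```python
from typing import Mapping


def can_edit_dataset(run_counts_by_task: Mapping[int, int]) -> bool:
    """Allow edits only if no task coupled to the dataset has recorded runs.

    Keys are task IDs, values are the number of runs recorded for that task.
    A dataset with no tasks at all is also editable.
    """
    return all(count == 0 for count in run_counts_by_task.values())


# Hypothetical task IDs and run counts, for illustration only.
print(can_edit_dataset({101: 0, 102: 0}))  # tasks exist, but no recorded runs
print(can_edit_dataset({101: 0, 102: 3}))  # task 102 has recorded runs
```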