lacava closed 4 years ago
The UCI link gives the pre-processed data. I think it would be more straightforward to just cite this link from OpenML.
I can't save my changes to your Colab notebook, so I made a copy here to verify that they're the same.
We also need to remove the `instance` column because it's a row identifier, as mentioned in #19 (see profiling report). I can help with that.
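For reference, a minimal sketch of what dropping the identifier column could look like with pandas. The toy frame and the `p-50` and `target` column names are stand-ins I made up for illustration, not the actual dataset; only the `instance` column name comes from the discussion above.

```python
import pandas as pd


def drop_row_identifier(df: pd.DataFrame, column: str = "instance") -> pd.DataFrame:
    """Return a copy of df without the row-identifier column.

    errors="ignore" makes this a no-op if the column was already removed.
    """
    return df.drop(columns=[column], errors="ignore")


# Toy stand-in data (hypothetical values, not the real dataset).
toy = pd.DataFrame(
    {
        "instance": ["S10", "AMPC", "AROH"],
        "p-50": ["t", "t", "g"],
        "target": [1, 1, 0],
    }
)

cleaned = drop_row_identifier(toy)
print(list(cleaned.columns))
```

The function returns a new frame rather than modifying in place, so the original can still be compared against the source while verifying.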
This (unfinished!) PR removes the promoters dataset because it is a duplicate of molecular_biology_promoters (issue #19). I also began adding metadata to molecular_biology_promoters, but I have not yet gotten the source data to exactly match our version. It also appears that the class labels are reversed, which we might want to fix.
Here is a Colab notebook where I'm working on source verification.
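A rough sketch of the kind of check that could help with the verification: compare the two tables while ignoring row order, then test whether flipping the labels makes them match. This is not what the notebook does. The helper names, the `target` column name, the 0/1 label encoding, and the toy data are all assumptions for illustration.

```python
import pandas as pd


def same_rows(a: pd.DataFrame, b: pd.DataFrame) -> bool:
    """True if a and b have the same columns and the same rows, ignoring row order."""
    if sorted(a.columns) != sorted(b.columns) or len(a) != len(b):
        return False
    cols = sorted(a.columns)
    a_sorted = a[cols].sort_values(cols).reset_index(drop=True)
    b_sorted = b[cols].sort_values(cols).reset_index(drop=True)
    return a_sorted.equals(b_sorted)


def labels_reversed(source: pd.DataFrame, ours: pd.DataFrame, target: str = "target") -> bool:
    """True if the tables differ as-is but match once our 0/1 labels are flipped."""
    flipped = ours.copy()
    flipped[target] = 1 - flipped[target]  # assumes binary labels encoded as 0/1
    return (not same_rows(source, ours)) and same_rows(source, flipped)


# Toy stand-in data (hypothetical values, not the real dataset).
source = pd.DataFrame({"p-50": ["t", "g", "a"], "target": [1, 0, 1]})
ours = pd.DataFrame({"p-50": ["a", "t", "g"], "target": [0, 0, 1]})

print(same_rows(source, ours))
print(labels_reversed(source, ours))
```

Note that `DataFrame.equals` also compares dtypes, so a feature stored as strings in one table and as categories in the other would count as a mismatch and would need to be cast first.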