Closed neeleshdodda44 closed 6 years ago
@georgymh Incorporated the changes you asked for. Couple things:
If you can modify the 2 MNIST CSVs accordingly
Since you didn't explicitly tell me more information, I assumed we were talking about the MNIST dataset, in which case I followed the schema here: https://www.kaggle.com/c/digit-recognizer/data Let me know if this is not the case.
Also, where in the wiki would you like me to make the note about the CSVs?
Yes, that's perfect. Feel free to "squash and merge" whenever you're ready.
For the wiki, you can put it under the "Dataset" section in Software Engineering > Products > Data Provider Unix Service.
Two major changes:
- Labeler is now an integer. So the `_create_dataset_iterator` method in `iterators.py` only accepts integers in the `labeler` argument. I changed the tests for runner so that the `make_train_job` and `make_validate_job` methods made jobs whose labelers were 0.
- Cleaned up `_create_dataset_iterator` so that it uses dataframes in processing the data. Hopefully it's a bit easier to read now?
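To illustrate the two changes above, here is a minimal sketch of what an integer labeler combined with dataframe-based processing could look like. It is not the code from this PR: the function name `create_dataset_iterator_sketch`, its parameters, and the batching behavior are hypothetical, and it assumes the integer labeler is the column index of the label (index 0 in the Kaggle MNIST `train.csv` schema, where the first column is `label`).

```python
import io

import pandas as pd


def create_dataset_iterator_sketch(csv_source, labeler, batch_size=2):
    """Yield (features, labels) batches from a CSV with a header row.

    Hypothetical helper: `labeler` is assumed to be the integer index
    of the label column. Note that this is a generator, so the checks
    below run when the first batch is requested.
    """
    # Reject non-integer labelers (bool is a subclass of int, so exclude it).
    if isinstance(labeler, bool) or not isinstance(labeler, int):
        raise TypeError("labeler must be an integer column index")

    df = pd.read_csv(csv_source)
    if not 0 <= labeler < df.shape[1]:
        raise IndexError("labeler is out of range for this CSV")

    # Split the dataframe into the label column and the remaining columns.
    labels = df.iloc[:, labeler]
    features = df.drop(columns=df.columns[labeler])

    # Walk the rows in fixed-size batches; the last batch may be smaller.
    for start in range(0, len(df), batch_size):
        stop = start + batch_size
        yield (
            features.iloc[start:stop].to_numpy(),
            labels.iloc[start:stop].to_numpy(),
        )


# Tiny stand-in for an MNIST-style CSV (label first, then pixel columns).
csv_text = "label,pixel0,pixel1\n3,0,255\n7,12,0\n1,5,5\n"
batches = list(
    create_dataset_iterator_sketch(io.StringIO(csv_text), labeler=0)
)
```

With `labeler=0`, the `label` column is split off from the pixel columns, which matches the jobs built in the runner tests where the labeler is 0.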
Suggestions:
Hopefully this is pretty straightforward and this can get merged quickly.