tensorwerk / stockroom

Version control for software 2.0
https://tensorwerk.com
Apache License 2.0

script for loading pytorch datasets straight into stockroom #17

Closed jjmachan closed 4 years ago

jjmachan commented 4 years ago

Description

Why is this change required? What problem does it solve? Describe your changes in detail:

This adds a script that helps import PyTorch datasets into stockroom. It currently supports the following datasets:

  1. MNIST
  2. Fashion-MNIST
  3. CIFAR-10

Usage

To run the script, change to the directory in which head.stock is initialized and call the script from there.

The script takes the following arguments:

  1. [dataset name]
  2. --root : this is the current path.
  3. --dataset-path : if the dataset has already been downloaded you can give its path here. Make sure it was downloaded using the PyTorch dataset object.
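
The argument list above could be parsed roughly as follows. This is a minimal sketch using the standard library's argparse, not the script from this PR: the function name `build_parser`, the accepted spellings of the dataset names, and the default values are all assumptions made for illustration.

```python
import argparse
from pathlib import Path

# Dataset names come from the list in the description; the exact spelling
# the real script accepts is an assumption.
SUPPORTED = ("mnist", "fashion-mnist", "cifar-10")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: <dataset name> [--root PATH] [--dataset-path PATH]."""
    parser = argparse.ArgumentParser(
        description="Import a PyTorch dataset into stockroom (illustrative sketch)"
    )
    parser.add_argument("dataset", choices=SUPPORTED, help="dataset to import")
    # Assumed default: the directory the script is called from.
    parser.add_argument("--root", type=Path, default=Path.cwd(),
                        help="path of the stockroom repository")
    # Assumed default: None, meaning no previously downloaded copy was given.
    parser.add_argument("--dataset-path", type=Path, default=None,
                        help="location of an already downloaded dataset")
    return parser


# Parse an explicit argument list instead of sys.argv to show the result.
args = build_parser().parse_args(["cifar-10", "--dataset-path", "data/cifar"])
print(args.dataset, args.root, args.dataset_path)
```

Restricting the positional argument with `choices` makes an unsupported dataset name fail at parse time rather than partway through an import.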

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

CLAassistant commented 4 years ago

CLA assistant check
All committers have signed the CLA.

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging 904904e1d291dbe37c1f104c0c93869c5c4d7132 into 752a6bee370b0e84690650485815950f6f8f5b40 - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging 2d72fddc4161b39c774bd1b7064c8906ec675389 into 3466a5ee473c39e03eb87ed0733361f1ed680f4b - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging fc2a8d2f3310575c17a49b39e5e3a91ac2169207 into 3466a5ee473c39e03eb87ed0733361f1ed680f4b - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging 9bafc40ad784c2ca40bde117f4621c46fa02c5b2 into 3466a5ee473c39e03eb87ed0733361f1ed680f4b - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 2 alerts when merging 93ed9e3555494e59fefa5704cf6a19c1dcd92c2f into 3466a5ee473c39e03eb87ed0733361f1ed680f4b - view on LGTM.com

new alerts:

hhsecond commented 4 years ago

Documenting what @jjmachan and I discussed offline: I think it's better to make this script part of the plugin system, which can generalize the addition of any dataset (for example an ImageFolder), similar to hangar.external but with a few of the datasets prebuilt. In fact, we can use this idea in hangar by writing new plugins for popular datasets. What do you think, @rlizzo?

jjmachan commented 4 years ago

What do you guys think of stock import <plugin name> <plug-in args>?

So we will have commands like:

  1. stock import csv --file path/to/file
  2. stock import imagefolder path/to/file
  3. stock import torchvision.MNIST -d (download flag)
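
The proposed command shape can be sketched with nested subcommands. This is only an illustration built on the standard library's argparse, not stockroom's actual CLI: the function name `build_cli` and the destination names `command`, `source`, and `download` are assumptions, and only the three example commands from the comment above are modeled.

```python
import argparse


def build_cli() -> argparse.ArgumentParser:
    """Model `stock import <plugin name> <plug-in args>` with nested subparsers."""
    cli = argparse.ArgumentParser(prog="stock")
    commands = cli.add_subparsers(dest="command", required=True)

    # `stock import ...` groups every dataset source under one verb.
    importer = commands.add_parser("import", help="import a dataset")
    sources = importer.add_subparsers(dest="source", required=True)

    # stock import csv --file path/to/file
    csv_source = sources.add_parser("csv")
    csv_source.add_argument("--file", required=True)

    # stock import imagefolder path/to/file
    folder_source = sources.add_parser("imagefolder")
    folder_source.add_argument("path")

    # stock import torchvision.MNIST -d
    mnist_source = sources.add_parser("torchvision.MNIST")
    mnist_source.add_argument("-d", dest="download", action="store_true",
                              help="download the dataset first")
    return cli


# Parse an explicit argument list instead of sys.argv to show the result.
ns = build_cli().parse_args(["import", "torchvision.MNIST", "-d"])
print(ns.command, ns.source, ns.download)
```

Each source gets its own argument set, so a new source only needs one more `add_parser` call.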

rlizzo commented 4 years ago

> Documenting what @jjmachan and I discussed offline: I think it's better to make this script part of the plugin system, which can generalize the addition of any dataset (for example an ImageFolder), similar to hangar.external but with a few of the datasets prebuilt. In fact, we can use this idea in hangar by writing new plugins for popular datasets. What do you think, @rlizzo?

I think this is probably the right idea in the long term, but I wouldn't focus on any of that development right now. I only say that because the hangar plugin system was a huge task to build, and has sat basically unused since.

I would treat this like an independent package which is mixed into the stockroom core namespace for the time being. I think that'll let us build something which is both clean and performant while not complicating things with issues around installation (entry point hooks / externally loading modules).
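
One way to read this suggestion is a plain in-package registry: importers live inside the package and register themselves when their module is imported, so no entry point hooks or external module loading are involved. The sketch below is hypothetical throughout; `IMPORTERS`, `register`, `import_csv`, and `run_import` are illustrative names, not part of stockroom or hangar.

```python
from typing import Callable, Dict

# Hypothetical registry that maps an importer name to the function
# implementing it. It is filled at import time by the decorator below.
IMPORTERS: Dict[str, Callable[..., str]] = {}


def register(name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Return a decorator that records a function under `name`."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        IMPORTERS[name] = func
        return func
    return decorator


@register("csv")
def import_csv(path: str) -> str:
    # Placeholder body: a real importer would write the data into the repository.
    return f"would import CSV from {path}"


def run_import(name: str, *args: str) -> str:
    """Look up an importer by name and call it with the given arguments."""
    try:
        importer = IMPORTERS[name]
    except KeyError:
        raise ValueError(f"unknown importer: {name!r}") from None
    return importer(*args)


print(run_import("csv", "path/to/file"))
```

Because everything ships in one package, installation stays a single step, and the registry could later be replaced by a plugin mechanism without changing how importers are called.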

hhsecond commented 4 years ago

@rlizzo agreed. Let's do that.

hhsecond commented 4 years ago

> What do you guys think of stock import <plugin name> <plug-in args>?
>
> So we will have commands like:
>
>   1. stock import csv --file path/to/file
>   2. stock import imagefolder path/to/file
>   3. stock import torchvision.MNIST -d (download flag)

The import command seems nice, but as Rick suggested above, let's not implement it through plugins?

lgtm-com[bot] commented 4 years ago

This pull request introduces 3 alerts when merging dce043deffe6cc5f4a21cfdb261e47829f94ceae into 774f345b36330e8746e767712baa60befe600cc6 - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 1 alert when merging 4479a4d374b68dab4311ff80a9a9101b19a9cb1d into dcc9fae6c5537646ccead1aa18d5ec459c3c48c5 - view on LGTM.com

new alerts:

lgtm-com[bot] commented 4 years ago

This pull request introduces 1 alert when merging 087dcfe5a57df6bd3dcee23fad8e8e468a5274fc into dcc9fae6c5537646ccead1aa18d5ec459c3c48c5 - view on LGTM.com

new alerts: