fvisin / dataset_loaders

A collection of dataset loaders
GNU General Public License v3.0
195 stars, 63 forks

Add implementations for data loaders in popular frameworks #14

Open ieee8023 opened 7 years ago

ieee8023 commented 7 years ago

It would be great if data loaders were implemented for popular frameworks, so that all the datasets here could be accessed through a provided adaptor. Instead of wrapping this data loader myself, I could just call a function that generates a data loader for a given dataset in a given framework. For example, these data loaders:

MXNet: http://mxnet.io/api/python/io.html
PyTorch: http://pytorch.org/docs/torchvision/datasets.html
TensorFlow: https://www.tensorflow.org/extend/new_data_formats
Fuel: https://fuel.readthedocs.io/en/latest/h5py_dataset.html
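To make the request concrete, here is a rough sketch of what such an adaptor might look like. It is only an illustration under assumptions: `toy_batches` is a hypothetical stand-in for one of the loaders in this repository (whose actual interface is not shown here), and `MapStyleAdaptor` is an invented name. The adaptor exposes only `__len__` and `__getitem__`, which is the protocol PyTorch's map-style datasets follow; other frameworks would need their own thin wrapper.

```python
from typing import Any, Callable, Iterator, List, Tuple

# A batch is assumed to be a pair (inputs, targets) of equal-length lists.
Batch = Tuple[List[Any], List[Any]]
Sample = Tuple[Any, Any]


def toy_batches() -> Iterator[Batch]:
    """Hypothetical stand-in for a dataset loader that yields batches."""
    yield [[0.0, 0.1], [1.0, 1.1]], [0, 1]
    yield [[2.0, 2.1]], [0]


class MapStyleAdaptor:
    """Expose a batch iterator as a sized, indexable collection of samples.

    Assumption: the whole dataset fits in memory, because every batch is
    unpacked into individual (input, target) pairs up front.
    """

    def __init__(self, batch_source: Callable[[], Iterator[Batch]]) -> None:
        self._samples: List[Sample] = []
        for inputs, targets in batch_source():
            # Pair each input with its target and store them one by one.
            self._samples.extend(zip(inputs, targets))

    def __len__(self) -> int:
        return len(self._samples)

    def __getitem__(self, index: int) -> Sample:
        return self._samples[index]


if __name__ == "__main__":
    dataset = MapStyleAdaptor(toy_batches)
    for i in range(len(dataset)):
        print(i, dataset[i])
```

An object like this could then be handed to a framework's own batching and shuffling machinery instead of being wrapped by hand for each dataset. A streaming variant would be needed for datasets that do not fit in memory.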

fvisin commented 7 years ago

Hey Joseph! I agree it would be a great addition. Unfortunately, I don't have much time these days to work on this kind of thing, but contributions in this direction are of course more than welcome. If someone is willing to give it a try, I can definitely join the design phase, and I can probably also find some time to help with part of the implementation.

As for TensorFlow, I wrote a "main loop" that feeds your model (which you can easily define in a separate project) with the data coming from the dataset loaders and takes care of some of the usual chores (early stopping, logging, loss/gradients/samples in TensorBoard, etc.). It is extensively parametrized from the command line and, as a bonus, supports multi-GPU training: https://github.com/fvisin/main_loop_tf/
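For readers unfamiliar with the idea, the early-stopping part of such a loop can be sketched roughly as follows. This is a generic, framework-free illustration and not the code in main_loop_tf: the function name `run_main_loop` and its parameters (`train_epoch`, `validate`, `max_epochs`, `patience`) are all hypothetical.

```python
from typing import Callable, Tuple


def run_main_loop(
    train_epoch: Callable[[], None],
    validate: Callable[[], float],
    max_epochs: int,
    patience: int,
) -> Tuple[int, float]:
    """Train until the validation loss stops improving.

    Returns the index of the best epoch and its validation loss.
    Training stops once `patience` consecutive epochs bring no improvement.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        train_epoch()          # one pass over the training data
        val_loss = validate()  # loss on held-out data

        if val_loss < best_loss:
            best_loss, best_epoch = val_loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # early stopping

    return best_epoch, best_loss


if __name__ == "__main__":
    # Scripted validation losses stand in for a real model.
    losses = iter([1.0, 0.8, 0.9, 0.95, 0.7])
    print(run_main_loop(lambda: None, lambda: next(losses), max_epochs=10, patience=2))
```

In a real setup, `train_epoch` would pull batches from a dataset loader and update the model, and logging or checkpointing would hook in at the point where a new best loss is recorded.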