Added fn to select data loader (instead of Mnist hardcoded in train.py)

victoresque / pytorch-template

PyTorch deep learning projects made easy.

MIT License

4.7k stars 1.08k forks source link

Added fn to select data loader (instead of Mnist hardcoded in train.py) #15

Closed borgesa closed 5 years ago

borgesa commented 6 years ago

Hi,

Created initial proposal for selecting data loader in configuration file.

Overview:

User can select data loader in configuration file
New data loaders are (for now*) written in 'data_loader.data_loaders.py'
Currently, the function 'get_data_loader' needs to be manually when adding new loaders (in addition to writing the loader class)

I propose this as a first step that we can build further on**.

Comment: *: I guess we over time can implement functionality to load loaders from new files inside the 'data_loader' module (if someone prefers to create 'data_loader/my_new_loader.py') **: I plan to implement another data loader that potentially can work as another example in the template. Will post when this is done.

victoresque commented 6 years ago

Hi,

Nice work! These are some good improvements to have, I'll merge this PR when it's done. I wonder if you could also update the "Data Loader" part in README.md with your updated version? Thanks

borgesa commented 6 years ago

Hi @victoresque ,

Sure, I can update the README. I guess the three bullets I wrote above are more or less sufficient?

Additional question: I began looking at adding a data loader and realized that I actually do not need a data loader, but only a dataset class. I may use the torch built in data loader (at least as a start).

Where are custom dataset classes intended to be placed? Currently the folder 'datasets' is intended to contain actual data...

Two alternatives I see: 1) Place both datasets and loaders into 'dataloader' folder 2) Make the standard folder for data a new folder 'data', then use the 'datasets' folder the place to place custom dataset classes

What do you think?

victoresque commented 6 years ago

Hi,

I prefer the second alternative, since the name data is more general than datasets

If data loaders are not needed to be re-implemented (use torch built-in), maybe you can put datasets and data_loader into a folder data_utils?

Then the folder structure would be somewhat similar to:

 ...
  ├── data_utils/ - data utilities
  │   ├── data_loaders.py
  │   ├── datasets.py
  │   └── your_custom_dataset.py
  ├── data/ - default data folder
  ...

or even simpler:

 ...
  ├── data_utils/ - data utilities
  │   ├── data_utils.py
  │   └── your_custom_dataset.py
  ├── data/ - default data folder
  ...

You can try to figure out the better way among these options, thanks

borgesa commented 5 years ago

Hi @victoresque and @SunQpark ,

If you merge this pull request, I can do the requested update of the README.md separately (easier workflow for me, as I merged from my master branch).

Hope this is ok with you.

SunQpark commented 5 years ago

OK, I'll do that

borgesa commented 5 years ago

Thank you @SunQpark. Next pull request sent (updating loss.py).