mila-iqia / blocks-examples

Examples and scripts using Blocks
MIT License
147 stars 94 forks source link

Arrangement of files in the examples (__init__.py, __main__.py?) #1

Open dwf opened 9 years ago

dwf commented 9 years ago

I'm worried that people will be confused by all the example scripts being called __init__.py, and think this is required or something. Maybe we should mention this in the readme or something?

bartvm commented 9 years ago

I guess we can mention it in the README. I'm not sure what the best way of packaging the scripts is actually. I named them __init__.py so that they become packages that are importable (important for testing), and because the alternative is to have a very redundant reverse_words/reverse_words.py, which would mean importing from reverse_words.reverse_words.

A related issue I ran into: Because of Python 3's import change, the line from .dataset import MarkovChainDataset doesn't work if you're trying to execute markov_chain/__init__.py (or whatever file in that directory) as a script, because you can only do relative imports in packages. This mean that the script now needs to be run with python -m markov_chain instead of python markov_chain/__init__.py.

dwf commented 9 years ago

For the stuff that's self-contained in one file, why not just foo.py, no folders? And for the stuff that isn't, can it be? Or else can we could include a "run.py" script with absolute imports in it, making clear that the folder needs to be on your python path.

bartvm commented 9 years ago

Folders mostly because I figured we could add a README file to each with some details of what the model does, setup/commits needed to make it run, etc. GitHub nicely renders these README files then. Many scripts also produce saved models and such, and this way they are saved to a specific folder as well, so that the top folder doesn't become a mess (or different examples start overwriting each other's models).

Most of the examples we currently have could potentially be put in a single file, but there will probably be many that can't e.g. the machine translation code is supposed to be added soon, and that's way too big to put in a single file.

Requiring the folder to be put on the path with run.py is an option, but then I actually prefer the package approach i.e. adding a __main__.py so that people can just run python -m example. That seems slightly cleaner.

rizar commented 9 years ago

+1 for a folder with README for each experiment

Another thing why big files do not cut it: for models to be unpicklable, nothing should be in __main__ namespace. On the other hand, for models to be picklable, everything should be in a global namespace. Therefore, each non-trivial project needs two files: one with all the components and another one, a runnable script.

So what do you guys think about __init__.py for components and run.py for running?

rodrigob commented 9 years ago

why not have humane names, and have __init__.py import the relevant files. so mnist/__init__.py would become mnist/mnist.py and mnist/__init__.py would be a single liner containing from .mnist import *. Would not that cut it ?

rizar commented 9 years ago

Honestly I do not see that would be a big improvement over the __init__.py/__main__.py breakdown we have now.