Open dwf opened 9 years ago
I guess we can mention it in the README. I'm not sure what the best way of packaging the scripts is actually. I named them __init__.py
so that they become packages that are importable (important for testing), and because the alternative is to have a very redundant reverse_words/reverse_words.py
, which would mean importing from reverse_words.reverse_words
.
A related issue I ran into: Because of Python 3's import change, the line from .dataset import MarkovChainDataset
doesn't work if you're trying to execute markov_chain/__init__.py
(or whatever file in that directory) as a script, because you can only do relative imports in packages. This mean that the script now needs to be run with python -m markov_chain
instead of python markov_chain/__init__.py
.
For the stuff that's self-contained in one file, why not just foo.py, no folders? And for the stuff that isn't, can it be? Or else can we could include a "run.py" script with absolute imports in it, making clear that the folder needs to be on your python path.
Folders mostly because I figured we could add a README file to each with some details of what the model does, setup/commits needed to make it run, etc. GitHub nicely renders these README files then. Many scripts also produce saved models and such, and this way they are saved to a specific folder as well, so that the top folder doesn't become a mess (or different examples start overwriting each other's models).
Most of the examples we currently have could potentially be put in a single file, but there will probably be many that can't e.g. the machine translation code is supposed to be added soon, and that's way too big to put in a single file.
Requiring the folder to be put on the path with run.py
is an option, but then I actually prefer the package approach i.e. adding a __main__.py
so that people can just run python -m example
. That seems slightly cleaner.
+1 for a folder with README for each experiment
Another thing why big files do not cut it: for models to be unpicklable, nothing should be in __main__
namespace. On the other hand, for models to be picklable, everything should be in a global namespace. Therefore, each non-trivial project needs two files: one with all the components and another one, a runnable script.
So what do you guys think about __init__.py
for components and run.py
for running?
why not have humane names, and have __init__.py
import the relevant files.
so mnist/__init__.py
would become mnist/mnist.py
and mnist/__init__.py
would be a single liner containing from .mnist import *
. Would not that cut it ?
Honestly I do not see that would be a big improvement over the __init__.py
/__main__.py
breakdown we have now.
I'm worried that people will be confused by all the example scripts being called
__init__.py
, and think this is required or something. Maybe we should mention this in the readme or something?