Closed TimDettmers closed 6 years ago
See also the make list
command to list all models. This is more immediate so that users can use the same procedure, make file calls, to (1) find out what they can run, (2) do data download and preprocessing, (3) run simple models. I think this would be a good way to go.
Merging #309 into master will not change coverage. The diff coverage is
n/a
.
@@ Coverage Diff @@
## master #309 +/- ##
=======================================
Coverage 16.73% 16.73%
=======================================
Files 72 72
Lines 4254 4254
=======================================
Hits 712 712
Misses 3542 3542
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 40fbde5...f303853. Read the comment docs.
@TimDettmers I like this PR after addressing the minor issues I mentioned and fixing some of the problems you had (e.g., fix download scripts). Is there anything else that is blocking this PR? I think we should get that in soon.
I updated readme and the download scripts, should be better now
If we want to use the Makefile
as "documentation" is may make sense to include both types of invocation, that is via config and via command line parameters. What do you think?
I will address the changes later today or tomorrow.
Whatever you think is better. Personally, I think the using task specific configs with minor overwrites on the command line are nice. That means that the task config gives you a reasonable training setup (hyperparameters) such that you only need to define the model, model_dir etc on the command line.
Hmm, that is true. I like the config + overwrite better. However, we would need to work on something that makes clear to the user what parameters are available and what is their syntax/name. Some models have different parameter options than others. How do we generalize here?
Maybe its better to just list all of them for a start? Just --help parameters
or something, and we also have --help models
and --help datasets
and when you type --help
it just returns a list of these commands, what do you think?
Is this good to merge? Maybe we can add a "config introspection" issue for 0.2?
@undwie this is not yet ready, see my comments (although partially outdated) above.
I fixed the SNLI commands now, but the SQuAD commands run on errors even after rebase. I think some embedding config value is missing, but this is not clear from the error and not very user-friendly. I also tried the command that you proposed and it also does not because my memory blew up (I think this is GloVe, is it?). Can you fix this @dirkweissenborn?
@undwie we should fix these issues before a release, otherwise people cannot really use jack. If someone that worked on jack (me) cannot use it after 10 minutes of trying to figure out how to run something without a config then this is a pretty bad design. Also, the first release is crucial for long-term success.
I do not think Jack is ready for a release without a fix for these user experience problems.
Note that 0.1 wasn't supposed to be the "let's get it out there release" but a pre-step to that. I am mentioning this because we run the risk of constantly moving goal posts.
But yes, what you say generally true! My point was about ideas like "listing all configuration items automatically" etc. which are great ideas, but I think good examples/readme's can deal with this for the time being. I managed to run FastQA
relatively easily by just looking at the configuration files and readme. Maybe that's because I am okay with using config files (if I am explicitly said to use these).
In any case, we should improve usability dramatically, but maybe not all in 0.1? In fact, my wish for 0.2 would be to mostly focus on usability improvements!
This introduces a draft for download data and preprocessing files with make commands. It would be easy for users to use this, and it would be easy for users to extend this. Currently, it is a mess and it takes ages to understand how to run a simple model. I know Jack and even for me it took ages. This is not good. Let us also discuss different options how to make this better.
Currently the make command for SQuAD is broken, because the processing is not done correctly; see #308.