If I understand correctly, the issue is that docpairs.py and rerank.py are using the hardcoded train_test_years rather than the config parameters. A hack that should temporarily fix the issue is editing train_test_years in utils/config.py (but based on your comments, I'm guessing you already realize that). The error comes from the fact that the eval scripts are looking for the predictions in an empty directory.
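For illustration, that temporary edit could look roughly like the snippet below. The exact default contents of the dict in the repo may differ; the key/value format shown here is inferred from the rest of this thread.

```python
# utils/config.py -- illustrative sketch only, not the actual default.
# Mapping format inferred from this thread: training-years key -> list of held-out years.
train_test_years = {
    'wt09_10': ['wt11', 'wt12'],  # train on wt09+wt10, hold out wt11/wt12
}
```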
Those scripts ignore the train_years and test_year config parameters because we generally have multiple test_year values for each train_years value (i.e., we use four years for training and predict on both of the remaining two years; one of the remaining years is used for testing and the other for validation). The current code hardcodes the parameter values rather than expecting the eval scripts to be run multiple times with different values for train_years and test_year. This isn't ideal though, and I'll talk with @khui to see if we can find a better solution.
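To make that concrete, here is a rough sketch of how train_test_years conceptually expands into separate (train, test, validation) runs. The loop and names are illustrative, not the actual code in docpairs.py or rerank.py.

```python
# Hedged sketch: expand each train_test_years entry into runs where one
# held-out year is the test year and the remaining held-out year(s) act
# as validation years.
train_test_years = {'wt09_10': ['wt11', 'wt12']}

for train_years, held_out in train_test_years.items():
    for test_year in held_out:
        val_years = [y for y in held_out if y != test_year]
        print(train_years, 'test:', test_year, 'val:', val_years)
```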
I'm glad to see that you're uncovering issues with our pipeline. Thanks for pointing it out!
You're welcome. Glad I can help such a good open-source project.
Thanks for pointing that out, Joao. As Andrew suggested, the train_test_years dictionary needs to be edited to match the training/test years actually being used. In your case, you may write: {'wt09_10': ['wt11']}. However, by doing this you won't have validation data, so the model cannot be properly evaluated; for that you would need both held-out years, e.g. {'wt09_10': ['wt11', 'wt12']}.
The evaluation actually relies on a train/validation/test split. In the training and prediction phases, one trains on certain years (wt09 and wt10) and predicts, over different iterations, on certain test years (wt11 or wt12). In the evaluation phase, given the training years (wt09 and wt10), one needs to specify two predicted years, one for validation and one for testing, to conduct the evaluation. For example, when testing on wt11 and validating on wt12, the model is selected based on wt12 and the reported results come from wt11.
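As a rough illustration of that selection step (the data structures below are hypothetical stand-ins, not the repository's prediction files):

```python
# Hedged sketch: pick the run/epoch that scores best on the validation year,
# then report its score on the test year.
def select_and_evaluate(runs, val_year, test_year):
    """runs: {run_id: {year: metric_value}} -- illustrative structure only."""
    best_run = max(runs, key=lambda r: runs[r][val_year])
    return best_run, runs[best_run][test_year]

runs = {
    'epoch05': {'wt11': 0.27, 'wt12': 0.30},
    'epoch10': {'wt11': 0.29, 'wt12': 0.28},
}

# validate on wt12, report on wt11 (or swap the roles, as described above)
best, test_score = select_and_evaluate(runs, val_year='wt12', test_year='wt11')
print(best, test_score)
```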
For now, I would suggest directly configuring train_test_years.
According to Joao, the problem was fixed after editing train_test_years.
While trying to evaluate the model using bin/evals.sh, I get a RuntimeWarning, which leads to an error.
I believe that the problem here is due to the train_test_years variable set in utils/config.py. I have trained and predicted the model with wt09_10 and wt11 respectively, which I have set in both bash scripts. Neither docpairs.py nor rerank.py looks at the train_years and test_year variables passed in the config.
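As a possible direction for the "better solution" mentioned above, here is a hedged sketch of letting the eval code fall back to the config's train_years/test_year instead of only the hardcoded dict. The function name and config access below are illustrative assumptions, not the actual code in docpairs.py or rerank.py.

```python
# Hedged sketch of the workaround discussed in this thread: prefer the
# train_years / test_year values passed through the config, and only fall
# back to the hardcoded train_test_years mapping when they are absent.
def resolve_years(config, train_test_years):
    train_years = config.get('train_years')
    test_year = config.get('test_year')
    if train_years and test_year:
        return {train_years: [test_year]}
    return train_test_years  # fall back to the hardcoded mapping

pairs = resolve_years({'train_years': 'wt09_10', 'test_year': 'wt11'},
                      {'wt09_10': ['wt11', 'wt12']})
print(pairs)  # {'wt09_10': ['wt11']}
```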