asreview / automated-systematic-review-simulations

Simulations for the https://github.com/asreview/asreview
MIT License
8 stars 3 forks source link

Informative error message - overwriting output file error #8

Open SagevdBrand opened 4 years ago

SagevdBrand commented 4 years ago

Describe the bug When the results of a simulation are stored in a specific file and folder, it is not possible to overwrite the file when altering a part of the simulation command. Instead an error is returned.

When trying to run a simulation with code of Gerbrich's master thesis, I ran into the following problem. I had already run her original code a little while ago, which is why a file was already created in the directed folder with a specific name. When then trying to alter the code a bit (only adding the n_papers argument), an error occurred: TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'.

This error was fixed by simply altering the state_file command such that the results will be saved in a different file or folder.

To Reproduce A while ago, ran the code: asreview simulate ../../datasets/sim_datasets/nudging.csv --config_file config/one/BCTD/nb_max_double_tfidf-nudging.ini --state_file simoutput/one/BCTD/nudging/results.h5 --init_seed 42

Then today tried to run: asreview simulate ../../datasets/sim_datasets/nudging.csv --config_file config/one/BCTD/nb_max_double_tfidf-nudging.ini --state_file simoutput/one/BCTD/nudging/results.h5 --init_seed 42 --n_papers 200

This resulted in the error provided in the screenshot below.

To fix, simply change the folder in which the results are stored: asreview simulate ../../datasets/sim_datasets/nudging.csv --config_file config/one/BCTD/nb_max_double_tfidf-nudging.ini --state_file simoutput/one/BCTD/nudging2/results.h5 --init_seed 42 --n_papers 200

The difference can be found in the --state_file part

Screenshots This resulted in the following output: image

Version information

Additional context Might be something to discuss in the upcoming docathon?

J535D165 commented 4 years ago

Thanks for reporting @SagevdBrand. I think this is a duplicate of https://github.com/asreview/asreview/issues/288 (@qubixes ?). This issue was resolved and will be part of release v0.10.

Sasafrass commented 4 years ago

Oh sorry, I have also run into this before during some testing. I just circumvented it with some error handling myself and raising an informative exception. What did you guys opt for in the official release, @J535D165 ?

qubixes commented 4 years ago

I think this should be resolved in the new release yes (unless it is a separate issue).

One thing to note is that asreview simulate won't overwrite any simulation files. Instead, if you're using the same file, it will try to finish the simulation, with the parameters when you started the first simulation. The reason for this behavior is so that you can suspend/cancel simulation and continue it afterwards.

The problem with old code was that the n_prior_included/n_prior_excluded were not stored in the state file. That's why the continuation of old save files <0.10 (>0.x?) won't work.

J535D165 commented 4 years ago

The problem with old code was that the n_prior_included/n_prior_excluded were not stored in the state file. That's why the continuation of old save files <0.10 (>0.x?) won't work.

Can we resolve this by editing this line

if len(start_idx) == 0 and n_prior_included + n_prior_excluded > 0:

into

if len(start_idx) == 0 and n_prior_included and n_prior_excluded and  n_prior_included + n_prior_excluded > 0:

?

qubixes commented 4 years ago

I don't think so, no. I mean, it might work, but might also result in unexpected behavior. So I would prefer throwing an exception. I think a "proper" solution probably needs some rewriting of the API/factory. (I.e. absorbing functionality from the factory into BaseReview/ReviewSimulate).

J535D165 commented 4 years ago

Is this one resolved?

It would be nice to have some warning to indicate that this is happening. @qubixes can you do a PR one this?

qubixes commented 4 years ago

I don't think there's an easy solution. Rewriting the review factory is the best solution if you want to make this happen (imo).