TheEconomist / covid-19-the-economist-global-excess-deaths-model

The Economist's model to estimate excess deaths to the covid-19 pandemic
https://www.economist.com/graphic-detail/coronavirus-excess-deaths-estimates
MIT License

prediction matrix not saved in output-data folder #3

Closed nmarum closed 3 years ago

nmarum commented 3 years ago

I am trying to run the three excess deaths model scripts. The third script fails when it subsets the prediction matrix using the export_covariates object. I gather this step removes the Mumbai region from the prediction matrix; however, export_covariates and pred_matrix have different dimensions, so it throws an error.

I have tried cleaning the environment and rerunning the second script but it continues to return the same error.

pred_matrix <- pred_matrix[export_covariates$iso3c != "IND_Mumbai", ]

Error in pred_matrix[export_covariates$iso3c != "IND_Mumbai", ] : (subscript) logical subscript too long

Thanks

sondreus commented 3 years ago

Hi, are the two objects of different dimensionality when you first load them?

I would suggest restarting R and running the export script up to line 24 to see whether you get an error. Let me know if you do.

Note that if you have generated a new pred_matrix yourself you must ensure that you generate a corresponding export_covariates object as well.
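The mismatch described above can be reproduced and guarded against with a few lines of R. This is a minimal sketch using toy data, not the model's real output; the helper name `check_rows_match` is hypothetical and does not exist in the repository.

```r
# Toy stand-ins for the real objects (hypothetical data, not model output).
pred_matrix <- matrix(rnorm(6), nrow = 3)          # 3 rows of predictions
export_covariates <- data.frame(
  iso3c = c("NOR", "IND_Mumbai", "USA", "FRA"),    # 4 rows: one too many
  stringsAsFactors = FALSE
)

# Indexing a 3-row matrix with a length-4 logical vector raises the
# "logical subscript too long" error; capture the message instead of stopping.
result <- tryCatch(
  pred_matrix[export_covariates$iso3c != "IND_Mumbai", ],
  error = function(e) conditionMessage(e)
)
print(result)

# Hypothetical guard: fail early with a message that names both row counts.
check_rows_match <- function(pred, covariates) {
  if (nrow(pred) != nrow(covariates)) {
    stop(sprintf(
      "Row mismatch: pred_matrix has %d rows, export_covariates has %d",
      nrow(pred), nrow(covariates)
    ))
  }
  invisible(TRUE)
}
```

Calling `check_rows_match(pred_matrix, export_covariates)` right after loading both RDS files would surface a stale file before the subsetting step is reached.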

nmarum commented 3 years ago

Yes, the two objects had different dimensions after running the model script. Line 24 is where it stopped working. When I reverted the changes to the new pred_matrix and export_covariates RDS files, I was able to get it to run without error.

I will try running the scripts 1, 2 and 3 again to see if I can replicate the issue.

nmarum commented 3 years ago

Hi - I think I figured out the issue. When I was generating a new pred matrix it was being saved in the working directory rather than the /output-data/ directory. This was creating a mismatch between my new export_covariates object in the output-data directory and the old pred_matrix that I cloned from the repo.

The problem is in the model script at line 321: saveRDS(pred_matrix, "pred_matrix.RDS")

While the third (export) script at lines 16 to 20 reads:

# Load all model prediction + 101 bootstrap
pred_matrix <- readRDS("output-data/pred_matrix.RDS")

# Load covariates (iso3c, country name, population ++)
export_covariates <- readRDS("output-data/export_covariates.RDS")

When I moved the new pred_matrix.RDS to the output-data directory, it seemed to address the issue. I am running all three scripts again just to be certain, though this takes a significant amount of time on my laptop.
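One way to spot this kind of stale-copy situation is to list every pred_matrix.RDS under the project root together with its row count and modification time. The sketch below builds a throwaway directory to stand in for the repository; the layout and the toy matrices are assumptions for illustration only.

```r
# Hypothetical project layout in a temporary directory.
root <- tempfile("repo_")
dir.create(file.path(root, "output-data"), recursive = TRUE)

# Old copy (as cloned) in output-data, new copy saved to the working directory.
saveRDS(matrix(0, nrow = 3, ncol = 2), file.path(root, "output-data", "pred_matrix.RDS"))
saveRDS(matrix(0, nrow = 4, ncol = 2), file.path(root, "pred_matrix.RDS"))

# Find every copy, then report its row count and last-modified time.
found <- list.files(root, pattern = "^pred_matrix\\.RDS$",
                    recursive = TRUE, full.names = TRUE)
info <- data.frame(
  path     = found,
  rows     = unname(vapply(found, function(p) nrow(readRDS(p)), integer(1))),
  modified = file.mtime(found),
  stringsAsFactors = FALSE
)
print(info)
```

Two entries with different row counts indicate that the export script may be reading a different file from the one the model script just wrote.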

nmarum commented 3 years ago

Hi again - sorry for the multiple messages. I ran through the scripts again with line 321 modified to save the prediction matrix in the same folder as the other model output data, and that seems to address the issue with the third script.

Would recommend updating the model script line 321 to the following so that others trying to replicate your analysis with updated data do not run into the same issue.

saveRDS(pred_matrix, "output-data/pred_matrix.RDS")

Thank you for the opportunity to have a closer look at some really cool work.

Nick
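A related way to keep the writer and the reader in step is to build the path once and use it on both sides. This is only a sketch under assumptions: the variable names `output_dir` and `pred_path` are hypothetical, and a temporary directory stands in for the repository's output-data folder.

```r
# Temporary directory standing in for the repository's output-data folder.
output_dir <- file.path(tempdir(), "output-data")
dir.create(output_dir, showWarnings = FALSE, recursive = TRUE)

# Single definition of the file location, shared by writer and reader.
pred_path <- file.path(output_dir, "pred_matrix.RDS")

pred_matrix <- matrix(1:6, nrow = 3)   # toy stand-in for the real predictions
saveRDS(pred_matrix, pred_path)        # what the model script would do

reloaded <- readRDS(pred_path)         # what the export script would do
identical(pred_matrix, reloaded)
```

Because both calls use `pred_path`, changing the output location in one place cannot leave the other script pointing at an old file.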

sondreus commented 3 years ago

Thanks Nick, that is a good idea. I have made the suggested change, which should ease model regenerations. Thanks for the suggestions.