lisphilar / covid19-sir

CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.
https://lisphilar.github.io/covid19-sir/
Apache License 2.0

[Question] Estimate SIR model parameters for other countries #735

Closed hannanabdul55 closed 3 years ago

hannanabdul55 commented 3 years ago

Summary of question

Hi, I am new to using SIR models to forecast cases and deaths. I am trying to use your package to estimate case and death counts in India, state by state. Here is a notebook that I used for the estimation. I was wondering whether the initial parameters matter for estimating the SIR model parameters correctly. Are the initial parameters defaulted to those of some other country? And where could I get these parameters for another country, such as India? Any help would be greatly appreciated.

Thanks a lot in advance!

lisphilar commented 3 years ago

Thank you for your question!

I could not understand the term "initial parameters", but the parameter values of all past phases are estimated when we run Scenario.estimate(cs.SIRF) (snl.estimate(cs.SIRF, timeout=2000) in your notebook) to fit your data.

Or do you mean "parameter values of all past phases" by "initial parameters"? If so, you can get them for India by changing the province name to None when you create the Scenario instance, i.e. snl = cs.Scenario(country="India", province=None, jhu_data=jhu_data, population_data=population_data).

hannanabdul55 commented 3 years ago

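Ah okay! This answers my question about "initial parameters" (what I meant was the parameters of the past phases). I have some follow-up questions:

Is there a way to store the Scenario object efficiently? pickle.dump(snl) has a size of around 160 megabytes!

Is there a variable to plot incident case forecast rather than cumulative cases? If not, from what I understand is, I need to fetch the raw dataframe with the last (future) phase and calculate incidence cases using that?

Thank you for your response!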

lisphilar commented 3 years ago

Thank you, your questions will lead to new features of CovsirPhy! Do you have time to join our development team (volunteers)?

Is there a way to store the Scenario object efficiently? pickle.dump(snl) has a size of around 160 megabytes!

There is no way to store it at this time. Could you share your purpose for storing it? If you need to store phase information (start/end dates and parameter values), please save the output of Scenario.summary() with .to_csv(). To recover the Scenario instance, use the Scenario.add() method with end_date and keyword arguments for the parameters, not Scenario.trend().
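A minimal sketch of that save-and-restore round trip, using only the standard library and made-up numbers. The column names (Start, End, rho, sigma) and the helper functions are assumptions for illustration, not the actual layout of the Scenario.summary() output, so they should be checked against a real saved CSV. Each restored row yields an end date plus parameter keyword arguments, which is the shape of input the answer above describes for Scenario.add().

```python
import csv
import io

# Hypothetical phase table; values and column names are illustrative only.
PHASES = [
    {"Start": "2020-03-01", "End": "2020-04-15", "rho": 0.21, "sigma": 0.05},
    {"Start": "2020-04-16", "End": "2020-06-30", "rho": 0.12, "sigma": 0.07},
]
PARAM_COLUMNS = ["rho", "sigma"]


def phases_to_csv(phases):
    """Serialize the phase rows to CSV text (a file on disk works the same way)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["Start", "End", *PARAM_COLUMNS])
    writer.writeheader()
    writer.writerows(phases)
    return buffer.getvalue()


def csv_to_add_arguments(text):
    """Read the CSV text back into (end_date, parameter kwargs) pairs."""
    arguments = []
    for row in csv.DictReader(io.StringIO(text)):
        # CSV stores everything as text, so parameters are converted back to float.
        params = {name: float(row[name]) for name in PARAM_COLUMNS}
        arguments.append((row["End"], params))
    return arguments


csv_text = phases_to_csv(PHASES)
restored = csv_to_add_arguments(csv_text)
for end_date, params in restored:
    # With a real Scenario instance, each pair would be passed on as
    # end_date plus keyword arguments (assumption based on the answer above).
    print(end_date, params)
```

The CSV holds only a few rows per phase, so it stays small regardless of how large the underlying case data is.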

Is there a variable to plot incident case forecast rather than cumulative cases? If not, from what I understand is, I need to fetch the raw dataframe with the last (future) phase and calculate incidence cases using that?

"Incidence cases" mean daily new cases? We can get the dataframe with Scenario.simulate(show_figure=False), process it, and then plot the result with the covsirphy.line_plot() function.
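The "process it" step amounts to taking the first difference of the cumulative series. Here is a rough sketch in plain Python with made-up numbers; the function name daily_new_cases and the sample values are assumptions for illustration and are not part of CovsirPhy.

```python
from datetime import date, timedelta


def daily_new_cases(cumulative):
    """Convert cumulative counts to daily new counts.

    The first day has no previous value to subtract, so it is reported as None.
    """
    if not cumulative:
        return []
    daily = [None]
    for previous, current in zip(cumulative, cumulative[1:]):
        daily.append(current - previous)
    return daily


# Made-up cumulative confirmed cases for five consecutive days.
start = date(2021, 5, 1)
cumulative_confirmed = [100, 130, 175, 175, 240]
new_confirmed = daily_new_cases(cumulative_confirmed)

for offset, value in enumerate(new_confirmed):
    print(start + timedelta(days=offset), value)
```

On a pandas dataframe of cumulative counts, DataFrame.diff() computes the same first difference column by column.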

hannanabdul55 commented 3 years ago

Hi,

Do you have time to join our development team (volunteers)?

Yeah sure! I am glad to join your development team!

There is no way to store it at this time. Could you share your purpose for storing it?

My idea was a way to save and retrieve the Scenario object with one command, similar to pd.DataFrame.to_csv() and pd.read_csv(). Currently, pickling is clearly inefficient because the pickle.dump output of the object is massive.

The purpose is to save the estimated scenario parameters when I run the estimation on a server, and then simulate different scenarios with those parameters on my local machine. I see that this is currently possible, but a single efficient method for exactly this would be ideal!

"Incidence cases" mean daily new cases?

Yes, I mean daily new cases. Thank you for the suggestion! I can have a look at it 👍

lisphilar commented 3 years ago

Welcome to our team!! Please read Guideline of contribution and let me know if you have any questions in this issue.

save and retrieve the Scenario object

Just for issue tracking, we will move this topic to a new issue, #741.

Yes, I mean daily new cases. Thank you for the suggestion! I can have a look at it 👍

This could also be a new feature of the Scenario class. If you have time, please try to add Scenario.simulate_diff() method later with a new issue. (We have Scenario.records_diff() to show the raw data of daily new cases.)

Note that we are currently updating the internal code of Scenario under issue #718. We will keep the method names and arguments, but the internal code will change significantly. Please give me a few days to complete it.

hannanabdul55 commented 3 years ago

Thank you, @lisphilar!

I see that the Scenario object, when pickled, also saves all the associated data, for example the entire JHU and population data in the basic case (and the extras, if specified). We could exclude these data frames when pickling to make the output file much smaller. If we do so, we could also serialize the Scenario object itself to JSON.
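To make the idea concrete, here is a rough sketch with a hypothetical stand-in class. TinyScenario, its attribute names, and the sample values are all assumptions for illustration and do not reflect how CovsirPhy's Scenario is implemented. The bulky input data is left out of the JSON text and supplied again when the object is rebuilt.

```python
import json
import pickle


class TinyScenario:
    """Hypothetical stand-in for a scenario object; not CovsirPhy's class."""

    def __init__(self, country, province, raw_records, phase_params):
        self.country = country
        self.province = province
        self.raw_records = raw_records    # bulky input data that can be reloaded
        self.phase_params = phase_params  # small estimation results

    def to_json(self):
        """Serialize everything except the bulky input data."""
        state = {k: v for k, v in vars(self).items() if k != "raw_records"}
        return json.dumps(state)

    @classmethod
    def from_json(cls, text, raw_records):
        """Rebuild the object; the caller supplies freshly loaded input data."""
        state = json.loads(text)
        return cls(raw_records=raw_records, **state)


records = list(range(200_000))  # placeholder for the large input tables
scenario = TinyScenario("India", None, records, {"0th": {"rho": 0.2, "sigma": 0.05}})

full_size = len(pickle.dumps(vars(scenario)))  # all attributes, data included
slim_text = scenario.to_json()
restored = TinyScenario.from_json(slim_text, raw_records=records)

print(full_size, len(slim_text))
print(restored.country, restored.province, restored.phase_params)
```

If pickle is preferred over JSON, defining `__getstate__` on the class to drop the same attribute is the equivalent hook.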

If you have time, please try to add Scenario.simulate_diff() method later with a new issue.

Sure, I can have a look at it. I will first try to understand the code and will get back to you if I have any questions. Thanks again!