lisphilar / covid19-sir

CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.
https://lisphilar.github.io/covid19-sir/
Apache License 2.0
109 stars 44 forks

[Discuss] New method data loader throw out error on the data #925

Closed subi10 closed 2 years ago

subi10 commented 3 years ago

Hi lisphilar,

Thank you for fixing the last issue on loading the dataset from a local file. I have updated to the latest version; however, the new method using the data loader gives an error on the count. Please see the attached images. It only took the dataset starting from August. Also, the estimate method does not work. Please also check the attached file (penang.csv) for the Penang dataset. By the way, the Malaysian government has released its data for each state on GitHub at this link: https://github.com/MoH-Malaysia/covid19-public

Using this data, I took the Penang subset to test.

cap1 cap2 cap3 cap4

Thank you very much again for all your attention and great work.

lisphilar commented 3 years ago

Memo: CovsirPhy version 2.22.0

lisphilar commented 3 years ago

@subi10 , Thank you for the notice! I just have tried the codes with the dataset you shared. https://gist.github.com/lisphilar/af9db2ec8a2c84ed5081b5e3e42192e8

We have two points to discuss here.

  1. Parsing error of dates: Please turn on the dayfirst argument with loader.read_csv("penang.csv", parse_dates=["date"], dayfirst=True), because the dates are recorded in D/M/YYYY format.
  2. "IndexError: index 0 is out of bounds for axis 0 with size 0" during parameter estimation: This error was raised by snl.estimate(cs.SIRF) for unknown reasons. I will try to fix it later.
subi10 commented 3 years ago

Sorry, I accidentally clicked the "Close" button :). Thank you for the great response, as always. Yes, the first part is okay now that I set it to "True". Thank you very much for this. As for the estimate method, it shows the error you mentioned too. Thank you again for trying to fix it. :)
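As a rough illustration of the dayfirst point above, the following sketch uses plain pandas rather than the CovsirPhy loader. The inline CSV text is a small sample written for this example, and whether the loader forwards these arguments to pandas unchanged is an assumption. Without dayfirst, a D/M/YYYY string such as 8/4/2020 is read as August 4, which may be why the records appeared to start from August.

```python
import io

import pandas as pd

# Small sample in D/M/YYYY format (not the real penang.csv).
csv_text = "date,confirmed\n8/4/2020,108\n9/4/2020,108\n10/4/2020,109\n"

# Default parsing treats the first number as the month.
month_first = pd.to_datetime("8/4/2020")
# dayfirst=True treats the first number as the day.
day_first = pd.to_datetime("8/4/2020", dayfirst=True)

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], dayfirst=True)
print(month_first.date(), day_first.date())
print(df["date"].dt.strftime("%Y-%m-%d").tolist())
```

Checking the parsed months right after loading is a cheap way to catch this kind of mix-up before any analysis.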

lisphilar commented 3 years ago

Thank you for your reply. Actually, I have had little time for this project lately due to a busy work schedule, and I would appreciate it if you could investigate the IndexError. Does the error occur when we use other datasets?

With a quick review of the error statement, I found that the error was raised at the following line.

covsirphy/ode/sirf.py:

df = df.loc[(df["Susceptible"] > 0) & (df["Infected"] > 0)]
--> 319         n = df.loc[df.index[0], ["Susceptible", "Infected", "Fatal", "Recovered"]].sum()

Do the records of Penang (i.e. output dataframe of snl.records()) include Susceptible <= 0 or Infected <= 0?
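A minimal sketch of why that line can fail, assuming a records table in which no row has both Susceptible > 0 and Infected > 0. The DataFrame and the helper name initial_total are made up for illustration; only the two quoted lines come from the traceback.

```python
import pandas as pd

# Hypothetical records in which every Infected value is non-positive.
records = pd.DataFrame({
    "Susceptible": [1000, 1000],
    "Infected": [-27, -34],
    "Fatal": [4, 4],
    "Recovered": [131, 138],
})


def initial_total(df):
    """Mimic the two quoted lines: filter, then read the first remaining row."""
    df = df.loc[(df["Susceptible"] > 0) & (df["Infected"] > 0)]
    return df.loc[df.index[0], ["Susceptible", "Infected", "Fatal", "Recovered"]].sum()


try:
    initial_total(records)
    raised = False
except IndexError as error:
    # The filter removed every row, so df.index[0] does not exist.
    raised = True
    print(f"IndexError: {error}")

# Count the rows that the filter would drop.
n_dropped = int(((records["Susceptible"] <= 0) | (records["Infected"] <= 0)).sum())
print(n_dropped)
```

Counting the dropped rows on the output of snl.records() would show whether the filter leaves anything to index.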

subi10 commented 3 years ago

I really appreciate your time checking this. Thank you again for the quick response, as always. Yes, I have checked it with the previous Selangor df and it has no error. Upon checking, the snl records keep the zero Fatal values, but they do not include anything <= 0 for Infected.


lisphilar commented 3 years ago

Sorry for my late reply, and thank you for your investigation. From penang.csv, I extracted the raw records from 8/4/2020 to 12/4/2020 and calculated infected as confirmed - recovered - fatal.

| date | state | confirmed | recovered | fatal | infected |
|---|---|---|---|---|---|
| 8/4/2020 | Pulau Pinang | 108 | 131 | 4 | -27 |
| 9/4/2020 | Pulau Pinang | 108 | 138 | 4 | -34 |
| 10/4/2020 | Pulau Pinang | 109 | 153 | 4 | -48 |
| 11/4/2020 | Pulau Pinang | 114 | 166 | 5 | -57 |
| 12/4/2020 | Pulau Pinang | 116 | 166 | 6 | -56 |

Infected seems negative because confirmed << recovered. Are they correct values?
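The calculation above can be reproduced with a few lines of pandas. This is only a sketch that retypes the five rows quoted in this comment; it does not read penang.csv.

```python
import pandas as pd

# The five raw records quoted above (dates in D/M/YYYY format).
raw = pd.DataFrame({
    "date": ["8/4/2020", "9/4/2020", "10/4/2020", "11/4/2020", "12/4/2020"],
    "confirmed": [108, 108, 109, 114, 116],
    "recovered": [131, 138, 153, 166, 166],
    "fatal": [4, 4, 4, 5, 6],
})

# Infected = Confirmed - Recovered - Fatal
raw["infected"] = raw["confirmed"] - raw["recovered"] - raw["fatal"]

# Keep only the rows where the derived value is negative.
negative = raw.loc[raw["infected"] < 0, ["date", "infected"]]
print(negative.to_string(index=False))
```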

lisphilar commented 3 years ago

I found the link https://github.com/MoH-Malaysia/covid19-public in your first comment on this issue and am trying it with the dataset. It is not working yet, but I will share the progress. Please check if my understanding of the dataset is correct. https://gist.github.com/lisphilar/b285e9d64b5d96c68f16068798fa87c1

subi10 commented 3 years ago

Hi Good Morning Hirokazu Takaya,

Thank you so much again for taking the time. Yes, as I am checking through, there are some discrepancies in the "recovered" cases, and they come from our MOH's GitHub page.

Again, thank you so much for combining all the datasets. Your understanding is correct, and the description is correct too. I really hope there is a way to fix the negative infected cases. Best regards and love from Malaysia, Subhi

lisphilar commented 3 years ago

Hello @subi10 , Thank you for your confirmation.

Is it possible to share why "confirmed < fatal + recovered" was found in your MOH raw dataset? I cannot conclude that it is OK to change the values of your dataset (e.g. setting the confirmed value to "fatal + recovered") on the CovsirPhy side without error messages. This may have an impact on the reliability of the analysis outputs with your data.

I know the IndexError message is misleading and should be replaced on the CovsirPhy side with, for example, ValueError: confirmed < fatal + recovered found in some records.
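One possible shape for such a check, as a sketch only: validate_records is a hypothetical helper written for this illustration, not an existing CovsirPhy function, and the column names and the "good" row are assumptions.

```python
import pandas as pd


def validate_records(df):
    """Hypothetical helper: fail early when confirmed < fatal + recovered."""
    invalid = df.loc[df["confirmed"] < df["fatal"] + df["recovered"]]
    if not invalid.empty:
        dates = ", ".join(invalid["date"].astype(str))
        raise ValueError(
            f"confirmed < fatal + recovered found in some records: {dates}"
        )
    return df


# Made-up consistent row, plus one inconsistent row from the table above.
good = pd.DataFrame({"date": ["1/4/2020"], "confirmed": [100], "recovered": [50], "fatal": [2]})
bad = pd.DataFrame({"date": ["8/4/2020"], "confirmed": [108], "recovered": [131], "fatal": [4]})

validate_records(good)  # passes silently
try:
    validate_records(bad)
    message = None
except ValueError as error:
    message = str(error)
print(message)
```

Raising at load time, with the offending dates in the message, points the user at the data rather than at an index lookup deep inside the model code.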

subi10 commented 2 years ago

Good day, dear Hirokazu Takaya. I am sorry for coming back a little late. I have checked again the IPYNB file you shared earlier. It seems that the recovered cases also need to be cumulatively summed (cumsum), because the MOH dataset shows the daily reported values. It is odd because, upon checking further, I also found that on some days the recovered cases are not reported in the dataset (this comes from our MOH).

One thing I would like to ask: when we check here, there are no infected values less than 0.

However, when we check after passing the data to the .record method, we can see this, and I believe this causes the error. May I ask what this means?

Thank you again for all your attention. I am really thankful and appreciate all your effort. Best regards, Subhi
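The cumsum step mentioned above could look like the following sketch. The daily counts are made up, the column names are assumptions, and treating an unreported day as zero new recoveries is a guess about the MOH data, not something confirmed in this thread.

```python
import pandas as pd

# Made-up daily counts; NaN marks a day with no report.
daily = pd.DataFrame({
    "date": pd.to_datetime(["2020-04-08", "2020-04-09", "2020-04-10", "2020-04-11"]),
    "recovered_daily": [3, float("nan"), 5, 2],
})

# Assumption: an unreported day contributes zero new recoveries.
daily["recovered"] = daily["recovered_daily"].fillna(0).cumsum().astype(int)
print(daily["recovered"].tolist())
```

If the missing days actually hold back counts that are reported later, the cumulative total still catches up, but the per-day timing of recoveries would be shifted.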

subi10 commented 2 years ago

Also, to add to this: once I cumsum the recovered cases and put them into the data, this is the real data for Penang on that particular date (assuming the MOH data are correct), with a side-by-side comparison after loading into the package.

I'm still getting negative values from the snl.record method. Thanks again, Hirokazu. Best regards, Subhi

lisphilar commented 2 years ago

Note for the following.

One thing I would like to ask is when we check here, there are no infected less than 0. However, when we check after passing into .record method, we can see this and i believe this cause the error, may i ask you what is this meaning?

CovsirPhy performs complement automatically if the causes are clear, as shown in the figure title of Scenario.records(). Please refer to the following documentation. https://lisphilar.github.io/covid19-sir/usage_dataset.html#Complement

To skip this complement, we can do snl.complement_reverse() before snl.records().

subi10 commented 2 years ago

Dear Hirokazu Takaya, thank you for the advice. I have followed it, and when I run the estimate method I get the error below.

Am I doing it correctly? Thank you so much for your advice again! Best regards, Subhi

subi10 commented 2 years ago

Hi, also, may I ask the meaning of "CIFR"? What does it do? Thank you in advance. Cheers, Subhi

lisphilar commented 2 years ago

Sorry for being absent for about 20 days. These days I'm busy with my business and moving, and I would appreciate it if you could try this project as a developer.

variables="CIFR" is the same as variables=["Confirmed", "Infected", "Fatal", "Recovered"]. This is just an abbreviation for convenience.
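To make the abbreviation concrete, here is a small sketch of how such a string could be expanded. The mapping and the expand_variables helper are hypothetical and written only for illustration; they are not taken from the CovsirPhy source.

```python
# Hypothetical mapping for illustration; not CovsirPhy's internal code.
ABBREVIATIONS = {"C": "Confirmed", "I": "Infected", "F": "Fatal", "R": "Recovered"}


def expand_variables(variables):
    """Return full variable names for a string of initials or a list of names."""
    if isinstance(variables, str):
        return [ABBREVIATIONS[letter] for letter in variables]
    return list(variables)


print(expand_variables("CIFR"))
print(expand_variables(["Confirmed", "Infected"]))
```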

subi10 commented 2 years ago

Dear Hirokazu Takaya, thank you very much for getting back to me. I really appreciate all your responses despite your limited time. Yes, I would love to, although there are a lot of things I need to understand; I would appreciate it if you could point me in the right direction. Also, regarding the error I am getting when running the "estimate" method: may I know what the reason could be? Please find the attached notebook for the full code. Again, thank you very much for all your time and effort! Best regards, Subi

lisphilar commented 2 years ago

CovsirPhy has a lot of source code, so I will show the lines of the Python classes and methods related to the error one by one. For example, the "estimate" method is in lines 759-752 of /analysis/scenario.py. All methods have docstrings, which may help you understand their roles.

I could not find the notebook here. Could you try attaching it from your browser?

lisphilar commented 2 years ago

Let me know if there’s anything I can do.

lisphilar commented 2 years ago

This will be closed at this time and we can reopen if necessary.