Closed elmedianikhadija closed 2 years ago
Hello @elmedianikhadija,
Did you try DataLoader.read_csv() and DataLoader.read_dataframe()?
https://lisphilar.github.io/covid19-sir/markdown/LOADING.html
I am using DataLoader.read_csv(). With this procedure, no cleaning function can be applied to the pandas object, the Infected column is not included at the data loading phase, and there is a problem using ExampleData().
Dear @geeky-programer,
Please use df = pd.read_csv(); (data cleaning); DataLoader().read_dataframe(df) at this time.
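A minimal sketch of the "read, clean, then hand over" step suggested above, using pandas only. The CSV content, the column names and the cleaning rules (drop duplicate dates, forward-fill missing cumulative counts) are assumptions for illustration, not the reporter's data or a cleaning procedure prescribed by the library.

```python
import io

import pandas as pd

# Hypothetical CSV content; the values and column names are illustrative only.
RAW_CSV = """Date;Confirmed;Deaths;Recoveries
25.02.2020;16;0;14
26.02.2020;;0;14
27.02.2020;46;0;16
27.02.2020;46;0;16
"""


def load_and_clean(source) -> pd.DataFrame:
    """Read a semicolon-separated CSV and apply some simple cleaning steps."""
    df = pd.read_csv(source, delimiter=";")
    # Parse day-first dates explicitly instead of relying on inference.
    df["Date"] = pd.to_datetime(df["Date"], format="%d.%m.%Y")
    # Keep one row per date, in chronological order.
    df = df.drop_duplicates(subset="Date").sort_values("Date")
    value_cols = ["Confirmed", "Deaths", "Recoveries"]
    # Cumulative counts: carry the last known value forward, treat leading gaps as zero.
    df[value_cols] = df[value_cols].ffill().fillna(0).astype("int64")
    return df.reset_index(drop=True)


df = load_and_clean(io.StringIO(RAW_CSV))
print(df)

# The cleaned frame would then be passed on as described in the reply above, e.g.
# cs.DataLoader().read_dataframe(df)
```

Keeping the cleaning in a plain function makes it easy to inspect the DataFrame before it reaches the loader.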
Note that the JHUData class (parent class of ExampleData) calculates Infected = Confirmed - Recovered - Fatal automatically and internally. Could you provide me with the details of the problem?
With #1064, I plan to create the DataEngineer class, which will handle all of data reading, data cleaning and the calculation of Infected.
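As a stopgap, the relationship Infected = Confirmed - Recovered - Fatal quoted above can be applied by hand in pandas. This is a sketch under assumptions: the helper name add_infected and the sample numbers are hypothetical, and it does not reproduce what JHUData does internally beyond that one formula.

```python
import pandas as pd


def add_infected(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with Infected = Confirmed - Recovered - Fatal."""
    out = df.copy()
    out["Infected"] = out["Confirmed"] - out["Recovered"] - out["Fatal"]
    return out


# Hypothetical cumulative records, for illustration only.
records = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2020-02-25", "2020-02-26", "2020-02-27"]),
        "Confirmed": [16, 27, 46],
        "Recovered": [14, 15, 16],
        "Fatal": [0, 0, 0],
    }
)

with_infected = add_infected(records)
print(with_infected)
```

Note that the formula expects Recovered and Fatal as separate columns; if fatal cases are already folded into the recovered count, they would be subtracted twice.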
The output of df = pd.read_csv(); (data cleaning); DataLoader().read_dataframe(df) does not contain the Infected column, and the data is not cleaned properly.
I have followed the steps described in https://lisphilar.github.io/covid19-sir/markdown/LOADING.html, including the steps for loading from CSV files. There may be a bug; please check the workflow.
Are there any other functions or classes I can make use of to create good data to feed the models at this point in time?
Thank you very much for your response. And great effort on the compartmental models.
Thank you for your response. Could you share the code and the CSV file (or some lines of the data with the column names)?
```python
import pandas as pd
# pip install --upgrade "git+https://github.com/lisphilar/covid19-sir.git#egg=covsirphy"
import covsirphy as cs

N_germany = 83240000
URL = 'https://gitlab.uni-koblenz.de/akshaygs/ann/-/raw/main/Data%20set/SIR_data.csv'
df_new = pd.read_csv(URL, delimiter=";", index_col=False)
df_new['Confirmed'] = df_new['Infected'] + df_new['Deaths'] + df_new['Recoveries']
df_new['Susceptible'] = N_germany - df_new['Confirmed']
df_new['Recovered-new'] = df_new['Deaths'] + df_new['Recoveries']
df_new['Population'] = int(83240000)
df_new.drop("Entry", axis=1, inplace=True)
df_new.drop("Recoveries", axis=1, inplace=True)
loader = cs.DataLoader(update_interval=None)
loader.read_dataframe(df_new, parse_dates=["Date"], dayfirst="25Feb2020")
loader.assign(country="Germany", province="Germany")
new_data = loader.lock(
    date="Date", country="country", province="province",
    confirmed="Confirmed", fatal="Deaths", population="Population",
    # date="Date", confirmed="Confirmed", fatal="Deaths", population="population",
    # Optional
    recovered="Recovered-new",
)
data = loader.locked
cis_data = cs.ExampleData(data, tau=1440, start_date="25Feb2020")
```
Thank you for the details!
Just to confirm, did you try DataLoader().jhu() instead of ExampleData(data)?
I ask because you have actual records of COVID-19, and an instance of JHUData returned by DataLoader().jhu() calculates Infected internally.
Yes, I have tried DataLoader().jhu(). The data gathered is quite impressive. I am a data science student, so I am loading different data for academic purposes. May I ask when you plan to implement the DataEngineer class?
With #1090, I'm writing some new classes, including DataEngineer. I haven't had much time to update the pull request these days, but I plan to merge it this week or next Saturday/Sunday.
OK. I much appreciate the work. Thank you.
Dear @geeky-programer,
Sorry for the delay, but I'm preparing documentation for the DataEngineer class (available only in development versions at this time). After I finish writing the notebooks, I will release the next stable version and update the documentation (GitHub page).
Hello, I would like to use a new database for my analysis. How can I do that?