Closed shik-design closed 1 year ago
Thank you for your question!
Please try the following script and let me know whether it is suitable for your analysis. If it is, I will update the internal code of covsirphy.
import covsirphy as cs
eng = cs.DataEngineer()
eng.download(country=None, province=None);
eng.clean()
eng.transform()
# Get country-level data
top_df = eng.layer(geo=None, variables="SIRF").drop(["Province", "City"], axis=1)
# Fill in NAs some countries (top-level administration) have
variables = list(set(top_df.columns) - set(["ISO3", "Date"]))
pivot_df = top_df.pivot_table(values=variables, index="Date", columns="ISO3", aggfunc="last")
filled_df = pivot_df.ffill().fillna(0).stack().reset_index()
# Recreate DataEngineer() instance with the filled data
eng2 = cs.DataEngineer(layers=["ISO3"])
eng2 = eng2.register(filled_df, citation=eng.citations())
eng2.inverse_transform()
# Get data at global scale
actual_df, status, _ = eng2.subset(geo=None, variables="SIRF", complement=True)
print(status)
actual_df.tail()
We can use .subset(geo=None) or .subset(geo=(None,)) to get data at global scale, but they just sum the country-level values on each date. This causes problems because the first and last dates of the records differ between countries.
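To illustrate why the differing first/last dates matter, here is a minimal sketch in plain Python, not covsirphy code. The country codes, dates, and counts are made up, and `naive_total` / `filled_total` are hypothetical helper names. It compares a plain per-date sum with a sum taken after forward-filling each country's cumulative series, which is roughly what the `ffill().fillna(0)` step in the script above does.

```python
from datetime import date, timedelta

# Hypothetical cumulative confirmed cases.
# "AAA" stops reporting one day early; "BBB" starts one day late.
records = {
    "AAA": {date(2022, 1, 1): 100, date(2022, 1, 2): 110},
    "BBB": {date(2022, 1, 2): 50, date(2022, 1, 3): 60},
}


def all_dates(records):
    """Return every date from the earliest to the latest record of any country."""
    start = min(min(series) for series in records.values())
    end = max(max(series) for series in records.values())
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def naive_total(records):
    """Sum the values on each date, treating missing records as zero."""
    return {
        d: sum(series.get(d, 0) for series in records.values())
        for d in all_dates(records)
    }


def filled_total(records):
    """Forward-fill each country's series (zero before its first record), then sum."""
    dates = all_dates(records)
    totals = dict.fromkeys(dates, 0)
    for series in records.values():
        last = 0  # no record yet: count as zero
        for d in dates:
            last = series.get(d, last)  # carry the latest known value forward
            totals[d] += last
    return totals


naive = naive_total(records)
filled = filled_total(records)
for d in all_dates(records):
    print(d, naive[d], filled[d])
```

With these made-up numbers the naive total falls on the last date because "AAA" has no record there, which cannot happen for cumulative counts. The forward-filled total keeps the last known value of "AAA" and stays monotonic.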
It worked very well!! No issues at all. For the sake of experimentation, I think you can also combine the already downloaded files in the "input" folder.
How can I do that without necessarily downloading the files every time?
In the previous versions of covsirphy we used cs.DataLoader("input", update_interval=24)
@shik-design Thank you for your confirmation! We will execute the following code after the 2.27.1 release.
import covsirphy as cs
eng = cs.DataEngineer()
eng.download(
    country=None, province=None,
    databases=["covid19dh", "japan", "owid"], directory="input", update_interval=24);
eng.clean()
eng.transform()
actual_df, status, _ = eng.subset(geo=None, variables="SIRF", complement=True)
print(status)
actual_df.tail()
How can I do that without necessarily downloading the files every time? In the previous versions of covsirphy we used cs.DataLoader("input", update_interval=24)
Please use the keyword arguments: .download(directory="input", update_interval=24).
The default values are directory="input" and update_interval=12.
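As a rough illustration of what an update interval means for locally cached files, the sketch below decides whether a file should be downloaded again. This is not covsirphy's implementation: `needs_update` is a hypothetical helper, and it assumes the interval is measured in hours and compared against the file's modification time.

```python
import time
from pathlib import Path


def needs_update(path, update_interval=12, now=None):
    """Return True if the file is missing or older than update_interval hours.

    Hypothetical helper for illustration; the interval is assumed to be in hours.
    """
    path = Path(path)
    if not path.exists():
        # Nothing cached yet, so a download is required
        return True
    now = time.time() if now is None else now
    age_hours = (now - path.stat().st_mtime) / 3600
    return age_hours > update_interval


# "input/example.csv" is a placeholder path, not a file covsirphy is known to create
print(needs_update("input/example.csv", update_interval=24))
```

Under this assumption, a larger interval means the files already in the "input" folder are reused for longer before a new download is triggered.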
FYI.
Please use .download(databases=["covid19dh", "japan", "owid"]).
Refer to https://github.com/lisphilar/covid19-sir/issues/1223 and https://github.com/lisphilar/covid19-sir/issues/1224
Summary of question
Good day! How can I analyze global data from DataEngineer()?
Code
I need geo=("Global",) if there is something like that. I want to test the model at the global scale.