friedrichknuth / covid_dashboard

MIT License

Convert data to xarray #5

Open rsignell-usgs opened 4 years ago

rsignell-usgs commented 4 years ago

This dataset would really be nicer in xarray with "region" and "time" as coordinates.
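A minimal sketch of what that could look like, assuming the data can first be put into a long-format pandas DataFrame. The column names (`region`, `time`, `confirmed`) and the values below are made up for illustration and are not taken from this repo:

```python
import pandas as pd
import xarray as xr

# Hypothetical long-format table: one row per (region, time) pair.
# Column names and numbers are illustrative assumptions only.
df = pd.DataFrame(
    {
        "region": ["A", "A", "A", "B", "B", "B"],
        "time": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
        "confirmed": [1, 2, 4, 10, 15, 20],
    }
)

# Indexing by (region, time) makes both columns coordinates of the Dataset.
ds = df.set_index(["region", "time"]).to_xarray()

# Label-based selection then gives the time series for a single region.
series_a = ds["confirmed"].sel(region="A")
```

With "region" and "time" as coordinates, selecting by label (`.sel(region=...)`, or a `slice` of dates along `time`) replaces the boolean masks that the DataFrame version would need.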

tjcrone commented 4 years ago

@rsignell-usgs, I agree these data definitely lend themselves to Xarray, and transforming them from the DataFrame should be pretty easy. One thing I would suggest, before we do too much more data wrangling, is that we consider finding a different data source. The data we are using have numerous issues, and the maintainers are not responding to PRs or user questions. The numbers on the JHU website are almost always different from the numbers in this repo. Any thoughts on where we might find cleaner/better/more up-to-date data?

rsignell-usgs commented 4 years ago

@tjcrone, I would guess that, with all those folks using the data from JHU, if there were a better source, they would know about it. I wonder if we could take things to a higher level and get more resources directed toward maintaining that site. Do we know anyone at JHU?

tjcrone commented 4 years ago

Their issue tracker is packed not only with complaints, but with suggestions for other sources and better-cleaned datasets. I just haven't had a chance to dig into it.

rsignell-usgs commented 4 years ago

@tjcrone, oh, fantastic! My bad. I just ASSUMED.

benholtzman commented 4 years ago

Hi all -- I did some slow wrangling over the weekend. To make time series for each location, I converted the df to a list of dictionaries. (Kinda quick and dirty, and I've never used xarray.) It's not pretty code, but I'll push it in a bit; I just need to fix one thing. I'm midway through the project I started*, but will push it as is in case there are any useful bits in there for you. I have to work on other stuff today, unfortunately. Thanks! Ben

*The goal is to estimate the doubling time in each location, and then see if the doubling times increase as a function of time (that is, if growth slows). That could be an indication of people learning, adapting, and self-isolating (regardless of what their governments say). However, there are obvious problems with estimating the doubling time the way I am doing it, by fitting a growth model. For example, it won't work yet on any location that has flattened its growth curve, which appears to be all of China and South Korea. I started looking for a way to not fit those parts of the curve (a second-derivative threshold or something), but didn't get there yet.
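For anyone following along, here is a rough sketch of one way to estimate a doubling time. It is not the code mentioned above: it swaps in a plain least-squares line fit to the log of the counts, which assumes pure exponential growth within each fitting window. The function names, the window length, and the synthetic series are all hypothetical.

```python
import math


def doubling_time(counts, dt=1.0):
    """Doubling time from a least-squares line fit to log(counts).

    Assumes every count is positive and samples are dt apart.
    Returns math.inf when the fitted growth rate is zero or negative,
    which is what happens once a curve has flattened.
    """
    logs = [math.log(c) for c in counts]
    n = len(logs)
    times = [i * dt for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    # Slope of the fitted line is the exponential growth rate.
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs)) / sum(
        (t - t_mean) ** 2 for t in times
    )
    if slope <= 0:
        return math.inf
    return math.log(2) / slope


def rolling_doubling_times(counts, window=5, dt=1.0):
    """Doubling time for each consecutive window, to see how it changes over time."""
    return [
        doubling_time(counts[i : i + window], dt)
        for i in range(len(counts) - window + 1)
    ]


# Synthetic series (not real data) that doubles every 3 days.
counts = [100 * 2 ** (day / 3) for day in range(10)]
overall = doubling_time(counts)
per_window = rolling_doubling_times(counts, window=5)
```

On a flattened curve the fitted slope goes to zero and the doubling time blows up, so returning infinity is one simple way to flag those stretches rather than exclude them.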
