octopicorn / covid19-charts

Make your own chart of the covid-19 pandemic, comparing timeseries for any countries, states, and US counties.
https://valis.pub

Heads up: Data corruption? #71

Closed otheus closed 4 years ago

otheus commented 4 years ago

All US values for 4/29 have been corrupted. I cannot tell if this is a multiplier, a constant, or what. It makes any graph completely meaningless. Maybe it's a "0", and that makes the "new cases" and per-100k graphs look ridiculous.

You might need to fix this in code: if data is missing, the data gets skipped.
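To illustrate the suggestion above, here is a minimal sketch of skipping missing days before charting. The `{ date, value }` point shape, the helper name `dropMissingDays`, and the sample numbers are all assumptions made up for this example; they are not taken from the project's code or data.

```javascript
// Hypothetical helper: keeps only points whose value is a finite number.
// The { date, value } shape is an assumption, not the project's data format.
function dropMissingDays(series) {
  return series.filter(
    (point) => typeof point.value === 'number' && Number.isFinite(point.value)
  );
}

// Made-up sample data with one missing report.
const series = [
  { date: '4/28', value: 100 },
  { date: '4/29', value: null }, // missing report for this day
  { date: '4/30', value: 130 },
];

const cleaned = dropMissingDays(series);
console.log(cleaned.map((point) => point.date));
```

Dropping the point means a line chart connects the neighbouring days instead of plunging to zero, although any "new cases" delta across the gap then covers two days rather than one.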

octopicorn commented 4 years ago

@otheus cannot confirm your issue. Data for 4/29 (US, 1039909 confirmed, 60697 deaths) looks fine to me on the website. If you're looking at the website, please do a hard refresh; I had to scrap the cache buster recently, so you may be getting older code. If you're running locally, please share a screenshot.

otheus commented 4 years ago

@octopicorn Look at any state or county in the US. Values from T-7(?) are multiplied by -10. I thought the problem was in the US totals too, but maybe not... idk. Verified in a separate browser, in incognito mode, and in a Chrome guest profile. Today the "bad" day is 4/30. Screenshot: https://imgur.com/iZGtkXU I can't figure out whether it's programmatic on your end or comes from the source/upstream. Very patterned behavior, but also selective.

octopicorn commented 4 years ago

@otheus it would have been helpful to specify in your bug report: "there is an issue when looking at New in Past X Days when it's set to 1 and country is US". Your initial comment said all values for US were bad in general. Thank you for clarifying.

yes, this is a known issue. there is a bug calculating "New in Past X Days".

To get the number for the past 1 day, you need to enter 2. To get new for the past 2 days, you have to enter 3. It is an off-by-one (n+1) bug, having to do with the loop used to calculate this. If you're worried the number might be off, you can always verify by switching over to the normal view and checking that the delta between day x and day y is correct.

Thanks for reporting, I thought this was fixed. For now, just use the workaround: add 1 to the number of days you actually want.

bug is located here in code: https://github.com/octopicorn/covid19-charts/blob/master/public/js/main.js#L389
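For reference, a minimal sketch of what "new in past N days" is expected to return when computed from a running total. This is not the code at the link above; the function name `newInPastDays` and the sample totals are assumptions made up for illustration. The point it shows is that a window of N days spans N + 1 data points (the start total and the end total), which is a common place for off-by-one errors to creep in.

```javascript
// Hypothetical helper, not the project's implementation.
// cumulative[i] is the running total on day i.
// result[i] is the increase over the previous `days` days, or null when
// there is not enough history to look back that far.
function newInPastDays(cumulative, days) {
  if (!Number.isInteger(days) || days < 1) {
    throw new RangeError('days must be a positive integer');
  }
  return cumulative.map((total, i) =>
    i >= days ? total - cumulative[i - days] : null
  );
}

const totals = [10, 15, 25, 40]; // made-up running totals
const oneDay = newInPastDays(totals, 1); // [null, 5, 10, 15]
const twoDay = newInPastDays(totals, 2); // [null, null, 15, 25]
console.log(oneDay, twoDay);
```

With this definition, entering 1 yields the plain day-over-day delta, which matches the manual check of comparing two consecutive days in the normal view.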

otheus commented 4 years ago

@octopicorn You're still not seeing the problem! The data is bad in EVERY view --- just more noticeable in that one.
[Five screenshots attached, all titled "covid-19 charts" and captured 2020-05-06 at 09:07:05, 09:06:56, 09:06:35, 09:06:24, and 03:27:49]

otheus commented 4 years ago

[Screenshot: "covid-19 charts", captured 2020-05-06 at 09:06:12] Also here. MIND THE GAP!

octopicorn commented 4 years ago

Ok. I see it now. The problem is not with US the country but with individual states. Looks like 5/1 data went missing. Will rectify this; checking the JH data source. Thank you.

octopicorn commented 4 years ago

@otheus found the issue. For a couple of days, the downloaded data files were just a few bytes and contained only the text 404 Not Found. This means the issue was caused by the server deciding it was time to download the new day's data before it was ready. I will wrap some error handling around that piece in server.js to avoid saving any file if a 404 is returned. Many thanks for your help.
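A rough sketch of the kind of guard described above, assuming the download step has the HTTP status code and the response body as a string in hand. The names `isUsableReport` and `saveReportIfUsable`, and the 100-byte minimum size, are hypothetical and chosen only for this example; the actual fix in server.js may look quite different.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical check: reject non-200 responses, bodies that are only a few
// bytes long, and bodies that are just a "404" message.
// The 100-byte floor is an arbitrary assumption.
function isUsableReport(status, body, minBytes = 100) {
  if (status !== 200) return false;
  if (Buffer.byteLength(body, 'utf8') < minBytes) return false;
  return !body.trim().startsWith('404');
}

// Writes the file only when the response passes the check.
// Returns true if the file was saved, false if it was skipped.
function saveReportIfUsable(dir, filename, status, body) {
  if (!isUsableReport(status, body)) return false;
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, filename), body);
  return true;
}
```

Skipping the write leaves no file on disk for that day, so a later download attempt can still fetch the report once it has been published.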

if this happens when running locally, just delete any tiny files in the dir ./downloads/daily-reports
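For anyone who prefers not to hunt for those files by hand, a small Node sketch that lists suspiciously small files in a directory. The helper name `findTinyFiles` and the 100-byte threshold are assumptions for illustration; adjust the threshold to whatever "tiny" means for your downloads.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical cleanup helper: returns the paths of regular files in `dir`
// that are smaller than `maxBytes`. The default threshold is an assumption.
function findTinyFiles(dir, maxBytes = 100) {
  return fs
    .readdirSync(dir)
    .map((name) => path.join(dir, name))
    .filter((file) => {
      const stat = fs.statSync(file);
      return stat.isFile() && stat.size < maxBytes;
    });
}

// Usage: review the list first, then uncomment the loop to delete.
// for (const file of findTinyFiles('./downloads/daily-reports')) {
//   fs.unlinkSync(file);
// }
```

Listing before deleting keeps the step reversible, since nothing is removed until the list has been checked.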

[Screenshot: Screen Shot 2020-05-06 at 12 06 41 PM]

otheus commented 4 years ago

Thanks for checking! Your tool is incredibly valuable and I'm trying to raise awareness of its utility.

Any chance this also happened with Spain on 4/23? France on 4/21 and 4/28? Japan on 4/27? Italy on 3/11?

octopicorn commented 4 years ago

I will check
