aatishb / covidtrends

Tracking the growth of COVID-19 Cases worldwide
https://aatishb.com/covidtrends/
MIT License

Incorrect date for some users #131

Closed reaganch closed 4 years ago

reaganch commented 4 years ago

The animation seems to stop at the date 2020-04-21. Any idea why data for the last couple of days aren't showing up?

aatishb commented 4 years ago

Hi, I'm not seeing this on my end. Could you try refreshing the page? Also what browser are you using? Thanks.

alekhe commented 4 years ago

Same here. Only until 2020-04-22. Tested in Chrome and Edge.

image

reaganch commented 4 years ago

Just tried on Firefox and Chrome. Clearing browsing cache and refreshing doesn't help. As @alekhe mentioned though, the animation now goes until 2020-04-22.

aatishb commented 4 years ago

Thanks @reaganch @alekhe. Are there any errors displayed if you pull up the JavaScript console?

Looks like this could be a bug. I'm at a bit of a loss as to what might be causing this; my best guess is some kind of caching issue. I have flagged this so more folks can take a look. It would be good to know how we can reproduce this.

reaganch commented 4 years ago

image

Is this what you asked for? Sorry, I'm not very familiar with these features of Firefox.

aatishb commented 4 years ago

@reaganch yes, that's it, thanks. I don't see any errors on there, other than the usual warnings.

rpkoller commented 4 years ago

isn't that an issue with the data source perhaps? if i download https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv and take a look at the first line:

Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,2/19/20,2/20/20,2/21/20,2/22/20,2/23/20,2/24/20,2/25/20,2/26/20,2/27/20,2/28/20,2/29/20,3/1/20,3/2/20,3/3/20,3/4/20,3/5/20,3/6/20,3/7/20,3/8/20,3/9/20,3/10/20,3/11/20,3/12/20,3/13/20,3/14/20,3/15/20,3/16/20,3/17/20,3/18/20,3/19/20,3/20/20,3/21/20,3/22/20,3/23/20,3/24/20,3/25/20,3/26/20,3/27/20,3/28/20,3/29/20,3/30/20,3/31/20,4/1/20,4/2/20,4/3/20,4/4/20,4/5/20,4/6/20,4/7/20,4/8/20,4/9/20,4/10/20,4/11/20,4/12/20,4/13/20,4/14/20,4/15/20,4/16/20,4/17/20,4/18/20,4/19/20,4/20/20,4/21/20,4/22/20,4/23/20

then the data is only available up to the 23rd of april right now. so maybe it isn't a problem with js and covidtrends, and more of an issue with the JHU data set and reporting?
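The same check can be scripted rather than done by eye. This is only a rough sketch: `lastHeaderColumn` is a hypothetical helper, not part of covidtrends, and the header string is a shortened copy of the one quoted above.

```javascript
// Hypothetical helper: return the last column name of a CSV header row.
// Assumes the header contains no quoted commas, which holds for the row quoted above.
function lastHeaderColumn(headerLine) {
  const columns = headerLine.trim().split(',');
  return columns[columns.length - 1];
}

// Shortened header in the same shape as the JHU file.
const sampleHeader = 'Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,4/22/20,4/23/20';
console.log(lastHeaderColumn(sampleHeader)); // "4/23/20"
```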

aatishb commented 4 years ago

To me it sounded like the issue was they were pulling a version of the data that was about a day older than the most recent. Also received a message from another person running into what seems to be the same issue.

To clarify: while we are all seeing data 1 day behind the current date, I think some folks may be seeing data from 2 days prior.

rpkoller commented 4 years ago

Since the data from Johns Hopkins was a bit behind over the last few days, I rechecked today:

Screenshot 2020-04-28 at 10 40 57

So the data set already has numbers for the 27th as well, but Safari 13.1 on macOS 10.13.6 only shows data up to the 26th. So it is reproducible for me as well now (I never noticed before and always thought the data was just coming in with a bit of delay). :/ Since it is one day off, maybe some sort of array miscount?

aatishb commented 4 years ago

Hmm.. interesting. Thanks for sharing this. I doubt it's an array miscount as only some users are seeing this, so I suspect that it may be a caching issue instead.

rpkoller commented 4 years ago

@aatishb hmmm i guess it is not necessarily or entirely a caching issue. i've rechecked the curve for germany right now:

Screenshot 2020-05-01 at 03 35 40

now i went to the github repo of johns hopkins university: https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

scrolled to the latest value in the row for germany:

Screenshot 2020-05-01 at 03 35 54

and then scrolled up to check what the header of that latest column is:

Screenshot 2020-05-01 at 03 36 00

so the pane shows 161539, which is the correct and latest available value in the csv file. in the csv file that is the value for the 29th of april, but the covidtrends label shows the 28th of april as the latest date. on that date the value in the csv is the previous one, 159912.

same e.g. for brazil: Screenshot 2020-05-01 at 03 42 20 Screenshot 2020-05-01 at 03 42 29

aatishb commented 4 years ago

@rpkoller Thanks for this. Great that you are able to reproduce this.

Could you try adding a random string to the urlPostFix variable, i.e. append some random characters to the end of line 323 here?

So, for example, change

let urlPostFix = '?' + Math.floor(Date.now() / (8 * 60 * 60 * 1000));

to

let urlPostFix = '?' + Math.floor(Date.now() / (8 * 60 * 60 * 1000)) + 'randomCharsHere';

Since urlPostFix only changes every 8 hours, I'm wondering if perhaps you are pulling a cached copy from before the most recent data update?
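To illustrate the 8-hour window, here is a minimal sketch that evaluates the same expression at fixed timestamps instead of `Date.now()`. The helper name `urlPostFixAt` and the constant `EIGHT_HOURS_MS` are made up for this example and are not in the covidtrends code.

```javascript
const EIGHT_HOURS_MS = 8 * 60 * 60 * 1000;

// Same expression as urlPostFix, but with the timestamp passed in (hypothetical helper).
function urlPostFixAt(timestampMs) {
  return '?' + Math.floor(timestampMs / EIGHT_HOURS_MS);
}

const oneAmUtc = Date.UTC(2020, 3, 30, 1, 0, 0); // 2020-04-30 01:00 UTC (month 3 is April)
const oneHourLater = oneAmUtc + 60 * 60 * 1000;
const eightHoursLater = oneAmUtc + EIGHT_HOURS_MS;

// Requests made within the same 8-hour window share a query string, so a cache may serve the old copy.
console.log(urlPostFixAt(oneAmUtc) === urlPostFixAt(oneHourLater));    // true
// Once the window rolls over, the query string changes and the cached copy is bypassed.
console.log(urlPostFixAt(oneAmUtc) === urlPostFixAt(eightHoursLater)); // false
```

Appending extra characters, as suggested above, forces a new query string immediately instead of waiting for the window to roll over.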

aatishb commented 4 years ago

Also, when you load this URL, what is the last date you see in the header?

https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

This is the URL we pull data from. I'm seeing up to 4/29 currently.

Screen Shot 2020-04-30 at 9 57 52 PM

Finally, if you don't see 4/29, could you try something to break the cache like https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv?1234 and let me know if that changes anything?

rpkoller commented 4 years ago

@aatishb you mean run the site locally and adjust the string for pulling the csv?

aatishb commented 4 years ago

Sorry @rpkoller, I'm re-reading your comment more carefully. I had missed the main point and thought you were running the url-caching branch locally. So the data is correct but the date is wrong. In that case I agree this is probably not a caching issue.

Can you confirm that you see up to 4/29 on this link?

rpkoller commented 4 years ago

Screenshot 2020-05-01 at 04 07 46

so i am able to confirm... and sorry i didn't explicitly mention in my comment that i tried it on https://aatishb.com/covidtrends/ and not locally... it's already late over here...

aatishb commented 4 years ago

Thanks. For the sake of comparison, here's what I see.

Screen Shot 2020-04-30 at 10 08 20 PM Screen Shot 2020-04-30 at 10 08 55 PM

To summarize: we're seeing the same recent data with different dates on the graph (off by one day).

Hmm.. this is making me wonder if this might be a time zone issue.

rpkoller commented 4 years ago

uhhhhhh WOW that is indeed wild... also tricky to figure out the root of the problem initially... but why time zones? i thought the labels just refer to the dates from the row headers?

aatishb commented 4 years ago

@rpkoller thanks for your help in debugging this. Time zones is just a guess based on the fact that this issue has shown up for people in other time zones (Europe, India).

Here is the line that creates the date on the title. The dates are passed through the formatDate function to internationalize the date format to yyyy-mm-dd.

So that might be the first place for us to investigate. cc @edg2s

I'll investigate further but the soonest I can take a close look is over the weekend.

rpkoller commented 4 years ago

ahhhhhh didn't know that the data is getting passed through a formatDate function. Just thought the content of the csv would be simply parsed building the curves and their labels. that way your suspicion might be a possibility. maybe i'll take a look tomorrow as well. but now it is sleep-time. 4:30 am already ;)

aatishb commented 4 years ago

Possibly related:

https://stackoverflow.com/questions/2488313/javascripts-getdate-returns-wrong-date
https://stackoverflow.com/questions/7556591/is-the-javascript-date-object-always-one-day-off

aatishb commented 4 years ago

I just tried typing new Date("2020-04-29") into my browser console, and received the output Date Tue Apr 28 2020 20:00:00 GMT-0400 (Eastern Daylight Time)

Note the different date!

And new Date(2020,4,29) gives Date Fri May 29 2020 00:00:00 GMT-0400 (Eastern Daylight Time), which is the right day of the month but an unexpected month -- it looks like months are zero-indexed.

Confusing!
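The two behaviours above can be checked side by side. This is a small sketch, not code from covidtrends; the variable names are only for illustration.

```javascript
// A date-only ISO string is parsed as midnight UTC.
const fromIsoString = new Date('2020-04-29');
console.log(fromIsoString.getUTCDate()); // 29, regardless of the local time zone
console.log(fromIsoString.getDate());    // 28 in zones behind UTC, 29 otherwise

// The numeric constructor uses local time, and months count from 0.
const fromNumbers = new Date(2020, 4, 29);
console.log(fromNumbers.getMonth()); // 4, which is May
console.log(fromNumbers.getDate());  // 29, read back in local time
```

Mixing a value built in one frame (local or UTC) with a getter or formatter that works in the other is what produces the one-day shift.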

rpkoller commented 4 years ago

hahaha see that was the offset i meant. a value one off is always an indication of an array miscount. like the starting with either zero or one ;) and indeed confusing as hell. timezones is a head scratcher on its own already ;)))

rpkoller commented 4 years ago

Screenshot 2020-05-01 at 05 29 28 Screenshot 2020-05-01 at 05 29 00

that was a local test...

edg2s commented 4 years ago

My original patch didn't use toISOString, but that always outputs UTC. The easiest fix would be to use Date.UTC:

// Months are zero-indexed, so 4 is May. Output shown for a time zone ahead of UTC.
new Date(2020, 4, 1).toISOString().slice(0, 10)
> "2020-04-30"
new Date(Date.UTC(2020, 4, 1)).toISOString().slice(0, 10)
> "2020-05-01"

Living in a GMT country makes these bugs harder to spot too :)
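As a rough sketch of how that idea could be applied to the CSV headers: the function below is a hypothetical stand-in, not the actual formatDate in covidtrends, and it assumes JHU-style "m/d/yy" headers with two-digit years in the 2000s.

```javascript
// Hypothetical stand-in for the site's formatDate; the real implementation may differ.
// Converts a JHU-style "m/d/yy" header into "yyyy-mm-dd" without going through local time.
function formatDateUTC(header) {
  const [month, day, year] = header.split('/').map(Number);
  // Date.UTC takes a zero-indexed month; the two-digit year is assumed to mean 20xx.
  const utcMs = Date.UTC(2000 + year, month - 1, day);
  // toISOString always reports UTC, so it matches the UTC timestamp built above.
  return new Date(utcMs).toISOString().slice(0, 10);
}

console.log(formatDateUTC('4/29/20')); // "2020-04-29"
console.log(formatDateUTC('1/2/20'));  // "2020-01-02"
```

Because both the construction and the formatting happen in UTC, the result does not depend on the viewer's time zone.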

aatishb commented 4 years ago

Fixed! @all-contributors add @reaganch for bug

allcontributors[bot] commented 4 years ago

@aatishb

I've put up a pull request to add @reaganch! :tada:

reaganch commented 4 years ago

Woot! Glad you were able to get to the bottom of this.