IATI / IATI-Dashboard

A dashboard of various metrics, generated nightly from IATI data
http://dashboard.iatistandard.org

Error proof reading CSV for plot generation #596

Closed: akmiller01 closed this 2 years ago

andylolz commented 2 years ago

Perhaps I am misunderstanding, but this doesn’t look like it will have any effect?

akmiller01 commented 2 years ago

Hi @andylolz, the clone of the gist repository was in a somewhat broken state, and was producing a header on history.csv that was tab-separated, whereas the subsequent rows were comma-separated. So row[1] was throwing an IndexError because the header was parsed as a single cell. This could also have been error-proofed by catching IndexErrors, but I thought checking the row length would prevent future index errors and throw out any headers in one step:

[screenshot]
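
For readers without the screenshot, the row-length check described above can be sketched roughly as follows. This is a minimal illustration, not the dashboard's actual plotting code: the function name `read_history`, the header text `date\terrors`, and the in-memory sample are all assumptions made for the example.

```python
import csv
import io


def read_history(lines):
    """Yield (timestamp, error_count) pairs, skipping rows with fewer than two cells.

    Hypothetical helper for illustration only.
    """
    for row in csv.reader(lines):
        # A tab-separated header parses as a single cell under the default
        # comma delimiter, so row[1] would raise IndexError. Checking the
        # length skips such rows (and blank lines) in one step.
        if len(row) > 1:
            yield row[0], int(row[1])


# Assumed sample: a tab-separated header followed by comma-separated rows.
sample = io.StringIO(
    "date\terrors\n"
    "2022-08-23 20:19:59 +0100,511\n"
    "2022-08-20 03:54:39 +0100,512\n"
)
rows = list(read_history(sample))
print(rows)
```

Compared with catching IndexError, the length check drops the malformed header without an exception, though it would also silently drop any other short row.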

andylolz commented 2 years ago

Oh I seeee! Apologies yes, that does make sense! (I saw len(rows) > 1 and thought len(rows) > 0!)

That first row looks very weird though, I don’t know how or why that would happen. If the gist clone is corrupted, maybe it’s worth doing a rm -rf f117c9be138aa94c9762d57affc51a64? I just tried running the relevant bit of code (linked below) and it worked correctly and generated a nice history.csv without that weird header row.

https://github.com/IATI/IATI-Dashboard/blob/eb56efbf7dbda4b24ac97a7799d8d1f2292dd497/fetch_data.sh#L13-L25

andylolz commented 2 years ago

In fact, the number of errors shown in your screenshot (449) isn’t correct at that timestamp (2021-09-28 16:39:19 +0100).

That’s commit b8e5f161, so I think it should say 245 errors.

Definitely seems like you are correct – the local gist repo is broken. Re-cloning should hopefully sort it out.

akmiller01 commented 2 years ago

It's definitely an odd one. I've looked through the full commit history of fetch_data.sh as well as the gists, and at no point would anything produce a header. So the only conclusion I can draw is that at some point someone manually created a dummy history.csv file and put it in place on production. I've done a git reset --hard origin/master on the gist, and brought it back from a detached HEAD state, so we can see if that rectifies the problem.

akmiller01 commented 2 years ago

Yes, that was sufficient. This code isn't necessary when the CSV is generated correctly:

$ head history.csv 
2022-08-23 20:19:59 +0100,511
2022-08-20 03:54:39 +0100,512
2022-08-19 20:20:27 +0100,510
2022-08-19 17:18:58 +0100,509
2022-08-19 14:26:42 +0100,513
2022-08-19 11:09:24 +0100,510
2022-08-19 08:20:34 +0100,514
2022-08-19 05:56:55 +0100,512
2022-08-19 04:04:26 +0100,513
2022-08-18 23:11:54 +0100,511

andylolz commented 2 years ago

> the only conclusion I can draw is that at some point someone manually created a dummy history.csv file and put it in place on production

Hmm – but the contents of history.csv are overwritten every time fetch_data.sh is run. So I don’t think that can be the reason.

Weird one, but good it’s now sorted!