enram / data-repository

Data quality assessment
https://enram.github.io/data-repository/
MIT License

Indicate when data was last added for country/radar #48

Closed (peterdesmet closed 6 years ago)

peterdesmet commented 6 years ago

By @adokter:

@stijnvanhoey @peterdesmet For the past few days the data seem to be flowing in again; maybe something broke in Sweden that was solved when people got back from Christmas.

Without any doubt such hiccups will happen more often in the future. It would be good if we could see relatively quickly when certain radars stop providing data. A crude but simple solution would be to show the date of the latest addition next to each country folder and radar folder at http://enram.github.io/data-repository/. Is that a tweak that could be made before the start of this spring migration season (Mar 1), or does that require a real project?

peterdesmet commented 6 years ago

You mean adding a last modified date like this?

[Screenshot, 2018-01-12 at 12:15:39]

Unfortunately, directories are not objects (with properties) in S3, so we can't add a property like last modified. As mentioned in this Stack Overflow answer, we'd need to keep our own log file to have that kind of information. Since we upload daily, we could have an automatically updated file radars.csv in the root with:

bejab 2018-01-01
bewid 2017-12-06
...
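A minimal sketch of how such a log could be derived, assuming we already have (key, last modified) pairs from a bucket listing. The key layout country/radar/... and both function names are hypothetical, not the actual pipeline code:

```python
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def latest_per_radar(objects: Iterable[Tuple[str, datetime]]) -> Dict[str, datetime]:
    """Return the most recent last-modified time per country+radar code.

    Assumption: keys look like "be/jab/2018/01/01/file.h5", i.e. the first
    two path segments are the country and radar folders.
    """
    latest: Dict[str, datetime] = {}
    for key, modified in objects:
        parts = key.split("/")
        if len(parts) < 3:
            continue  # object is not inside a country/radar folder
        code = parts[0] + parts[1]  # e.g. "be" + "jab" -> "bejab"
        if code not in latest or modified > latest[code]:
            latest[code] = modified
    return latest


def to_log_lines(latest: Dict[str, datetime]) -> List[str]:
    """Format the result as "<countryradar> <YYYY-MM-DD>" lines, sorted by code."""
    return [f"{code} {latest[code]:%Y-%m-%d}" for code in sorted(latest)]
```

Because S3 only stores last modified on objects, the folder-level date has to be aggregated from the objects under each prefix like this, or tracked at upload time.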

Would that be sufficient?

adokter commented 6 years ago

Yes filling out your red box is what I meant. But one file with latest modification date for each radar would be good enough too.

peterdesmet commented 6 years ago

I'll let @stijnvanhoey answer what deadline is possible for creating (and updating) such a file. 😊

stijnvanhoey commented 6 years ago

I can quickly provide the following information:

stijnvanhoey commented 6 years ago

The code to create a file radars.csv is integrated with the coverage procedure. It needs further testing and will be available as soon as possible. The file looks as follows:

countryradar,datetime
bejab,2016-10-09 23:50
bewid,2018-01-13 17:00
bezav,2016-10-09 23:50
bgvar,2016-10-09 23:55
ctcdv,2016-10-09 23:56
ctpda,2016-10-09 23:56
czbrd,2016-10-09 23:45
czska,2016-10-09 23:45
deboo,2018-01-15 07:30
...

First version of the file uploaded with the library (not yet the pipeline): check http://enram.github.io/data-repository/
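A file in the format shown above could be used to spot radars that stopped delivering data. This is only a sketch: the function name stale_radars, the two-day threshold, and the fixed "now" are assumptions chosen for illustration, and the sample rows are copied from the listing above.

```python
import csv
import io
from datetime import datetime, timedelta
from typing import List

# A few rows copied from the radars.csv example in this thread.
SAMPLE = """countryradar,datetime
bejab,2016-10-09 23:50
bewid,2018-01-13 17:00
deboo,2018-01-15 07:30
"""


def stale_radars(csv_text: str, now: datetime,
                 max_age: timedelta = timedelta(days=2)) -> List[str]:
    """Return radar codes whose latest data is older than max_age."""
    stale = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        latest = datetime.strptime(row["datetime"], "%Y-%m-%d %H:%M")
        if now - latest > max_age:
            stale.append(row["countryradar"])
    return stale


if __name__ == "__main__":
    # Fixed reference time so the result is reproducible.
    print(stale_radars(SAMPLE, now=datetime(2018, 1, 15, 12, 0)))
```

With the reference time set to 2018-01-15 12:00, only bejab exceeds the two-day threshold in this sample.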

stijnvanhoey commented 6 years ago

The creation of the radars.csv file is now part of the daily data pipeline as well and is running as required. We can extend the functionality later (automatic alerts, ...), but I'm closing this topic for now.

adokter commented 6 years ago

thanks for sorting this out so quickly!

peterdesmet commented 6 years ago

Nice! Maybe rename the header datetime to datetime_latest_data?

plieper commented 6 years ago

Hi guys,

I had a look at the repo to see if data is coming in regularly. The coverage.csv comes in handy for that. One question though: what is the number in the column vp_files? Is it the total number of vp files for the last day on which data was transferred, or the total number of vp files available in the repo for that radar?

adokter commented 6 years ago

it's by day

peterdesmet commented 6 years ago

Pretty sure it is the number of vp files for that date.

plieper commented 6 years ago

Thx!