pgdr closed this issue 7 years ago
I'm quite sure, if air_in_bergen.py is still in use, that the offending line is:
current_date = date.today().isoformat()
# ...
ts = current_date + "T" + df[i][tid] + ":00Z"
See air_in_bergen.py#L51.
In any case, in the database, the following entry can be found:
"BG_1_PM25";"2017-02-25 00:01:39.195074+01";"2017-02-26 00:00:00+01";14.6;f
The first timestamp is time_recieved (sic) and the latter is timestamp_data.
If air_in_bergen.py is still in use, we should use datetime.strftime correctly.
Looks good. It would be nice to get correct timestamps on these sensors :)
I think I understand what's wrong now. It's actually in the line
current_date = date.today().isoformat()
which completely ignores timezones and so, once a day, gives a date that differs from the one displayed at luftkvalitet.info.
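A minimal sketch of deriving the date in an explicit timezone instead of relying on date.today(), which uses whatever local timezone the host happens to be configured with. The function name site_local_date and the choice of Europe/Oslo are assumptions for illustration; neither comes from air_in_bergen.py.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+


def site_local_date(now, tz):
    """Return the ISO calendar date of the aware datetime `now` as seen in `tz`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    return now.astimezone(tz).date().isoformat()


if __name__ == "__main__":
    # Assumption: luftkvalitet.info displays Norwegian local time.
    oslo = ZoneInfo("Europe/Oslo")
    # Late in the UTC evening, the UTC date and the Norwegian date disagree.
    late = datetime(2017, 2, 25, 23, 30, tzinfo=timezone.utc)
    print(site_local_date(late, timezone.utc))
    print(site_local_date(late, oslo))
```

Passing the timezone in explicitly keeps the result independent of how the server itself is configured.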
Okay, this will still be wrong; we need to change the URIs. If you look at one sensor, you see that we get a full timestamp, that is, date+time.
If we parse the table with several sensors, we only get the clock, and not the date to which the clock belongs.
That means that if we read the website at 2017-03-13T00:01:00 and the website hasn't updated yet, we will read the clock 23:00 and a value, and interpret that as 2017-03-13T23:00:00. If we received the full timestamp, we would get 2017-03-12T23:00:00.
Can we change the scraping to a site which exposes the full timestamp?
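As long as the table only gives the clock, one possible workaround is to roll the date back a day whenever the scraped clock would otherwise lie in the future. This is only a sketch: resolve_timestamp is a hypothetical helper, not something in air_in_bergen.py, and it assumes every reading is less than 24 hours old.

```python
from datetime import datetime, timedelta, timezone


def resolve_timestamp(clock, now):
    """Combine a scraped "HH:MM" clock reading with the most recent matching date.

    `now` must be a timezone-aware datetime in the same timezone as the website.
    """
    hour, minute = (int(part) for part in clock.split(":"))
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # A measurement cannot come from the future: if the reading is ahead of
    # `now`, it must belong to the previous day.
    if candidate > now:
        candidate -= timedelta(days=1)
    return candidate


if __name__ == "__main__":
    cet = timezone(timedelta(hours=1))  # fixed offset, for illustration only
    now = datetime(2017, 3, 13, 0, 1, tzinfo=cet)
    print(resolve_timestamp("23:00", now).isoformat())
```

A sensor that has been stale for more than a day would still be dated wrongly, so this does not replace getting the full timestamp from the source.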
Hehe, this wasn't easy. I haven't found any other sites that we can use. The best option would then be to get access to their API, if they have one.
@njberland You have some contacts in NILU - can you ask them how we can access their data in a better way?
I changed the cron job (or whatever it is) in the Friskby admin to run at 10 past each hour instead of on the hour. It is not the solution(!), but it may solve the issue in many cases.
Very good! I thought about something like that (but actually around 15 or even 30 minutes past the hour) yesterday, but couldn't figure out how or where to do it. I think 10 past works; we can even verify that in our rawdata table, e.g. timestamp received: 09:11:40 and timestamp data: 09:00:00.
Compare that to before the cron update, e.g. timestamp received: 09:01:50 and timestamp data: 08:00:00.
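The same check can be written as a small calculation. This sketch uses the two example pairs above; data_lag and looks_current are hypothetical names, and the one-hour threshold is an assumption based on the sensors reporting hourly.

```python
from datetime import datetime, timedelta


def data_lag(received, data):
    """Return how far the data timestamp trails the receive time.

    Both arguments are "HH:MM:SS" strings from the same day.
    """
    fmt = "%H:%M:%S"
    return datetime.strptime(received, fmt) - datetime.strptime(data, fmt)


def looks_current(received, data):
    """True if the data is less than one hour older than the time it was received."""
    return data_lag(received, data) < timedelta(hours=1)


if __name__ == "__main__":
    print(data_lag("09:11:40", "09:00:00"), looks_current("09:11:40", "09:00:00"))
    print(data_lag("09:01:50", "08:00:00"), looks_current("09:01:50", "08:00:00"))
```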
@njberland requested changes to luftkvalitet.info: it's enough if they simply add the date stamp somewhere (preferably in the table) on their webpage. As it stands today, it is impossible to know the date of the previous measurement on Danmarksplass.
Seems fixed now, thanks @oyta
When we are given data from BG sensors at midnight (say 00:00, the midnight between day 1 and day 2), we interpret the data as belonging to the next midnight (the end of day 2).