Open · sfkeller opened this issue 8 years ago
After reading a lot of the project's code and experimenting a bit with the `Zoom10Tiles.csv` file's content, I think that `check_data_validity()` returned `False` on the days when the bot recently set its Twitter status to "Trending places bot has been unable to find data for the last few days... It will return tomorrow". As we don't explicitly pass the period length with `--period`, the default of 7 is used, so that check fails if there are fewer than 7 unique dates with tile log data.
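
For reference, that check essentially boils down to something like this (my sketch, not the actual `check_data_validity()` code; the `date` column name and CSV layout are assumptions):

```python
# Illustrative sketch only, not the project's actual check_data_validity();
# the "date" column name and CSV layout are assumptions.
import csv

def has_enough_days(csv_path, period=7):
    """True if the tile log CSV covers at least `period` distinct dates."""
    dates = set()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            dates.add(row["date"])
    return len(dates) >= period

# With the default --period of 7, anything under 7 unique dates makes the check fail:
# has_enough_days("Zoom10Tiles.csv")
```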
I think there were two reasons why this was the case recently:
1. `./main.sh` (and with it, `Fetch2.py`) are run at midnight (00:00 UTC), for the 7-day period ending 2 days ago.
2. Sometimes the last anonymized logfile (the one for the day before the day that just ended) isn't present on http://planet.openstreetmap.org/tile_logs/ at that time yet, and thus won't be downloaded by `Fetch2.py` (see the availability probe sketched below).
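
To make reason 2 concrete, here's a rough way to probe whether a given day's logfile has been published yet (sketch only; the `tiles-YYYY-MM-DD.txt.xz` naming is my reading of the directory listing, not taken from `Fetch2.py`):

```python
# Availability probe (sketch). Assumes the anonymized logs are published as
# tiles-YYYY-MM-DD.txt.xz under /tile_logs/ -- the exact naming is an assumption.
import datetime
import urllib.error
import urllib.request

def tile_log_available(day):
    """HEAD-request the anonymized tile log for `day` (a datetime.date)."""
    url = "http://planet.openstreetmap.org/tile_logs/tiles-{}.txt.xz".format(day.isoformat())
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=30):
            return True
    except (urllib.error.HTTPError, urllib.error.URLError):
        return False

# At 00:00 UTC the last file the 7-day window needs is sometimes not up yet:
# tile_log_available(datetime.date.today() - datetime.timedelta(days=2))
```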
To fix the first reason, I'd change the cron job time to a later time in the (UTC) day, e.g. 3:00 AM.
For the second reason I'm not so sure. How should we handle missing logs, @sfkeller? Look further into the past to get enough historical data? Interpolate data for the missing dates?
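
If we go with "look further into the past", it could look roughly like this (sketch only; `is_available` would be something like the hypothetical probe above, and none of this is current project behaviour):

```python
# Sketch of the "look further into the past" option: step backwards from the end
# of the reporting window until enough days with available logs are found.
import datetime

def collect_available_days(end_day, is_available, needed=7, max_lookback=21):
    """Walk backwards from end_day (a datetime.date) and collect up to `needed`
    days for which a logfile exists, giving up after max_lookback calendar days."""
    found = []
    day = end_day
    for _ in range(max_lookback):
        if is_available(day):
            found.append(day)
            if len(found) == needed:
                break
        day -= datetime.timedelta(days=1)
    return found

# e.g. collect_available_days(datetime.date.today() - datetime.timedelta(days=2),
#                             tile_log_available)
```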
`Fetch2.py` currently expects all parameters (`--date_from`, `--date-to`) to be valid dates before midnight. But e.g. a file dated "2016-07-30 00:04" in the log file directory still refers to "2016-07-29".
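
One way around that off-by-one might be to derive the date from the file name instead of the listing timestamp, roughly (sketch only; the name pattern is an assumption as noted above):

```python
# Sketch: take the date a logfile covers from its name, not from the listing
# timestamp. The tiles-YYYY-MM-DD.txt.xz pattern is an assumption, not
# confirmed against Fetch2.py.
import datetime
import re

NAME_RE = re.compile(r"tiles-(\d{4}-\d{2}-\d{2})\.txt\.xz$")

def log_date_from_filename(filename):
    """Return the date encoded in a tile log file name, or None if it doesn't match."""
    m = NAME_RE.search(filename)
    if not m:
        return None
    return datetime.datetime.strptime(m.group(1), "%Y-%m-%d").date()

# A file listed as modified at "2016-07-30 00:04" but named tiles-2016-07-29.txt.xz
# still covers 2016-07-29.
print(log_date_from_filename("tiles-2016-07-29.txt.xz"))  # 2016-07-29
```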