Closed by Ace-Of-Snakes 3 hours ago
The final approach was to use Python libraries such as pytz to re-scrape the 113k data points by latitude and longitude and determine their timezones (this took about 4 hours). Once that finished, the UTC offset was calculated for each timezone (for example, Berlin = UTC+1). The large database was then split by integer UTC offset, from UTC-10 to UTC+14, creating roughly 24 new CSVs. The Concurrent-Script was rewritten to detect which UTC offset currently has the local time 00:00 (important for data reasons) and runs every hour on the server accordingly.
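The hourly midnight check described above could look roughly like the sketch below. It is not the actual Concurrent-Script: the function names and the CSV naming scheme (`cities_utc+01.csv`) are assumptions made for illustration, and only the offset range UTC-10 to UTC+14 is taken from the comment.

```python
from datetime import datetime, timezone

# Offset range taken from the split described above (UTC-10 .. UTC+14).
MIN_OFFSET, MAX_OFFSET = -10, 14


def offsets_at_midnight(now_utc):
    """Return every integer UTC offset whose local hour is currently 00."""
    return [
        offset
        for offset in range(MIN_OFFSET, MAX_OFFSET + 1)
        if (now_utc.hour + offset) % 24 == 0
    ]


def csv_name_for_offset(offset):
    """Hypothetical file name for one offset bucket, e.g. 'cities_utc+01.csv'."""
    return f"cities_utc{offset:+03d}.csv"


if __name__ == "__main__":
    # Intended to be triggered once per hour (e.g. by cron) on the server.
    now = datetime.now(timezone.utc)
    for offset in offsets_at_midnight(now):
        print(f"{now:%H:%M} UTC -> crawl {csv_name_for_offset(offset)}")
```

Note that UTC-10 and UTC+14 are exactly 24 hours apart, so both buckets reach 00:00 in the same hour (10:00 UTC) and the function returns two offsets for that run.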
After enlarging the city database to roughly 113k cities, a new approach is needed for crawling them every day.