uvacw / inca

Standardise timezones for RSS and non-RSS scrapers #278

Open theoaraujo opened 6 years ago

theoaraujo commented 6 years ago

Each scraper should state explicitly which timezone its timestamps are in, and should preferably include a cleaning step that adds an extra key holding the timestamp converted to a standard timezone.

damian0604 commented 6 years ago

This can probably be achieved by making the datetime object timezone-aware.

Right now, dates are parsed and stored as naive datetimes, without any timezone information:

from datetime import datetime
publication_date = datetime(2017,12,1)

Instead, we could do:

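import pytz
from datetime import datetime
# Use localize() rather than tzinfo=...: passing a pytz timezone to the
# datetime constructor picks the zone's historical local-mean-time offset.
publication_date = pytz.timezone('Europe/Amsterdam').localize(datetime(2017, 12, 1))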

pytz.all_timezones is a list of all timezone names that pytz supports.
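
To make the "extra key at a standard timezone" idea concrete, here is a minimal sketch rather than a proposal for the actual INCA code. It assumes a scraped document is a plain dict with a naive `publication_date`; the function name `add_utc_timestamp`, the key `publication_date_utc`, and the default zone are hypothetical. It uses the standard-library `zoneinfo` module (Python 3.9+) instead of pytz, because `zoneinfo` zones can be attached directly with `replace(tzinfo=...)`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def add_utc_timestamp(doc, source_tz="Europe/Amsterdam"):
    """Return a copy of doc with an aware date and a UTC copy of that date.

    Assumes doc["publication_date"] is a naive datetime expressed in the
    scraper's local timezone (source_tz). Key names are illustrative only.
    """
    # Attach the scraper's timezone to the naive datetime.
    local = doc["publication_date"].replace(tzinfo=ZoneInfo(source_tz))

    out = dict(doc)
    out["publication_date"] = local
    # Store the same instant in UTC under a separate, standardised key.
    out["publication_date_utc"] = local.astimezone(timezone.utc)
    return out


doc = add_utc_timestamp({"publication_date": datetime(2017, 12, 1)})
print(doc["publication_date"].isoformat())      # 2017-12-01T00:00:00+01:00
print(doc["publication_date_utc"].isoformat())  # 2017-11-30T23:00:00+00:00
```

Each scraper would then only need to declare its own `source_tz`, while anything downstream could compare documents on the UTC key.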