Closed denisalevi closed 8 years ago
@clauslang @ClaudiaWinklmayr @inesw @akresnia @gkBCCN Here is the updated version of the minimal dict:
{'city': 'cassel',
 'daily': {'1': {'high': 7.0,
                 'low': 3.0,
                 'rain_amt': 1.325,
                 'rain_chance': 78.75,
                 'wind_speed': 13.5},
           ...},
 'date': 26042016,
 'hourly': {'00': {'humidity': 65.0,
                   'rain_amt': 0.0,
                   'rain_chance': 15.0,
                   'temp': 5.0,
                   'wind_speed': 3.33},
            ...},
 'site': 1}
Accepted city names are now in English: ["berlin", "hamburg", "munich", "cologne", "frankfurt", "stuttgart", "bremen", "leipzig", "hanover", "nuremberg", "dortmund", "dresden", "cassel", "kiel", "bielefeld", "saarbruecken", "rostock", "freiburg", "magdeburg", "erfurt"]
Accepted site IDs: 0, 1, 2, 3, 4
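For anyone who wants to sanity-check their output before running the real tests, here is a minimal sketch of a structure check based only on the example dict and the accepted values above. The function name `check_minimal_dict` and the key sets are my own assumptions, not the actual test code, which may check more (or different) things.

```python
# Hypothetical structure check for the minimal dict; NOT the real test script.
ACCEPTED_CITIES = ["berlin", "hamburg", "munich", "cologne", "frankfurt",
                   "stuttgart", "bremen", "leipzig", "hanover", "nuremberg",
                   "dortmund", "dresden", "cassel", "kiel", "bielefeld",
                   "saarbruecken", "rostock", "freiburg", "magdeburg", "erfurt"]
ACCEPTED_SITES = {0, 1, 2, 3, 4}
# Key sets taken from the example dict above (assumed to be complete).
DAILY_KEYS = {"high", "low", "rain_amt", "rain_chance", "wind_speed"}
HOURLY_KEYS = {"humidity", "rain_amt", "rain_chance", "temp", "wind_speed"}


def check_minimal_dict(data):
    """Return a list of problems; an empty list means the structure looks fine."""
    problems = []
    for key in ("city", "date", "site", "daily", "hourly"):
        if key not in data:
            problems.append("missing top-level key: %s" % key)
    if problems:
        return problems
    if data["city"] not in ACCEPTED_CITIES:
        problems.append("unknown city: %r" % data["city"])
    if data["site"] not in ACCEPTED_SITES:
        problems.append("unknown site id: %r" % data["site"])
    if not isinstance(data["date"], int):
        problems.append("date should be an int, e.g. 26042016")
    for section, expected in (("daily", DAILY_KEYS), ("hourly", HOURLY_KEYS)):
        for label, values in data[section].items():
            missing = expected - set(values)
            if missing:
                problems.append("%s[%r] is missing %s"
                                % (section, label, sorted(missing)))
    return problems
```

Calling `check_minimal_dict(data_dic)` gives you readable messages instead of a bare assertion failure, which may help while debugging a scraper.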
@clauslang @ClaudiaWinklmayr @inesw @akresnia @gkBCCN This is how I use the test:
...
import test_scraper_output as tester

def scrape(date, city):
    """Scrape data for given date and city.

    :param date: should be in the format 30-05-2016
    :param city: should be the English city name, e.g., cologne, cassel, munich
    """
    # get date id, e.g. '30-05-2016' -> 30052016
    dateInt = int(date.replace('-', ''))
    # scrape full data dictionary
    data_dic = {'site': 1,  # 'wetter.com' id = 1
                'city': city,
                'date': dateInt,
                'hourly': scrape_hourly(date, city),
                'daily': scrape_daily(date, city)}
    # run tests
    assert tester.run_tests(data_dic)
    # TODO add data to data base
    # return nothing
...
I import the test script as 'tester'. In the scrape function I build the full data dictionary. Then I call run_tests(data_dic), passing it the full data dictionary in the above format. The method just returns True if all tests pass.
Let me know if something does not work out. We probably have to adapt the tests for every provider, see #31
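One detail about the date id that may be worth keeping in mind: because it is stored as an int, a leading zero in the day is dropped (05-06-2016 becomes 5062016). Below is a small sketch of the conversion in both directions. The helper names `date_to_id` and `id_to_date` are hypothetical and not part of the test script.

```python
def date_to_id(date):
    """Convert a 'DD-MM-YYYY' string to the integer date id."""
    return int(date.replace("-", ""))


def id_to_date(date_id):
    """Convert an integer date id back to 'DD-MM-YYYY'."""
    # zfill restores the leading zero that the int conversion dropped
    digits = str(date_id).zfill(8)
    return "%s-%s-%s" % (digits[:2], digits[2:4], digits[4:])


print(date_to_id("30-05-2016"))   # 30052016
print(date_to_id("05-06-2016"))   # 5062016 (leading zero is gone)
print(id_to_date(5062016))        # 05-06-2016
```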
The tests will be a bit less strict on the city names in the dictionary. You can use any of:
["berlin", "hamburg", "munich", "cologne", "frankfurt", "stuttgart", "bremen", "leipzig", "hanover", "nuremberg", "dortmund", "dresden", "kassel", "kiel", "bielefeld", "saarbruecken", "rostock", "freiburg", "magdeburg", "erfurt", "saarbrücken", "münchen", "koeln", "nuernberg", "köln"]
And I can also add more if yours are different.
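If you would rather map your provider's spelling onto the English names yourself before building the dictionary, something like the following could work. This is only a sketch: the `CITY_ALIASES` mapping and `normalize_city` are my own hypothetical names, and the tests do not require this step.

```python
# Hypothetical alias table: German/ASCII spellings -> English names.
CITY_ALIASES = {
    "münchen": "munich",
    "köln": "cologne",
    "koeln": "cologne",
    "nuernberg": "nuremberg",
    "saarbrücken": "saarbruecken",
}


def normalize_city(name):
    """Lower-case the name and replace known aliases; other names pass through."""
    key = name.strip().lower()
    return CITY_ALIASES.get(key, key)


print(normalize_city("Köln"))     # cologne
print(normalize_city("berlin"))   # berlin
```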
If you need a function that finds the full file name of the HTML file given only the date and city, you could use:
import os

def get_filename(dirpath, date, city, mode='hourly'):
    """Looks up filename of the html file in dirpath for given date and city

    :param dirpath: relative path to the data directory
    :param date: date in the format 31-05-2016
    :param city: city as string
    :param mode: daily or hourly data
    """
    path = None
    filelist = os.listdir(dirpath)
    for f in filelist:
        if (date in f) and (city in f) and (mode in f):
            path = f
    return path
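A quick usage sketch for the function above, run against a temporary directory. The file names here are made up for illustration; your provider's naming scheme will differ. Note that the function returns the bare file name (or None if nothing matches), so you still need to join it with the directory before opening the file.

```python
import os
import tempfile


def get_filename(dirpath, date, city, mode='hourly'):
    """Same lookup as above, repeated so this example runs on its own."""
    path = None
    for f in os.listdir(dirpath):
        if (date in f) and (city in f) and (mode in f):
            path = f
    return path


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, only for this demonstration.
    for name in ("31-05-2016_berlin_hourly.html",
                 "31-05-2016_berlin_daily.html",
                 "31-05-2016_cassel_hourly.html"):
        open(os.path.join(tmp, name), "w").close()

    found = get_filename(tmp, "31-05-2016", "berlin", mode="daily")
    full_path = os.path.join(tmp, found)  # join with dirpath before opening
    missing = get_filename(tmp, "01-06-2016", "berlin")

print(found)    # 31-05-2016_berlin_daily.html
print(missing)  # None
```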
Applies to #36
Done from my side. Tests are passing.
Everyone has implemented this; the test call has been implemented by almost everyone, see #31.
Here is also the minimal dictionary that @janfb set up. Everybody, please check that your output is of the right form (using the test from @janfb?). @janfb, can you give an example of how to use your test?
@clauslang @ClaudiaWinklmayr @inesw @akresnia @gkBCCN
From @janfb :