ackwell / ninjabot

Ninjabot is not 'Just Another Bot'

weather #55

Open auscompgeek opened 11 years ago

auscompgeek commented 11 years ago

Suggestions:

Here's a whole list of weather APIs (filtered to JSON format).

auscompgeek commented 11 years ago

@keepcalm444 could you please share with us how you grab BoM data?

auscompgeek commented 11 years ago

Just for reference, this is how another bot behaves when you ask it for weather:

<auscompgeek> `bom sydney olympic park nsw
<notabot> Current Weather Conditions supplied by http://www.bom.gov.au for Sydney Olympic Park, NSW (As of 12:04pm):
<notabot> Temp: 18.1'C (64.58'F), App: 14.9'C, Humidity: 64%, Wind: Moderate breeze, Dir: SE, Sp: 19 Gu: 26 km/h, Min: 15.2'C at 01:10am Max: 20.3'C at 10:48am

fourkbomb commented 11 years ago

Here's a gist (it's a PHP script, found in action here).

stationList is only for NSW though. If it's needed I can generate a few for other states.

auscompgeek commented 11 years ago

Here's a Whirlpool thread about the BoM API.

auscompgeek commented 11 years ago

Here's a Gist for generating the list of stations for all the states (with the generated lists themselves in INI format, yay), because the BoM has some screwed up stuff going on.
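
For anyone wanting to consume an INI station list from Python, here is a minimal sketch using the standard library's `configparser`. The layout (one section per state, `station name = station ID`), the sample IDs, and the helper names `load_stations` and `find_station` are all assumptions for illustration; the real lists may be laid out differently.

```python
import configparser

# Hypothetical INI layout: one section per state, "station name = station ID".
# The IDs below are placeholders, not real BoM station IDs.
SAMPLE_INI = """
[NSW]
sydney olympic park = 00000
sydney observatory hill = 00001
"""


def load_stations(ini_text):
    """Return {state: {lowercased station name: station ID}}."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # configparser lowercases option names by default, which gives us
    # case-insensitive station names for free.
    return {state: dict(parser[state]) for state in parser.sections()}


def find_station(stations, state, name):
    """Look up a station ID by state and station name; None if unknown."""
    return stations.get(state.upper(), {}).get(name.strip().lower())


if __name__ == "__main__":
    stations = load_stations(SAMPLE_INI)
    print(find_station(stations, "nsw", "Sydney Olympic Park"))
```

Station IDs are kept as strings so that any leading zeros survive.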

cyphar commented 11 years ago

Does it come with a JSON API? Or do we have to do some type of HTML scraping? @keepcalm444, can you comment? From what @auscompgeek posted, it looks like you're doing scraping using regex. I point you all to: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags.

ackwell commented 11 years ago

A fair chunk of the scraping is done via regex for speed reasons (at least legacy speed reasons, I'm a bit out of touch with the codebase right now). In the Python 2 version, all scraping originally ran straight through BeautifulSoup, but on low-power machines like a Raspberry Pi it was causing lag of up to a second. Regex was significantly faster.

cyphar commented 11 years ago

Fair 'nuff. I would like to see some legitimate speed tests (probably with some form of memory profiling too) to see whether or not this is still true with the current codebase (and with py3k). Also, we should do some cProfile or perf profiling before coming to a decision like that.
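
A rough sketch of what such a speed test could look like, using only the standard library (`timeit` for wall-clock timing, `cProfile` and `pstats` for a call breakdown). Everything here is an assumption for illustration: the HTML fragment is made up, and `html.parser` stands in for BeautifulSoup so the sketch has no third-party dependencies. A real comparison would swap in the bot's actual scraping code and pages, and could add `tracemalloc` for the memory side.

```python
import cProfile
import io
import pstats
import re
import timeit
from html.parser import HTMLParser

# Made-up HTML fragment standing in for a scraped page.
SAMPLE_HTML = "<table>" + "".join(
    f'<tr><td class="temp">{i}.5</td></tr>' for i in range(200)
) + "</table>"

TEMP_RE = re.compile(r'<td class="temp">([^<]+)</td>')


def scrape_regex(html):
    """Pull the text out of every <td class="temp"> cell with a regex."""
    return TEMP_RE.findall(html)


class TempParser(HTMLParser):
    """Collect the text of every <td class="temp"> cell."""

    def __init__(self):
        super().__init__()
        self.in_temp = False
        self.temps = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and ("class", "temp") in attrs:
            self.in_temp = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_temp = False

    def handle_data(self, data):
        if self.in_temp:
            self.temps.append(data)


def scrape_parser(html):
    """Same extraction as scrape_regex, but via a real HTML parser."""
    parser = TempParser()
    parser.feed(html)
    parser.close()
    return parser.temps


def profile(func, html):
    """Run func once under cProfile and return the stats report as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, html)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


if __name__ == "__main__":
    for func in (scrape_regex, scrape_parser):
        seconds = timeit.timeit(lambda: func(SAMPLE_HTML), number=200)
        print(f"{func.__name__}: {seconds:.4f}s for 200 runs")
    print(profile(scrape_parser, SAMPLE_HTML))
```

Numbers from a desktop won't say much about a Raspberry Pi, so this would need to run on the target hardware to settle the question.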

ackwell commented 11 years ago

Yeah I really have no idea. Maybe after exams :3

auscompgeek commented 11 years ago

@cyphar The BoM has a JSON API (sorta) for weather data, however, you need the IDs of each weather station to do this. I'm not sure whether there's an API to grab weather station IDs.

Yes, I used regex to scrape HTML to generate the ID list. See the 2nd top-voted answer to the Stack Overflow thread you linked to. And yeah, what ackwell said.
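
As a sketch of how the JSON route might look once a station ID is known: the URL pattern, the `observations` / `data` nesting, and the field names (`air_temp`, `apparent_t`, `rel_hum`, `wind_dir`, `wind_spd_kmh`, `gust_kmh`) below are assumptions about the BoM feed and should be checked against a real response. The helper names are hypothetical, and the output format just imitates the other bot's reply quoted earlier in this thread.

```python
import json
import urllib.request

# Assumed URL pattern for the BoM observation feed. The product ID and
# station ID have to come from a station list; none are hard-coded here.
BOM_URL = "http://www.bom.gov.au/fwo/{product}/{product}.{station}.json"


def observation_url(product_id, station_id):
    """Build the (assumed) feed URL for one weather station."""
    return BOM_URL.format(product=product_id, station=station_id)


def fetch_latest(product_id, station_id, timeout=10):
    """Download the feed and return the first observation in it.

    Assumes the first entry of observations.data is the most recent reading.
    """
    url = observation_url(product_id, station_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        feed = json.load(resp)
    return feed["observations"]["data"][0]


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def format_observation(obs):
    """Format one observation dict as a single IRC-friendly line."""
    temp = obs["air_temp"]
    return (
        f"Temp: {temp}'C ({c_to_f(temp):.2f}'F), App: {obs['apparent_t']}'C, "
        f"Humidity: {obs['rel_hum']}%, Dir: {obs['wind_dir']}, "
        f"Sp: {obs['wind_spd_kmh']} Gu: {obs['gust_kmh']} km/h"
    )
```

Keeping `format_observation` separate from the network call means the formatting can be tested against a saved sample without hitting the BoM servers.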