CharlotteJackson / DC_Crash_Bot


Scrape Pulsepoint.org #20

Closed. CharlotteJackson closed this issue 3 years ago

CharlotteJackson commented 3 years ago

What is the Task

What do we want to accomplish? Scrape the Pulsepoint website for the locations and times of vehicle crashes.

Why do we want to do this

Why do we want to do this task? Because DC crash data sucks.

How can I get started?

How can we start this task? Read some Beautiful Soup documentation and sample scripts.
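As a rough starting point, here is a minimal sketch of the kind of extraction this task describes: pulling the type, address, and time out of an incident table and keeping only the crashes. Everything about the markup is an assumption made for illustration (the `incidents` table, the `type`/`address`/`time` classes, and the "Traffic Collision" label are hypothetical, as are the sample rows), since the real Pulsepoint page structure is not described here. The sketch uses Python's standard-library `html.parser` so it runs without installing anything; Beautiful Soup would do the same job more concisely.

```python
from html.parser import HTMLParser

# Hypothetical markup and made-up rows: the real page structure is unknown.
SAMPLE_HTML = """
<table id="incidents">
  <tr><td class="type">Traffic Collision</td><td class="address">14TH ST NW &amp; U ST NW</td><td class="time">2021-01-05 17:42</td></tr>
  <tr><td class="type">Medical Emergency</td><td class="address">600 H ST NE</td><td class="time">2021-01-05 17:50</td></tr>
</table>
"""


class IncidentParser(HTMLParser):
    """Collect one dict per <tr>, keyed by each <td>'s class attribute."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row currently being built, or None outside a <tr>
        self._field = None  # class of the <td> currently open, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._field = dict(attrs).get("class")
            if self._field:
                self._row[self._field] = ""

    def handle_data(self, data):
        # Only keep text that sits inside a classed <td>.
        if self._row is not None and self._field:
            self._row[self._field] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def extract_crashes(html):
    """Return only the rows whose type mentions a collision (assumed label)."""
    parser = IncidentParser()
    parser.feed(html)
    parser.close()
    return [r for r in parser.rows if "collision" in r.get("type", "").lower()]


if __name__ == "__main__":
    for crash in extract_crashes(SAMPLE_HTML):
        print(crash["time"], crash["address"])
```

If the site builds its incident list with JavaScript, the HTML returned by a plain request may not contain these rows at all, in which case parsing alone will not be enough.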

Definition of Done

How do we know this task is done? When we have a working pipeline that loads the scraped data into AWS.

banjtheman commented 3 years ago

Some resources:

https://www.reddit.com/r/webscraping/comments/jtml5w/what_is_the_best_way_to_scrape_content_from_a/

https://github.com/gonzoblue/sdfdbot/blob/master/sdfd.py

banjtheman commented 3 years ago

Found the golden goose: https://gist.github.com/Davnit/4a6e7dd94d97a05c3806b306e3d838c6

banjtheman commented 3 years ago

Closed by #22.