Open ElinaMak opened 6 years ago
I really like the angle you picked and all the effort you put into assembling the data. For the three analyses you've done, you could consider graphics other than bar charts to represent your data (for instance, maybe a pie chart for the most common disasters in the US?).
Interesting idea! It would be interesting if there were a way to measure how destructive these disasters were (e.g. which states have fires that destroyed the most property or land).
Update
For the 2nd draft of my project, I actually tried to scrape the website that lists the natural disasters. So there is not a lot new regarding visualizations; rather, it was back to the basics of programming (where more practice will help): scraping, for loops, regex.
I managed to scrape 1,000 events this time (instead of the 100 events in my initial dataset). Hmm, an interesting process with several pitfalls along the way...
Challenges
Firstly, it was not a good idea to scrape the region + type of incident separately from the date of the incident, even though they are on two different lines on the website. I should have scraped the one div class that contains both. But I will keep all those steps in the notebook.
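A rough sketch of the "one div that contains both" idea, for reference. The markup and the class names `event`, `title` and `date` are hypothetical (the real site's HTML is not shown in this thread), and the dates are made up. It uses only the standard library's `html.parser` so it runs without extra installs; with Beautiful Soup the same idea would be to loop over the container divs and read both fields from each one.

```python
from html.parser import HTMLParser

# Hypothetical markup: class names and dates are invented for illustration.
SAMPLE_HTML = """
<div class="event">
  <span class="title">Texas 335 Fire</span>
  <span class="date">March 1, 2018</span>
</div>
<div class="event">
  <span class="title">Arizona 89 East Fire</span>
  <span class="date">July 10, 2018</span>
</div>
"""


class EventParser(HTMLParser):
    """Collect title and date together from each container div."""

    def __init__(self):
        super().__init__()
        self.events = []   # one dict per container div
        self.field = None  # name of the span we are currently inside

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "event":
            self.events.append({})  # a new container starts a new record
        elif tag == "span" and css_class in ("title", "date") and self.events:
            self.field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a title or date span.
        if self.field:
            self.events[-1][self.field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.field = None


parser = EventParser()
parser.feed(SAMPLE_HTML)
events = parser.events
```

Because each record is built from a single container, a title can never end up paired with the date of a different event, which is the risk when the two columns are scraped in separate passes.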
Scraping the 50 pages with Beautiful Soup was also a challenge: the first page had no "1" in its URL, so I scraped it separately and then incremented the page number in the URL for the rest.
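One way to fold the special first page into the same loop is a small helper that builds the URL for any page number. This is only a sketch: `BASE_URL` is a placeholder and the `?page=N` pattern is an assumption, since the real URL scheme is not given here.

```python
# Placeholder address and query pattern; both are assumptions, not the real site.
BASE_URL = "https://example.com/disasters"


def page_url(page: int) -> str:
    """Return the URL for a 1-based page number; page 1 has no number in it."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page == 1:
        return BASE_URL
    return f"{BASE_URL}?page={page}"


# All 50 pages, including the first, from one list comprehension.
urls = [page_url(n) for n in range(1, 51)]
```

With the URLs in one list, the same scraping function can be applied to every page instead of keeping a separate code path for page one.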
You did a lot of work regarding scraping and transforming data! I would probably choose to include only one year in my dataset (for example, just 2017) and not mix years (some months of 2017 and some months of 2018). Looking forward to the final visualizations!
Very interesting idea involving a lot (A LOT) of scraping and cleaning work. I would elaborate more on the graphics part, just to finish up for now, and go back to the rest in the future. Some very interesting data visualisations could come out of this data set.
Hi! I'm a little robot, here for a surprise inspection.
You need some feedback, let me summon @mattrehbein, @SimoneLuc, @hakantan for you
It looks like we need to fix up your update a little bit! Edit it by clicking the pencil in the top right-hand corner. It requires:
Maybe you just didn't use the template? If not, edit your comment, cut and paste the template in, and then fill it out.
It's a really interesting and also a really big topic! For the purposes of the project, it would probably be easier to narrow the focus. Your last graph seems to be getting at a more specific interesting angle, but without an exact title I'm not totally sure what it's showing. But I like the combination of 'these states have the most natural disasters' and 'here's the most common type of disaster in those places.'
And I agree with above comments that a measure of how deadly some of the disasters are would really help give readers an understanding of the impact.
Nice job chasing a lot of data!
Here is the last update regarding the code: Too much scraping!
And I am still collecting data regarding the casualties..
Just by reading it, this seems to be a huge project with regards to getting the data in the first place, so kudos for that.
I would definitely include a name for the x-axis, because right now I can't tell what quantity it shows.
Maybe it would be also interesting to group the states by type of natural disaster? In that case I could easily see if there are mostly fires in state X or landslides in region B etc.
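A minimal sketch of that grouping, assuming the scraped data can be reduced to (state, disaster type) pairs. The rows below are made up for illustration and are not taken from the project's dataset.

```python
from collections import Counter, defaultdict

# Made-up (state, disaster type) rows, for illustration only.
events = [
    ("California", "Fire"),
    ("California", "Fire"),
    ("California", "Mudslide"),
    ("Texas", "Flood"),
    ("Texas", "Fire"),
    ("Texas", "Flood"),
]

# Count disaster types separately for each state.
counts_by_state = defaultdict(Counter)
for state, kind in events:
    counts_by_state[state][kind] += 1

# Most frequent disaster type per state.
most_common = {
    state: counts.most_common(1)[0][0]
    for state, counts in counts_by_state.items()
}
```

With the data already in a pandas dataframe, a `groupby` on the state and type columns followed by `size()` would give the same counts.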
agree with @kellykiki that ideally you would also zoom in on one type of disaster and then show whether it increased during the last years.
you've done a lot of work with the data. good journey to have to sort through all that. it's never as simple as it seems. so keep in mind that california and texas are big states. big states have more of...everything. so you want to perhaps normalize some of the data by population otherwise all we get are...big states, lots of disasters.
once you've done that (divided by population and made it per 1,000 people or per 10,000 people), maybe you can tell us which states have the most...fires, floods as a percentage of total disasters. that way we can see which state is most likely to have a tornado for instance. if you mapped this, you'd see there is a part of the us that is tornado prone. it's actually called tornado alley.
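a small sketch of the per-capita idea. the state names, counts and populations are placeholders chosen to make the arithmetic obvious, not real figures.

```python
def rate_per(count: int, population: int, per: int = 10_000) -> float:
    """Events per `per` residents."""
    return count * per / population


# Placeholder numbers, not real statistics: (disaster count, population).
states = {
    "Big State": (30, 30_000_000),
    "Small State": (5, 1_000_000),
}

rates = {
    name: rate_per(count, population)
    for name, (count, population) in states.items()
}

# Rank by raw count and by per-capita rate to compare the two orderings.
by_count = sorted(states, key=lambda name: states[name][0], reverse=True)
by_rate = sorted(rates, key=rates.get, reverse=True)
```

in this toy example the big state leads on raw counts, but the small state has the higher rate per 10,000 people, which is the effect normalizing is meant to reveal.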
i like that you are curious about this, but i think you may want to narrow in on something here. a type of disaster, for instance, or an analysis of certain parts of the country. it's very broad. what interests you most here? give us more of that.
good work, keep at it.
Final comment
What did you find to be the most difficult part of this project?
The scraping!
Are you satisfied with what you produced? Is there anything you would like to change or improve?
I concentrated on the scraping rather than on the graphic presentation of the data. With more time for research, I would look further into the destroyed hectares and the casualties.
Please complete all of the following sections, or the ghost of Joseph Pulitzer will spookily dance around your issue! A completed version of this template can be found at https://github.com/jsoma/data-studio-projects/issues/1
Pitch
What is my question:
- What kinds of natural disasters exist in the USA (fires, floods, tornadoes, mudslides, earthquakes, hurricanes etc.)? Which ones are more frequent?
- Are those areas populated?
- Is there a type of catastrophe that is a repeated phenomenon in a certain area / state? In that case, should there be state indemnity to compensate the losses of local citizens? Should taxpayers pay for repeated catastrophic risks to people who insist on living in an environment that they know for certain will be destroyed?
Summary
Then I contacted the Federal Emergency Management Agency, where they gave me access to the database "Disaster declaration". My first thought was to scrape the pages, but I soon realized that some terms were unknown to me. For instance, fires are categorized, some have a certain type and name, hurricanes have names, and so on, so it was not clear which values go where in the pandas dataframe. Due to time constraints, I decided to limit my research to the natural disasters DECLARED in 2018 and fall 2017. I ended up with 102 entries. While building the database, I had to search / google each event in order to understand the type of disaster, as in several cases the event is written in the database in the following manner: e.g. Arizona 89 East Fire. Moreover, in some cases there were numbers that I could not identify as a code for the state or for the type of disaster (e.g. Texas 335 Fire (FM-5234)). One should be careful, because some events happened in 2015 but were "declared" (the term used by the Agency) much later: months, or even years, as is the case for Alaska.
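As a possible starting point for cleaning such entries, the sketch below separates a trailing code in parentheses from the rest of the event name with a regex. The pattern (capital letters, a hyphen, digits) is an assumption based only on the single example "FM-5234", and the function name `split_event` is hypothetical. It does not try to interpret what the code or the other numbers mean.

```python
import re

# Assumed shape of the trailing code, inferred from the example "(FM-5234)".
CODE_PATTERN = re.compile(r"\s*\((?P<code>[A-Z]+-\d+)\)\s*$")


def split_event(raw: str):
    """Return (event name, code), with code set to None when absent."""
    match = CODE_PATTERN.search(raw)
    if match is None:
        return raw.strip(), None
    # Everything before the parenthesized code is kept as the name.
    return raw[:match.start()].strip(), match.group("code")


with_code = split_event("Texas 335 Fire (FM-5234)")
without_code = split_event("Arizona 89 East Fire")
```

Keeping the code in its own column leaves the name free for a separate step that identifies the state and the type of disaster.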
Details
Possible headline(s):
Data set(s):
I created my own dataset. Although I checked for official statistics at Disasters | Data.gov, the available datasets were not clear, and whenever I tried to open the PDFs I got the message: "Access denied". I emailed the Agency but so far I have no answer.
So I decided to be more creative in my way of finding data FAST:
Code repository:
https://github.com/ElinaMak/data_studio/blob/master/01-dogs/Natural%20Disasters%20in%20USA-Copy1.ipynb
Possible problems/fears/questions:
Work so far:
Checklist
This checklist must be completed before you submit your draft.
[Project]
in the title