personam-solis / cladis-notitia

Disaster Information
MIT License

Get Data #2

Open personam-solis opened 9 months ago

personam-solis commented 9 months ago

There are two primary locations from which the disaster data is obtained:

Figure out how to get the data into the local system (do not store or process it). All data should be held in memory and not written to disk, as ETL will come later. Add all data sources and their attributes to the README, and check whether there are any other sources that could be used as well.

personam-solis commented 8 months ago

Another one: EM-DAT

The Centre for Research on the Epidemiology of Disasters (CRED) distributes the data via open access for non-commercial use.

This is a "complete" database of worldwide disasters since ~1900

personam-solis commented 8 months ago

REGISTER!!!!! https://public.emdat.be/register

personam-solis commented 8 months ago

When connecting to the data, also describe the source, and what item/field will be needed
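One way to capture that per-source description in code could be a small registry; the source names, URLs, formats, and field lists below are illustrative placeholders, not the project's final list:

```python
# Hypothetical registry describing each data source and the
# fields we expect to pull from it (illustrative values only)
DATA_SOURCES = {
    "GDACS": {
        "url": "https://gdacs.org/xml/rss.xml",
        "format": "RSS/XML",
        "fields": ["title", "pubDate", "geo:lat", "geo:long"],
    },
    "EM-DAT": {
        "url": "https://public.emdat.be/",
        "format": "xlsx (converted to csv)",
        "fields": ["Disaster Type", "Country", "Start Year"],
    },
}

def describe(source: str) -> str:
    """Return a one-line README-style description of a source."""
    info = DATA_SOURCES[source]
    return f"{source} ({info['format']}): fields {', '.join(info['fields'])}"
```

The same structure can then be dumped into the README so the source list and the code never drift apart.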

personam-solis commented 8 months ago

GeeksForGeeks

Setting the stream parameter to True causes only the response headers to be downloaded, and the connection remains open. This avoids reading the entire content into memory at once for large responses; a fixed-size chunk is loaded each time r.iter_content is iterated.

import requests

gdacs_xml_link = 'https://gdacs.org/xml/rss.xml'

# Create the HTTP response object; stream=True fetches only the
# headers up front. Might need to change stream later
http_response = requests.get(gdacs_xml_link, stream=True)

# Read the body in fixed-size chunks so the full feed is never
# held in memory all at once
for chunk in http_response.iter_content(chunk_size=8192):
    pass  # process each chunk here

http_response.close()
personam-solis commented 8 months ago

All data can be from 2000 to limit the size; if you want more, it can be a future thing
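Since everything stays in memory, capping the data at the year 2000 can be a simple filter over the parsed records; the "Start Year" field name is an assumption about the EM-DAT-style columns:

```python
# Keep only records from the year 2000 onward; assumes each record
# is a dict with a "Start Year" field (EM-DAT-style, hypothetical)
def filter_from_2000(records):
    return [r for r in records if int(r["Start Year"]) >= 2000]

sample = [
    {"Disaster Type": "Flood", "Start Year": "1998"},
    {"Disaster Type": "Storm", "Start Year": "2005"},
]
recent = filter_from_2000(sample)  # only the 2005 storm remains
```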

personam-solis commented 8 months ago

The EM-DAT data has been retrieved. It's an .xlsx, so it needed to be converted to a .csv.
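For the record, the .xlsx-to-.csv step can be a couple of lines with pandas; this is a sketch assuming pandas (with the openpyxl engine) is installed, and the file paths are placeholders:

```python
import pandas as pd

def xlsx_to_csv(xlsx_path: str, csv_path: str) -> None:
    """Convert the first sheet of an .xlsx workbook to a .csv file."""
    df = pd.read_excel(xlsx_path)     # uses the openpyxl engine for .xlsx
    df.to_csv(csv_path, index=False)  # drop the synthetic row index
```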

personam-solis commented 8 months ago

Testing CSV parser
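A minimal in-memory check of the converted file can use the standard-library csv module; the column names and rows below are illustrative, not actual EM-DAT content:

```python
import csv
import io

# Parse CSV text entirely in memory; in practice the string would
# come from the converted EM-DAT file rather than a literal
csv_text = "Disaster Type,Country,Start Year\nFlood,India,2004\nStorm,USA,2012\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Each row is a dict keyed by the header line
types = [row["Disaster Type"] for row in rows]
```

Because DictReader keys rows by the header, later code can refer to fields by name instead of column position.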

personam-solis commented 8 months ago

All actions completed.

personam-solis commented 8 months ago

There is a good chance that the databases will have to be heavily modified.

Also, don't worry about putting it in draw.io; let's get to coding faster.