Open felipequintella opened 4 years ago
The problem is that they don't provide a direct download link, at least not on the linked page. If you click the CSV download button there, you get a generated temporary download link. :/
I noticed that too... I've actually been trying to scrape their website for that CSV link for the past week, and I think I finally managed it. At least until they change something again... The scraper and the final data are here: https://github.com/felipequintella/covid19-brazil-scraper https://raw.githubusercontent.com/felipequintella/covid19-brazil-scraper/master/brazil.csv
I've also forked covidtrends and included the breakdown there as well. If you think it is worth it, let me know and I can try a pull request. https://github.com/felipequintella/covidtrends Final product here: https://covid19.felipequintella.com/
Edit: of course, scraping their data also means the final data may not be considered as official, accurate, and reliable as one might want for the project. Let me know what you guys think ;) "We don’t want to become a repository of many datasets, as it’s difficult for us to vouch for their accuracy and reliability."
Hm, basically in the end it is @aatishb's choice how to handle things. On the one hand, it's cool that you provide a way to scrape the existing data from the government page. But IMHO I would nevertheless be careful about any further step toward data aggregation for any country. I searched Google for "covid saude.gov.br csv", which led me here: https://brasil.io/dataset/covid19/boletim/ Have you found that one already?
I don't speak the language (Portuguese, I guess, in Brazil), so I'm not sure I understand everything correctly; I only used translate.google.com a little. But the "Leia a documentação dessa tabela" ("Read the documentation for this table") link led to the following repo: https://github.com/turicas/covid19-br/blob/master/api.md#boletim . I suppose a query string for their API could be crafted from there? I guess the complete download at https://data.brasil.io/dataset/covid19/caso.csv.gz would be a bit too extensive ;))) It looks like the data goes all the way down to the city level. The CSV is 2.2 MB in size. ;))) It might be easier for a native speaker to find their way around there.
But I tried the examples from the GitHub repo in Paw and got a 301 when querying https://brasil.io/api/dataset/covid19/caso/data?is_last=True&state=AL
:/
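For anyone who wants to retry this outside Paw, here is a minimal Python sketch of crafting that query string. A 301 is a permanent redirect, and `urllib.request.urlopen` follows redirects by default, so `fetch` also reports the final URL it ended up at. The helper names `build_query_url` and `fetch` are made up for illustration, and whether the endpoint still answers at all is an open question, so the fetch is left uncalled.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the URL quoted above; it may have moved since.
BASE_URL = "https://brasil.io/api/dataset/covid19/caso/data"


def build_query_url(base_url=BASE_URL, **filters):
    """Append the given filters to the endpoint as a query string."""
    return f"{base_url}?{urlencode(filters)}"


def fetch(url, timeout=30):
    """Fetch the URL (urlopen follows redirects) and return (final_url, body)."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl(), response.read()


# Rebuilds the exact query from the comment above.
url = build_query_url(is_last="True", state="AL")
print(url)
# final_url, body = fetch(url)  # uncomment to actually hit the network
```

If `fetch` returns a `final_url` that differs from `url`, that is where the 301 was pointing.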
I guess caso.csv.gz is the smallest file available that is still the most complete. It is also listed here alongside the other versions, all with SHA-512 checksums provided: https://data.brasil.io/dataset/covid19/_meta/list.html
@felipequintella The only thing I have trouble understanding with just Google Translate: is that dataset aggregated by Brazilian officials/government employees? In other words, is it an official data source, or is the aggregation based on volunteer work?
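Since checksums are published, a downloaded copy could be verified before use. This is only a sketch with Python's standard `hashlib`; the function names are hypothetical, and the expected hex digest has to be copied by hand from the listing page, since I'm not assuming anything about that page's exact format.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in chunks to save memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha512_of_file(path) == expected_hex.strip().lower()
```

Usage would be something like `matches_checksum("caso.csv.gz", "<digest copied from the listing>")`, where the local file name is whatever you saved the download as.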
Hey @rpkoller, I'll take a look at these and see how they are collecting/compiling and verifying the data. I had not found it before, looks good though! I'll get back to you later.
@rpkoller as per this link, they say all the data comes from each state's health department, so yes, I would say it's aggregated by the authorities and brasil.io just compiles it into one big CSV. This is their report on that.
Filtering the caso_full.csv by place_type == state would yield the result Felipe is talking about.
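That filter could be sketched in Python with just the standard library. Only the `place_type` column and the `state` value come from the comment above; the other column names and the three sample rows are invented for illustration and are not real figures.

```python
import csv
import io


def state_rows(csv_file):
    """Keep only the rows whose place_type is 'state' (one row per state, not per city)."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row["place_type"] == "state"]


# Invented sample in an assumed layout, just to exercise the filter.
SAMPLE = """date,state,city,place_type,confirmed,deaths
2020-04-01,SP,,state,100,5
2020-04-01,SP,Sao Paulo,city,80,4
2020-04-01,RJ,,state,50,2
"""

rows = state_rows(io.StringIO(SAMPLE))
print([row["state"] for row in rows])
```

For a gzipped download, the same function should work on a handle opened with `gzip.open(path, "rt", encoding="utf-8", newline="")`, assuming the file is UTF-8.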
Brazil is a big country with many different hotspots. Including a breakdown per state (as is done for the US, Canada, and Australia) would help visualize what's happening. This should be fairly easy to do from the Health Ministry data (https://covid.saude.gov.br/).