AlertaDengue / PySUS

Library to download, clean and analyze openly available datasets from the Brazilian universal health system, SUS.
GNU General Public License v3.0
173 stars, 68 forks

[FEATURE]: Population per state/year on IBGE #191

Closed: fccoelho closed this issue 3 months ago

fccoelho commented 5 months ago

The DATASUS FTP server has this information in the directory /dissemin/publicos/IBGE/pop

This should be made accessible via the IBGE module.
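For reference, a minimal sketch of how that directory could be inspected with the standard library `ftplib`. The host name `ftp.datasus.gov.br` and anonymous login are assumptions, and `list_directory` / `list_population_files` are hypothetical helper names, not part of PySUS:

```python
from ftplib import FTP

DATASUS_HOST = "ftp.datasus.gov.br"      # assumption: public DATASUS FTP host
POP_DIR = "/dissemin/publicos/IBGE/pop"  # directory named in this issue


def list_directory(ftp, directory=POP_DIR):
    """Return the sorted entry names of `directory` on an open FTP connection."""
    ftp.cwd(directory)
    return sorted(ftp.nlst())


def list_population_files(host=DATASUS_HOST, timeout=30):
    """Connect anonymously (assumed to be allowed) and list the population directory."""
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # no arguments means anonymous login
        return list_directory(ftp)
```

Calling `list_population_files()` needs network access. `list_directory` takes the connection as an argument so it can be exercised against a stub object without touching the server.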

maxbiostat commented 5 months ago

Gotta be careful with how these projections are done, however.

turicas commented 5 months ago

I've implemented the populacao_estimada.py script, which finds, downloads and converts/normalizes data from IBGE. It currently downloads and normalizes estimates only, but I'm willing to add support for census data as well. Maybe we can reuse some of that code here. The resulting CSVs are hosted in the repository, so you can check whether it has the information you need (it can download more than one estimate per year, if available).

fccoelho commented 5 months ago

Interesting, because it pulls from the IBGE FTP server directly. What we are doing so far is using the data collected and made available by DATASUS.

I think it is worth integrating your script as well and making both sets of estimates available, because sometimes people have reasons to stick with one source or the other. Also, the estimation methodologies for non-census years may be different. @luabida can you take a look at @turicas's code?

fccoelho commented 5 months ago

It would be nice to adapt the code to use the same strategy to scan the FTP server directly as we do in PySUS for the DATASUS FTP server, instead of relying on a list of hardcoded URLs for the xls files.

turicas commented 5 months ago

> It would be nice to adapt the code to use the same strategy to scan the FTP server directly as we do in PySUS for the DATASUS FTP server, instead of relying on a list of hardcoded URLs for the xls files.

The list of XLS files is hard-coded only because I didn't want to scrape the server every time I run the script (the list does not change very often), but I did implement the code that extracts the URLs automatically, so you don't need to rely on that hard-coded dict.
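The two positions above (scan the server, but avoid scanning on every run) can be combined. The sketch below is only an illustration and is not the code from populacao_estimada.py or PySUS: it walks an FTP tree with `ftplib`'s `mlsd` (assuming the server supports the MLSD command) and stores the result in a JSON file. The names `find_files`, `cached_find_files`, `ftp_factory` and `cache_path` are hypothetical.

```python
import json
from pathlib import Path


def find_files(ftp, root, suffixes=(".xls", ".xlsx")):
    """Walk `root` on an open FTP connection and collect matching file paths."""
    found = []
    pending = [root.rstrip("/") or "/"]
    while pending:
        current = pending.pop()
        # mlsd yields (name, facts) pairs; facts["type"] is "file", "dir",
        # "cdir" (current directory) or "pdir" (parent directory).
        for name, facts in ftp.mlsd(current):
            kind = facts.get("type")
            path = f"{current.rstrip('/')}/{name}"
            if kind == "dir":
                pending.append(path)
            elif kind == "file" and name.lower().endswith(tuple(suffixes)):
                found.append(path)
    return sorted(found)


def cached_find_files(ftp_factory, root, cache_path, refresh=False):
    """Return the cached listing if present, otherwise scan and write the cache."""
    cache = Path(cache_path)
    if cache.exists() and not refresh:
        return json.loads(cache.read_text())
    # ftp_factory is any callable returning a connected, logged-in FTP object.
    with ftp_factory() as ftp:
        paths = find_files(ftp, root)
    cache.write_text(json.dumps(paths, indent=2))
    return paths
```

A caller would pass something like `lambda: ftplib.FTP(host, user="anonymous")` as `ftp_factory`, with `host` and `root` set to the server and directory being scanned, and `refresh=True` whenever the listing should be rebuilt. Keeping the cache in a file plays the same role as the hard-coded dict, without having to edit it by hand.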

luabida commented 3 months ago

Finished in #193; the code can be seen here.