Closed. esloch closed this 2 years ago.
@fccoelho, I was discussing with @esloch the best way to store the Brazilian data in epigraphhub, and we would like your opinion on it.
Since SUS provides a separate dataset for each disease and year, we were thinking about keeping the same structure in epigraphhub. Because this data usually contains many erroneous values, and the columns can be inconsistent between datasets from different years, it would be difficult to append all the datasets into a single one.
My idea is to store the datasets organized this way and, for each disease, create a consolidated table with only the most relevant columns (after preprocessing) to use in our analyses and dashboards.
What are your thoughts about it?
Yes, we should keep each disease in a separate table. The table should reflect the original structure, with some preprocessing that is already implemented in PySUS.
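For illustration only, the consolidated-table idea from this exchange could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the code in this PR: the function `consolidate`, the `RELEVANT_COLUMNS` list, and the sample records are hypothetical, and plain dictionaries stand in for the per-year tables so the example runs without any dependencies.

```python
# Hypothetical list of columns to keep; the real choice depends on the
# disease and on what the analyses and dashboards need.
RELEVANT_COLUMNS = ["DT_NOTIFIC", "SG_UF_NOT", "CS_SEXO"]


def consolidate(yearly_records, columns=RELEVANT_COLUMNS):
    """Merge per-year record lists into one list with a fixed set of columns.

    yearly_records maps a year to a list of dicts (one dict per record).
    Columns missing from a given year are filled with None, and columns
    that are not in `columns` are dropped.
    """
    consolidated = []
    for year in sorted(yearly_records):
        for record in yearly_records[year]:
            row = {"year": year}
            for column in columns:
                row[column] = record.get(column)
            consolidated.append(row)
    return consolidated


# Made-up sample data: the 2020 layout lacks CS_SEXO and has an extra column.
sample = {
    2021: [{"DT_NOTIFIC": "2021-03-01", "SG_UF_NOT": "33", "CS_SEXO": "F"}],
    2020: [{"DT_NOTIFIC": "2020-05-10", "SG_UF_NOT": "35", "EXTRA": "x"}],
}
rows = consolidate(sample)
```

The per-year tables stay untouched in this sketch; only the consolidated view forces a common set of columns, which is where inconsistencies between years would surface.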
(Sent as an email reply to the comment Eduardo Correa Araujo wrote on Thu, Sep 22, 2022, at 14:09: https://github.com/thegraphnetwork/EpiGraphHub/pull/134#issuecomment-1255313114)
Done!
Resolve #123.
Requirements and features:
1 - Insert the DATASUS datasets into the epigraphhub database using the PySUS library.
-- Credentials to connect to the database.
-- Schema and table name to write the data.
-- Packages used in the module: pandas, pyarrow.parquet, pysus, psycopg2, sqlalchemy
2 - Module name: sinan_fetch.py
3 - The code consists of 2 functions:
It remains to determine: