AlertaDengue / PySUS

Library to download, clean and analyze openly available datasets from the Brazilian Universal Health System (SUS).
GNU General Public License v3.0

chore(online_data.SIH): add group parameter to download SIH data #135

Closed luabida closed 1 year ago

luabida commented 1 year ago
11:20 $ ipython -i pysus/online_data/SIH.py
...
In [2]: download('SP', 2020, 1, group='SP')
Out[2]: '/home/luabida/pysus/SPSP2001.parquet'

In [3]: import os

In [4]: os.listdir('/home/luabida/pysus/SPSP2001.parquet')
Out[4]: 
['2defb933563a43889df3b249e6203318-0.parquet',
 'aee346b32f8a4395b4114e586be487f1-0.parquet',
 '3539f55cdf654b9f806c9eca82fe4a23-0.parquet',
 '38c2a0dc15154fff97d92758986a943d-0.parquet',
 '168682c345d94b629e3164aea7cd84b2-0.parquet',
 'd6812235a3334ceb878fbc545c79b9ee-0.parquet',
 '86cd9ff16b4d4ae3a086376b018f2100-0.parquet',
 'd6c86ec54d7f42a4a5646ab6cf2a40e2-0.parquet',
 '9e166810bdff4146a8bba90b3f125692-0.parquet',
 '546b525e723f4df89468bc1dd558f8f6-0.parquet',
 '9dcdf7b14c2f4737874a517f0255b34e-0.parquet',
 '615db6785c5040a39115f7d62f89c970-0.parquet',
 '7e134d969fd143da9a240d5eba9e14e7-0.parquet',
 'fcfec60757854c569ffaf69bff38d521-0.parquet',
 'a3a1899d5d45418e9ae29158fbb4ddb6-0.parquet',
 '376d62f6864d47baa029e590ec4f16a9-0.parquet',
 'b3d0613f260d411d8ec9f045f2bef4bf-0.parquet',
 '405d81c79fae4bd5a4a104701c46fbf5-0.parquet',
 '32319909bd4b4e27b3681ef9d3bf9b65-0.parquet',
 'aab7d82fca0a41529dcc6380c3ee5873-0.parquet',
 '3f42542397234b9aa5a24464f03920d6-0.parquet',
 'e43cec75c48c4998ae8916481577e4d8-0.parquet',
 '4140110be7bd4657bb21cec5472c00dc-0.parquet',
 'e57a10124a434a4f8859229a1c9eb691-0.parquet',
 '4dde0c1be42a4c8ea6a8701854263edc-0.parquet',
 'bb7121ccc0ae4cd2bc69465cb538564e-0.parquet',
 '53327255efa94667a3be1e5893cb54f0-0.parquet',
 'c54f9de361c44a6b95b171bdf56cd5b8-0.parquet',
 'c8da6e006e69461db34e0a9f1225a712-0.parquet',
 '4bca69ef7aed4142ac0c12b7e6f9891f-0.parquet',
 '58a26243e6e4410cbec81a039e1abade-0.parquet',
 'a3362332c5a04d60beb9c74e217395d7-0.parquet',
 '5a6273a76e2841589f80af8b7e398f6e-0.parquet',
 '566b09836dee4343865a36c053f44810-0.parquet',
 '184ce396c2d94e238d60ad25f83a82ca-0.parquet',
 'ba7d8901b88f49cbb7e8f20c550c432f-0.parquet',
 '27fdbb08427b410094293f44434f774c-0.parquet',
 '2e94665bec564897aaa5e12588c437e6-0.parquet',
 'b5c4a580b76f4c0baec165b7eb794bf0-0.parquet',
 '06e9c47e22af403dbf493ba8dcb77735-0.parquet',
 '4a0fda1e41884107b81c60448165d308-0.parquet',
 'f638780076f94347b76fe3d2446df89b-0.parquet',
 '74bf8969ea16401f85635293903046a8-0.parquet',
 'b30295df99bc4c0ca2aaa57613fa7df2-0.parquet',
 '93e204d7e1464f99a293e7b3c36d7126-0.parquet',
 'cd821cd6a0b9433ea038bc00d39aeb39-0.parquet',
 '26f72612f74142c2a4137b13466dbbd0-0.parquet',
 '4aed98e6eb614203a66d5ee729e90f18-0.parquet',
 '7cc3f03b102c4511816c2592939bc4f4-0.parquet',
 'd8864e2933b54a4bbec2645b1241852c-0.parquet',
 'be4858a014274d7a8eaae670acfbfbc9-0.parquet',
 '4d924efbe27446c295bb5c87c941c827-0.parquet',
 '2fb3d71c00744505a67dbbb196831a72-0.parquet',
 '10bbf45be1e44f5eb1724f4d9c19f26a-0.parquet',
 '44fa859c7e54412393ee40cf8866282d-0.parquet',
 'ae4427805f0546c8811140989d6c910d-0.parquet',
 'f61246098cd4445d9fac02bf2d0cf183-0.parquet',
 'e8f457de621c40d28e57c901ab7843e7-0.parquet',
 '58637aed287944de8b64024391e21a6a-0.parquet',
 'f37e2809ba8c4a3192e8dbad622b000e-0.parquet',
 '2dfa688c18ae4dd5bdc9aa1c23b30754-0.parquet',
 '3a1fd8f2ad2545f388e6ec95ed6e1b47-0.parquet',
 '691447aca62747c5ab4d2f4f0ee10b66-0.parquet',
 'eb24f2616c5b4243a660048df373504d-0.parquet',
 'aa0d61f7f13945929438ccc45952073b-0.parquet',
 'df41b48bb1da45d0afb93ecc6bf57f66-0.parquet',
 '656b3d599e0c4f01ad8938c9b24ab003-0.parquet',
 '993fcc3c5a4b4109a80a5c54fdfaf77a-0.parquet',
 '2d98027457544b8b944010fe6c17abdc-0.parquet',
 '7769445cb66b483289b54c0313b0f5cc-0.parquet',
 '3922e7ff14a0406daaecdfbd6788ba5d-0.parquet',
 '023bd8e1a8534a0581dbe1e71dec901d-0.parquet',
 'bdae06bfef11460da6e922daf28f4bdb-0.parquet',
 '493c0c9e42494c4cb4db757d6d6a40b6-0.parquet',
 'e8b28f7b91704fa9b1d3fd891bb16295-0.parquet',
 '156532b0492240f99bfc1e5d85ba106b-0.parquet',
 '869ba7300c8e49bcac6ef9ad7f5ec896-0.parquet',
 '8b0e6529173747969e94f44f0c1bf3b5-0.parquet',
 '0983de6a7453436f96501de5254b4b4d-0.parquet',
 'dcfe6f8ba0784296aac362b6296b1b4f-0.parquet',
 '3a6d29da2aa746d3b412eb2847c46b8c-0.parquet',
 '29e98a2b455b42f2b569621f5d9db784-0.parquet',
 'ad1e30bada9e484dace35dad807d684c-0.parquet',
 'ae0f9c980dbc4c7da8c4bd329aefc448-0.parquet',
 'dde0641eb7b545519c1dbe1269676206-0.parquet',
 '3609edf268304986902efc0e7344f70b-0.parquet',
 '0ac7df5c0e5844fb97bbbc889a948e77-0.parquet',
 '946f903416c3436382f74c41fdf7fdf2-0.parquet',
 'd4dc5a45359a40aaa356046871d736d9-0.parquet',
 '90356430183e4847b032003794f8de51-0.parquet',
 'e622d10a7bfc4d8a891025fad4415055-0.parquet',
 'dbde4920b7f8408f94a0509b70bd1d08-0.parquet',
 '6458b44df79d4c37b4efb6038c14fd13-0.parquet',
 '715c91dabdae4c13ae629f4198892e8e-0.parquet',
 '3f8f10a02a0b418391b3c2312dac6374-0.parquet',
 '434056b8085a44aa8bbe438fd711c3c3-0.parquet',
 '548c3ec57ccc4d4f986d2c33ec66c766-0.parquet',
 '51cd2cadd2604377908515f98b4a8660-0.parquet',
 'ac4b759d7e964e8ebc02d99085c0acf2-0.parquet',
 '913afef227624295a2c0e8e69371b5fa-0.parquet',
 '33aab3b0d4aa42518649bdfb57a3c379-0.parquet',
 '8de120c11379433b9053a88ad2bdaf71-0.parquet',
 'a418ac5f61854ad0aa7970717c3de20d-0.parquet',
 'db8750f1966c441eaf53e3e4cf1a5499-0.parquet',
 'bf5fdcea57d7485ca528f6aa44835d53-0.parquet',
 '08040daed36d4992b8f5dc08d34f281e-0.parquet',
 '082ae46016ea460295b62497b970346e-0.parquet',
 '6e13e5c9d9a7428cb14ccc6c9532c037-0.parquet',
 '9d59c4f989a944628138922107fcd58b-0.parquet',
 'be92ac10e5bb4d46a5197b21a5e72e79-0.parquet']
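
The listing above shows that the returned path is a directory of Parquet part files rather than a single file. A minimal sketch of how those parts could be enumerated with the standard library only; `list_parquet_parts` is a hypothetical helper, not part of PySUS:

```python
from pathlib import Path


def list_parquet_parts(dataset_dir):
    """Return the sorted part files inside a Parquet dataset directory.

    Hypothetical helper for illustration; not part of PySUS.
    """
    root = Path(dataset_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # Keep only regular files with a '.parquet' suffix, sorted by path.
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix == ".parquet"
    )
```

With pandas and pyarrow installed, `pandas.read_parquet` can also be pointed at such a directory to load all parts as one DataFrame.
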
luabida commented 1 year ago

@fccoelho the download takes too long (I'm not sure whether it can be improved). Also, with the code as it is, it would be nearly impossible to add a progress bar such as tqdm, so I'm thinking of refactoring it a bit :s
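
One possible shape for such a refactor, sketched under assumptions: the actual PySUS download code is not shown in this thread, so this only illustrates a chunked copy loop that reports progress through a callback. `copy_in_chunks` and `on_progress` are hypothetical names.

```python
def copy_in_chunks(src, dst, chunk_size=8192, on_progress=None):
    """Copy a binary stream in chunks, reporting each chunk's size.

    `on_progress` is any callable that accepts the number of bytes
    just written. Returns the total number of bytes copied.
    Hypothetical helper for illustration; not part of PySUS.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break  # end of stream
        dst.write(chunk)
        total += len(chunk)
        if on_progress is not None:
            on_progress(len(chunk))
    return total
```

Because the loop only needs a callable, a tqdm bar's `update` method could be passed as `on_progress` (for example, a bar created with `tqdm(total=size, unit="B", unit_scale=True)`), without the copy loop depending on tqdm itself.
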

luabida commented 1 year ago

https://github.com/psf/cachecontrol/issues/292#issuecomment-1535373382

fccoelho commented 1 year ago

I think the other groups' files can be very large. For the tests, use a state with a small population, such as Amapá, instead of São Paulo.
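
A test along these lines could be sketched as follows. The file-name pattern is an assumption inferred from the single `SPSP2001.parquet` example earlier in this thread; `expected_name`, `check_download` and `fake_download` are hypothetical helpers, and `AP` is the state code for Amapá.

```python
import os


def expected_name(state, year, month, group):
    """Build the expected file name.

    Assumed pattern (inferred from 'SPSP2001.parquet'):
    group + state + two-digit year + two-digit month.
    """
    return f"{group}{state}{year % 100:02d}{month:02d}.parquet"


def check_download(download_fn, state, year, month, group):
    """Call a download function and check the name of the returned path."""
    path = download_fn(state, year, month, group=group)
    assert os.path.basename(path) == expected_name(state, year, month, group)
    return path


def fake_download(state, year, month, *, group):
    """Offline stand-in for the real download function (hypothetical)."""
    return os.path.join("pysus", expected_name(state, year, month, group))


# Offline check with the stand-in; a real test would pass the
# `download` function from pysus/online_data/SIH.py instead.
check_download(fake_download, "AP", 2020, 1, "SP")
```

Passing the download function as an argument keeps the check runnable without network access; swapping in the real function turns it into an integration test.
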

luabida commented 1 year ago

> Could you add some tests for downloading the other groups?

I'll try the refactoring first; after that it will be easier to write the tests. If this is the blocker, this PR can be merged and I'll add the tests in the next one, which I'm working on right now.

github-actions[bot] commented 1 year ago

:tada: This PR is included in version 0.9.3 :tada:

The release is available on:

Your semantic-release bot :package::rocket: