AlertaDengue / PySUS

Library to download, clean and analyze openly available datasets from the Brazilian Universal Health System (SUS).
GNU General Public License v3.0
178 stars 70 forks

Error in download Function of pysus v0.10.4 with Unpackable File Names #199

Closed Twetler closed 4 months ago

Twetler commented 4 months ago

How to reproduce my error

Using version 0.10.4:

from pysus.online_data.SIA import download
download('BA', 2024, 4, groups=["BI"])

The error shown in the attached image refers to:

def format(self, file: File) -> tuple:
    if file.extension.upper() in [".DBC", ".DBF"]:
        # Join every digit found in the name, e.g. "2305" for "BIMG2305"
        digits = ''.join([d for d in file.name if d.isdigit()])
        print("This", file.name.split(digits))  # debug print
        chars, _ = file.name.split(digits)  # This line crashes
        year, month = digits[:2], digits[2:]
        group, _uf = chars[:-2].upper(), chars[-2:].upper()
        return group, _uf, zfill_year(year), month
    return ()

After receiving many values like ['BIMG', ''], which unpack without problems, I got ['BIMG2305_1'], a single-item list that can't be unpacked into two variables.
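
The failure can be reproduced without downloading anything. Below is a minimal sketch that copies only the string logic from the snippet above; the helper name `split_name` is hypothetical and not part of pysus.

```python
def split_name(name: str) -> list:
    """Mimic the split used in format(): join all digits, then split on them."""
    digits = "".join(d for d in name if d.isdigit())
    return name.split(digits)


# Regular file name: the digits are contiguous, so the split yields two parts.
print(split_name("BIMG2305"))    # ['BIMG', '']

# Split file: the joined digits are '23051', which never occurs in the name
# because the '_' interrupts them, so str.split returns the whole name.
print(split_name("BIMG2305_1"))  # ['BIMG2305_1']

try:
    chars, _ = split_name("BIMG2305_1")
except ValueError as err:
    print(err)  # not enough values to unpack (expected 2, got 1)
```

So the crash comes from the `_1` suffix: once the digits are no longer one contiguous run, the separator passed to `split` does not appear in the file name at all.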

My quick fix was to take the list's first item ([0]); however, I don't know whether this solution works for all files. It also made me wonder whether this logic is reproduced in the other libraries.
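
A quick check suggests the `[0]` workaround avoids the exception but does not parse the split files correctly. The sketch below is an assumption-laden reduction of the method: `format_with_first_item` is a hypothetical name, it takes a plain string instead of a `File`, and it leaves out `zfill_year` because that helper is not shown in this thread.

```python
def format_with_first_item(name: str) -> tuple:
    """Same slicing as format(), but taking [0] instead of unpacking."""
    digits = "".join(d for d in name if d.isdigit())
    chars = name.split(digits)[0]
    year, month = digits[:2], digits[2:]
    group, uf = chars[:-2].upper(), chars[-2:].upper()
    return group, uf, year, month


# Regular name: parsed as intended.
print(format_with_first_item("BIMG2305"))    # ('BI', 'MG', '23', '05')

# Split file: no crash, but the whole name ends up in `chars`, so the group,
# the UF and the month are all wrong.
print(format_with_first_item("BIMG2305_1"))  # ('BIMG2305', '_1', '23', '051')
```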

fccoelho commented 4 months ago

It seems that now Datasus is splitting the larger files in two for the bigger states: half of the data goes into BIMG2305_1 and the other half into BIMG2305_2 in your example. @luabida can you add a rule to handle this?
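
One possible shape for such a rule, offered as a sketch rather than the actual pysus fix: parse the name with a regular expression that treats `_<n>` as an optional part suffix. The names `NAME_RE` and `parse_name` are hypothetical, and the pattern assumes names consist of letters (group code followed by a two-letter UF), then YYMM, then the optional suffix. The year is returned as two digits because `zfill_year` is not shown in this thread.

```python
import re

# Assumed layout: <group><UF> + YYMM + optional "_<part>", e.g. BIMG2305_1
NAME_RE = re.compile(
    r"^(?P<chars>[A-Za-z]+)(?P<year>\d{2})(?P<month>\d{2})(?:_(?P<part>\d+))?$"
)


def parse_name(name: str) -> tuple:
    """Return (group, uf, year, month, part); part is None for unsplit files."""
    match = NAME_RE.match(name)
    if match is None:
        return ()  # name does not follow the assumed layout
    chars = match.group("chars").upper()
    group, uf = chars[:-2], chars[-2:]
    return group, uf, match.group("year"), match.group("month"), match.group("part")


print(parse_name("BIMG2305"))    # ('BI', 'MG', '23', '05', None)
print(parse_name("BIMG2305_1"))  # ('BI', 'MG', '23', '05', '1')
print(parse_name("BIMG2305_2"))  # ('BI', 'MG', '23', '05', '2')
```

Returning the part number separately would let the caller group `_1` and `_2` under the same group, state and month, although how the real code should expose that is a design decision for the maintainers.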