AlertaDengue / PySUS

Library to download, clean and analyze openly available datasets from the Brazilian Universal Health System, SUS.
GNU General Public License v3.0

FTP Connection Problem in 'Directory' Initialization #187

Open MarceloNG opened 10 months ago

MarceloNG commented 10 months ago

Description: I'm encountering a domain name resolution error when importing the download function from the pysus.online_data.SIA module. The error comes from an attempt to establish an FTP connection inside the __new__ method of the Directory class. Because that connection is opened at import time, the import itself can fail in environments with limited or unstable connectivity.

Complete Error Message

Traceback (most recent call last):
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/ftp/__init__.py", line 267, in __new__
    directory = CACHE[path]  # Recursive and cached instantiation
                ~~~~~^^^^^^
KeyError: '/dissemin/publicos/SIASUS/199407_200712/Dados'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/ftp/__init__.py", line 267, in __new__
    directory = CACHE[path]  # Recursive and cached instantiation
                ~~~~~^^^^^^
KeyError: '/dissemin/publicos/SIASUS/199407_200712'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/asdf/repositorios/sus/magica.py", line 8, in <module>
    from pysus.online_data.SIA import download
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/online_data/SIA.py", line 15, in <module>
    from pysus.ftp.databases.sia import SIA
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/ftp/databases/sia.py", line 7, in <module>
    class SIA(Database):
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/ftp/databases/sia.py", line 10, in SIA
    Directory("/dissemin/publicos/SIASUS/199407_200712/Dados"),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/ftp/__init__.py", line 293, in __new__
    directory.parent = Directory(parent_path)  # Recursive
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/ftp/__init__.py", line 285, in __new__
    raise exc
  File "/home/asdf/.cache/pypoetry/virtualenvs/sus-ZSlS6Gat-py3.11/lib/python3.11/site-packages/pysus/ftp/__init__.py", line 270, in __new__
    ftp.connect()
  File "/home/asdf/.pyenv/versions/3.11.5/lib/python3.11/ftplib.py", line 158, in connect
    self.sock = socket.create_connection((self.host, self.port), self.timeout,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asdf/.pyenv/versions/3.11.5/lib/python3.11/socket.py", line 827, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/asdf/.pyenv/versions/3.11.5/lib/python3.11/socket.py", line 962, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution
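To confirm that the failure is a plain DNS problem and not something specific to PySUS, a quick check along these lines can be run before the import. This is only a diagnostic sketch: can_resolve and its resolver parameter are names I made up for illustration, and port 21 is just the standard FTP port.

```python
import socket


def can_resolve(host: str, port: int = 21, resolver=socket.getaddrinfo) -> bool:
    """Return True if `host` resolves, False on a name-resolution error.

    `resolver` is a hypothetical hook so the check can be exercised offline;
    by default it is the same socket.getaddrinfo call seen in the traceback.
    """
    try:
        resolver(host, port, 0, socket.SOCK_STREAM)
    except socket.gaierror:
        return False
    return True
```

Calling can_resolve("ftp.datasus.gov.br") on the affected machine should return False whenever the import fails with the gaierror shown above.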

Implementation of the Directory Class

class Directory:
    name: str
    path: str
    parent: Directory
    loaded: bool = False
    __content__: Dict = {}

    def __new__(cls, path: str, _is_root_child=False) -> Directory:
        ftp = FTP("ftp.datasus.gov.br")
        # ... [rest of the code omitted for brevity]
        try:
            ftp.connect()
            ftp.login()
            ftp.cwd(path)  # Checks if parent dir exists on DATASUS
        # ... [exception handling and the rest of the method]

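Preliminary Analysis: The FTP connection opened inside the __new__ method of the Directory class appears to be the cause of the problem. The traceback shows Directory(...) being instantiated in the body of the SIA class (sia.py, line 10), so the connection attempt runs as soon as the module is imported. Importing a module is not normally expected to open a network connection, and doing so is problematic in offline or restricted environments.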
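The mechanism can be reproduced without PySUS or a network. In the sketch below, FakeDirectory and FakeDatabase are stand-ins I invented; they only mimic the shape seen in the traceback (an object created inside a class body), and the list append takes the place of the real ftp.connect() call.

```python
calls = []


class FakeDirectory:
    """Stand-in for Directory: records each instantiation instead of connecting."""

    def __new__(cls, path):
        calls.append(path)  # in PySUS, this is roughly where ftp.connect() runs
        return super().__new__(cls)


class FakeDatabase:
    # Expressions in a class body are evaluated when the class statement runs.
    # For a module-level class, that is the moment the module is imported.
    paths = (FakeDirectory("/dissemin/publicos/SIASUS/199407_200712/Dados"),)


# Nothing has instantiated FakeDatabase, yet FakeDirectory.__new__ already ran.
print(calls)
```

The list is already populated by the time the class definition finishes, which matches the traceback: the failure happens on the import line, before any user code calls download.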

Suggestion for Resolution: A possible solution is to restructure the logic so that the FTP connection is not established when a Directory is instantiated, but only when its contents are explicitly needed. This would prevent connectivity failures during import and make the library more robust across different network environments.
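A rough sketch of what I have in mind is below. It is not the PySUS implementation: LazyDirectory, default_connect, and the connect parameter are hypothetical names, and the real class also handles caching and parent directories, which I have left out. The point is only that creating the object stores the path, and the connection is opened on first access to the content.

```python
from ftplib import FTP
from typing import Callable, List

HOST = "ftp.datasus.gov.br"


def default_connect() -> FTP:
    """Open an anonymous FTP session; called only when content is requested."""
    ftp = FTP()
    ftp.connect(HOST)
    ftp.login()
    return ftp


class LazyDirectory:
    """Hypothetical variant of Directory that defers all network access."""

    def __init__(self, path: str, connect: Callable[[], FTP] = default_connect):
        self.path = path
        self.loaded = False
        self._connect = connect  # injectable, so tests can avoid the network
        self._content: List[str] = []

    @property
    def content(self) -> List[str]:
        # Connect on first access only; later accesses reuse the stored listing.
        if not self.loaded:
            ftp = self._connect()
            try:
                ftp.cwd(self.path)
                self._content = ftp.nlst()
            finally:
                ftp.close()
            self.loaded = True
        return self._content
```

With this shape, a class body such as the one in sia.py could still list its directories at import time, and a name-resolution failure would only surface when the listing is actually requested.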