okfn-brasil / serenata-toolbox

📦 pip module containing code shared across Serenata de Amor's projects | ** This repository does not receive frequent updates **
MIT License

Refactor the way Chamber of Deputies' reimbursements are downloaded and cleaned #179

Closed Irio closed 6 years ago

Irio commented 6 years ago

For quite some time, working on these classes has been taking more effort than expected. This refactoring addresses multiple issues: code complexity, memory usage, and storing the same dataset on disk multiple times.

How to use it:

from serenata_toolbox.chamber_of_deputies.reimbursements import \
    Reimbursements

down = Reimbursements(2018)
print('csv path', down())

TODO

cuducos commented 6 years ago

Investigate the test failing in the CI but passing in my env.

TypeError: Use async with instead seems to come from aiohttp, the library the toolbox uses to handle async HTTP requests. The __enter__ method raises that error, pointing us to __aenter__. In other words, it pushes us to use the async version of the context manager:


# this
async with foobar():
    pass

# instead of
with foobar():
    pass
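
To make the mechanism concrete, here is a minimal sketch using only the standard library. AsyncOnlySession is a hypothetical stand-in, not aiohttp's actual class; it only mimics the behavior described above, where __enter__ raises and __aenter__ works.

```python
import asyncio


class AsyncOnlySession:
    """Hypothetical stand-in for a session that only supports `async with`."""

    def __enter__(self):
        # Same message as the one reported in the failing CI run.
        raise TypeError("Use async with instead")

    def __exit__(self, exc_type, exc, tb):
        # Never reached: __enter__ always raises.
        pass

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Returning False lets exceptions from the block propagate.
        return False


def use_sync():
    """Try the plain `with` form and report the error message, if any."""
    try:
        with AsyncOnlySession():
            return None
    except TypeError as error:
        return str(error)


async def use_async():
    """Use the `async with` form, which goes through __aenter__/__aexit__."""
    async with AsyncOnlySession() as session:
        return type(session).__name__


sync_result = use_sync()
async_result = asyncio.run(use_async())
print(sync_result)
print(async_result)
```

The plain with form fails before the block body runs, which would explain why the traceback points at the context manager line rather than at the download call.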

TL;DR: we probably need to add async to this with block:

            # async with is only valid inside an async def, where the call is awaited instead of using yield from
            async with ClientSession(loop=loop) as client:
                await downloader.fetch_file(client, '2016-12-06-reibursements.xz')

Irio commented 6 years ago

Unit tests passing but timeout on the journey one.

cuducos commented 6 years ago

> Unit tests passing but timeout on the journey one.

Running it locally just in case. I would just raise the travis_wait limit to something absurd, to check whether it's a real timeout or an error, and then fine-tune it to something reasonable, maybe measuring the run with time if it's available in the CI.
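
One way to pick that "reasonable" value could be to time the slow step once and add some headroom. The sketch below is only an illustration: it assumes travis_wait takes its limit in minutes, the 1.5 headroom factor is arbitrary, and the command is a placeholder rather than the project's real journey test invocation.

```python
import math
import subprocess
import sys
import time


def time_command(command):
    """Run a command and return (exit code, elapsed seconds)."""
    start = time.monotonic()
    completed = subprocess.run(command)
    return completed.returncode, time.monotonic() - start


def suggested_wait_minutes(elapsed_seconds, headroom=1.5):
    """Round the elapsed time, plus headroom, up to whole minutes."""
    return max(1, math.ceil(elapsed_seconds * headroom / 60))


# Placeholder command; replace it with the real journey test invocation.
command = [sys.executable, "-c", "import time; time.sleep(0.2)"]
exit_code, elapsed = time_command(command)
minutes = suggested_wait_minutes(elapsed)
print(f"exit code {exit_code}, took {elapsed:.1f}s, try travis_wait {minutes}")
```

A non-zero exit code with a short elapsed time would suggest an error rather than a real timeout.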

cuducos commented 6 years ago

Passed locally here too; a higher number in travis_wait will probably fix the broken CI.