okfn-brasil / serenata-toolbox

📦 pip module containing code shared across Serenata de Amor's projects | ** Este repositório não recebe atualizações frequentes **
MIT License
154 stars 69 forks source link

Make .env file completely optional #184

Closed cuducos closed 6 years ago

cuducos commented 6 years ago

What is the purpose of this Pull Request?

Fix bug #183

What was done to achieve this purpose?

I made all environment variables completely optional and adder a note on README.rst that only core developers uploading new datasets need to care about these variables.

How to test if it really works?

  1. Install the package from this branch (not from PyPI)
    pip install https://github.com/okfn-brasil/serenata-toolbox/archive/cuducos-optional-env-file.zip
  2. Run some example code from the README.rst such as:
from serenata_toolbox.chamber_of_deputies.reimbursements import Reimbursements as ChamberDataset

ChamberDataset('2018', '/tmp')()

It should work even without any environment variable configured. Trying to download from S3 should fail though: our S3 died and we are setting a new one (new PR to come).

Who can help reviewing it?

It was reviewed from the user poit of view by @vilapedro.

@mnunes: would you mind testing to check if it fixes the bug in your use case?

@irio, @anaschwendler and @jtemporal can offer a technical review too : )

TODO

anaschwendler commented 6 years ago

Just checked, and it is working, let's continue with the new bucket 🎉

mnunes commented 6 years ago

Nope, isn't working for me:

Example 1

>>>> from serenata_toolbox.datasets import Datasets
>>>> datasets = Datasets('data/')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/site-packages/serenata_toolbox/datasets/__init__.py", line 57, in __init__
      bucket=self.remote.bucket,
AttributeError: 'RemoteDatasets' object has no attribute 'bucket'

Example 2

>>> from serenata_toolbox.datasets import fetch, fetch_latest_backup
>>> fetch('2016-12-06-reimbursements.xz', 'data/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/serenata_toolbox/datasets/__init__.py", line 78, in fetch
    datasets = Datasets(destination_path)
  File "/usr/local/lib/python3.6/site-packages/serenata_toolbox/datasets/__init__.py", line 57, in __init__
    bucket=self.remote.bucket,
AttributeError: 'RemoteDatasets' object has no attribute 'bucket'
>>> fetch_latest_backup( 'data/')  # yep, we've just did exactly the same thing
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/serenata_toolbox/datasets/__init__.py", line 83, in fetch_latest_backup
    datasets = Datasets(destination_path)
  File "/usr/local/lib/python3.6/site-packages/serenata_toolbox/datasets/__init__.py", line 57, in __init__
    bucket=self.remote.bucket,
AttributeError: 'RemoteDatasets' object has no attribute 'bucket'

I'm using Python 3.6.4 (default, Mar 9 2018, 23:15:03) on macOS 10.13.3. Also, my serenata-toolbox version is 14.0.3:

$ pip freeze | grep serenata
serenata-toolbox==14.0.3
cuducos commented 6 years ago

Hi @mnunes – thanks for the feedback.

In fact the bug you pointed in #183 (decouple.UndefinedValueError: AMAZON_REGION not found.) seems to be fixed by this PR: examples 1 and 2 you shared shows a different error (AttributeError: 'RemoteDatasets' object has no attribute 'bucket').

This second error is related to the TODO list i shared in the opening post:

Setup a new bucket/storage (in a new PR)

Luckily this then new PR is already done and merged: #187.

Let us know if version 15 works from head to tail if you have the chance to test it ; )

mnunes commented 6 years ago

Version 15.0.0 is working like a charm. Thank you!