Minor dependency issues with pip #405

Closed karldw closed 4 years ago

karldw commented 4 years ago

Describe the bug

A basic pip install doesn't include all necessary dependencies.

To Reproduce

conda create --name pudl_test_env python=3.7 pip --yes
conda activate pudl_test_env
pip install catalystcoop.pudl
python3 -m pudl
#>   File ".../pudl/convert/", line 33, in <module>
#>     import pyarrow as pa
#> ModuleNotFoundError: No module named 'pyarrow'

Expected behavior

I expected a pip install to install everything necessary to import the pudl module. This could either be fixed by adding pyarrow as a dependency, or by only loading pyarrow in the functions when the user actually wants to run that code.

Software Environment?

zaneselvans commented 4 years ago

Hmm. IIRC I was trying to avoid loading pyarrow all the time because it has some compilation issues that were messing up the docs or tox on OS X or something. But this might be left over from before I figured out how to mock modules for the docs. You probably saw this already, but you can tell pip to install the parquet requirements with the parquet "extras". I will see about integrating pyarrow and parquet into the main install_requires. Thanks for being our guinea pig!
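
For anyone following along, here is a rough sketch of the two packaging layouts being discussed. The dependency lists are illustrative placeholders, not the actual contents of pudl's setup.py; only the extra name `parquet` and the package name come from this thread.

```python
# Hypothetical excerpt of the keyword arguments passed to setuptools.setup().

# Current layout (assumed): pyarrow is only pulled in by the "parquet" extra,
# i.e. by running: pip install "catalystcoop.pudl[parquet]"
current = {
    "install_requires": [],  # core dependencies would be listed here
    "extras_require": {"parquet": ["pyarrow"]},
}

# Proposed layout: pyarrow moves into install_requires, so a plain
# `pip install catalystcoop.pudl` installs it as well.
proposed = {
    "install_requires": ["pyarrow"],
    "extras_require": {},
}
```

Either dict would be unpacked into `setup(**kwargs)` alongside the usual metadata. The second layout makes the bare install sufficient, at the cost of always installing pyarrow.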

The master branch still depends on psycopg2, but it should be (have been?) removed from @cmgosnell's data-packaging branch, which no longer relies on PostgreSQL.

zaneselvans commented 4 years ago

Also yes, I will add badges for PyPI and conda-forge as soon as the package is available via conda.