ActivitySim / activitysim

An Open Platform for Activity-Based Travel Modeling
https://activitysim.github.io
BSD 3-Clause "New" or "Revised" License
191 stars 98 forks source link

Compatibility Issues with Toolz (v11) #350

Closed danielsclint closed 3 years ago

danielsclint commented 3 years ago

I had to reinstall my ActivitySim environment, and I ran into a big compatibility issue with the Toolz (CyToolz) library. It looks like a more recent version (v11.1) has been pushed to the Conda / PIP warehouses.

When you run the tests against the new install version of toolz you get the following

       # Map registered variable labels to expressions.
        if not expressions:
            expressions = []
        offset = len(names) - len(expressions)
        labels_map = dict(tz.concatv(
>           tz.compatibility.zip(names[:offset], names[:offset]),
            tz.compatibility.zip(names[offset:], expressions)))
E       AttributeError: module 'tlz' has no attribute 'compatibility'

activitysim\core\orca.py:985: AttributeError
fscottfoti commented 3 years ago

https://github.com/UDST/orca/issues/50

jamiecook commented 3 years ago

Fixed in PR #349

josiekre commented 3 years ago

I have this problem right now too where we are testing with PopulationSim. We rebuild our virtual environment in a Docker frequently. Temporarily fixed it by fixing toolz to 0.10 until this pull request makes it into a release.

I thought I'd mention that I am testing pipenv to help manage deterministic builds. You might look into it for ActivitySim and PopulationSim so end-users can avoid this problem more easily. In theory, it makes build errors less of an emergency. https://realpython.com/pipenv-guide/

I forgot to mention the other options for this same thing: pip-tools and poetry. If you ever consider going down this path, feel free to quick in with me for an updated opinion.

jamiecook commented 3 years ago

@josiekre ... I'm not sure that pipenv would solve this type of problem would it? As far as I understand pipenv it allows you to specify version ranges for dependencies... but setup.py already does that

image

The problem is that the range is too open ended... it could be locked down with an upper bound, but that's less desirable. I would suggest that this is just an issue of a deprecation warning not being acted upon - compatibility has been deprecated for a while I believe.

josiekre commented 3 years ago

Ya maybe you're right for your use case. You're working on building a flexible library rather than production code. For our use case, we need a rebuildable environment (including deep dependencies) regardless of when we build it (repeatable build in production regardless of whether we install today or 6 months from now into a Docker). pipenv or some similar "lock" file aims to make this possible without having to manually track and resolve deep dependencies.

More context: we have a lot of problems with deep dependencies getting installed with newer versions of packages via pip depending on when the build process runs inside Docker. In this specific case, we import populationsim which imported activitysim which imported toolz. The build was tested on x date and built successfully. On y date, the same exact code will not build successfully anymore even using a Docker because pip pulls the latest toolz live rather than from a snapshot repository (like one can do with R). This makes reproducible simulations/research very difficult and a pain to debug when adding hotfixes into production. Using pip env or an old school pip freeze > requirements-prod.txt including all deep dependency versions seems to be only way to guarantee a rebuildable virtual environment that does not depend on others being on top of their game with deprecation warnings.

josiekre commented 3 years ago

Here's where it might become relevant for you: ARC's or ODOT's model turns out to have a small bug they need to fix for their deadline coming up in two weeks. Their model depends on a release of activitysim from three years ago that was installed on their machine 1 year ago, and they need to deploy the exact same setup plus this one little fix to their own wrapper.

They might be lucky and be inside a Docker but they will still need to pip install to fix the bug, and pip is going to grab the newest versions of dependencies of requests, like idna or pysocks. Someone is going to be pulling their hair out trying to whack-a-mole deep dependency differences that have been updated since by comparing pip freeze files and seeing which versions need to be locked down to get it to build/run. (In this case requests is playing nicely with a Pipfile.lock so this exact thing wouldn't happen, but you get the idea.)

jamiecook commented 3 years ago

@josiekre - I think you are right, pip freeze and pip env lock files are the only real way to solve that problem within the python ecosystem. To be clear, I work in the same space as you - modelling systems that utilise popsim/actsim and we have the same problems and we reached for Pipenv to solve them as well. Having something that used to work not work anymore because of some third party library upgrade is the most frustrating experience - I was just lucky that I was able to download the code for activitysim and make the fix locally (and then contribute it back here). Not so many modellers have the skills to do the same and will as you say, resort to hair pulling.