openwallet-foundation / acapy-plugins

aries-acapy-plugins
https://hyperledger.github.io/aries-acapy-plugins/
Apache License 2.0
4 stars 26 forks source link

Plug-In dependency managament and version mismatch issues #28

Closed esune closed 10 months ago

esune commented 11 months ago

The Problem

Each plug-in declares a pyproject.toml with a broad set of dependencies - including ACA-Py - and a lockfile generated at the time of the last code update. When the plug-in code is installed on an image (or other system) where those dependencies already exist, poetry will ensure the dependencies declared in the latest pyproject.toml/lockfile are respected and override previously installed dependencies altogether. This is true for all dependencies, but particularly bad for ACA-Py as it will cause the runtime to potentially have a version different than the one targeted (i.e.: different than what the official image tag declares) and cause issues.

What to do?

While complex dependency management would be ideal, I also believe it is unlikely that we will want to create (and maintain) a dependency resolution system, especially since dedicated projects such as pip and poetry still seem to not have nailed this down.

What I would suggest is a mix of common sense and precautions when declaring dependencies: new dependencies in a plug-in should be kept at a minimum level, and dependencies that may conflict with the base system (such as ACA-Py) should be declared as dev dependencies rather than "regular" dependencies. This makes sense as plug-ins are not meant to be run standalone, and expect ACA-Py to be available on the system they are being installed on, and would prevent the installed version from being overridden at plug-in installation time.

This would still leave the issue for plug-ins that rely on a different (outdated or newer) version of ACA-Py to run: the solution in this case is simple: update the plug-in to run with the newer version of the dependency, or use an older image matching the requirement, whichever makes more sense for each use case.

A review of the currently available plug-ins and adjustments to each pyproject.toml, followed by the generation of new lockfiles would be the only required task to complete this.

esune commented 11 months ago

Tagging @dbluhm @jamshale @usingtechnology @WadeBarnes @Jsyro @shaangill025 for input/feedback on this.

usingtechnology commented 11 months ago

I like your thinking here. Since we are advocating the plugin be installed on an ACA-Py image then we can move the aries-cloudagent dependency from main to dev or its own group (similar to the integration dependencies).

dbluhm commented 11 months ago

I think moving ACA-Py to an optional dependency would mean that we miss out on the "power" of dependency resolution provided by systems like poetry. If we have plugin X and plugin Y and X works for any ACA-Py version greater than 0.9.0 but Y only works on exactly 0.10.3, poetry can solve for that and make sure ACA-Py 0.10.3 is installed. Or, in the case that your main project needs ACA-Py 0.10.4, it will fail because there was no way to solve the requirements.

The poetry.lock is not an expression of required versions of a package; it's the result of a previously run solution so we don't have to solve the full dependency tree every time we install or add a new package.

Setting up an image with ACA-Py and a set of plugins should probably be managed by a separate pyproject.toml that requires ACA-Py at the desired version and all the plugins and their respective versions. This will ensure that installations performed serially aren't clobbering dependencies of the last installed plugin and will also make it obvious when there are real conflicts between the plugins and not just the versions of the dependencies that happened to be selected when there were no other dependencies to consider by the last installation. Then the generated poetry.lock can be used in the image build to install a solution to that dependency tree at that point in time.

I feel like I might be missing something though -- let me know if I'm off base.

esune commented 11 months ago

Setting up an image with ACA-Py and a set of plugins should probably be managed by a separate pyproject.toml that requires ACA-Py at the desired version and all the plugins and their respective versions

I agree with your point about losing the dependency resolution bit. I think it might go down on "how" we are setting projects up, and what the expectation is on this repo to be used as.

The big problem (that I was focusing on) is the one we're currently experiencing with using the official ACA-Py images and having plug-ins "change" its contents unexpectedly because of dependency resolution.

As an example: I start with an ACA-Py 0.10.3 base image, and install the plug-in from the repository onto it (this would be my expectation, unless we are able to release plug-in packages to pypi or as GH packages). The plug-in code, however, happens to be outdated and depends on ACA-Py 0.9.0: when running the poetry install, the ACA-Py package will be downgraded from 0.10.3 to 0.9.0 and will cause unexpected problems (and confusion, since the referenced image tag would be at that point misleading). Add a couple more plug-ins installed one after the other and the variance in results increases exponentially.

dbluhm commented 11 months ago

I see what you mean :thinking: We've also experienced this before when using the ACA-Py images. Often, I'll just build from a base python image instead of working with the pre-built ACA-Py image. This is even easier now that we no longer depend on von-image to get Indy dependencies. We've had good success with this strategy but it would be nice to be able to use ACA-Py's image as a base.

dbluhm commented 11 months ago

Perhaps for release images we should use multistage builds. Stage 1 can pull from a standard python image and pull in the plugin code, install poetry, etc. This stage would conclude by running poetry build. Then, Stage 2 can pull from the standard ACA-Py image, copy the wheel from stage 1, install it as a standard pip package and then otherwise leave the ACA-Py image unchanged.

Would that solve the problem we're experiencing? ... Come to think of it, this strategy might still try to install a different version of ACA-Py in the process of installing the plugin wheel.

WadeBarnes commented 11 months ago

This is tricky ... What I've done and seen with plugins in the past (not python though). The main application (aca-py in this case) is installed (think official aca-py image). Plugins are either installed or simply copied into a given (known) folder, in such a way they do not impact the main application or it's dependencies (they can define their own over and above) - wrapper image. When the plugin is first invoked (or initialized) it checks the version of the main application and throws an error and fails if it is not compatible with the main application - runtime.

It's sort of a lazy loaded reverse dependency check done by the plugin. Not ideal that the failure happens at runtime though.

WadeBarnes commented 11 months ago

Another thought ... Is there a way to wire custom code into the install/setup process with Poetry (like setup.py allowed)? If so add some code that checks to ensure a compatible version of ACA-Py is installed and fail the install if not compatible. This code could be generic and simply included in each plug-in.

Perhaps using the [tool.poetry.scripts] feature?

usingtechnology commented 11 months ago

not sure there is much we can do other than due diligence with plugins in this repository. no matter what precautions we take if someone adds a 3rd party plugin all bets are off.

the issue is on the install, so how do you tell pip/poetry NOT to install the plugin and its dependencies at build time?

one suggestion for support would be the /plugins admin server API should print out plugin names and versions. i think we enhance plugins to relay their version and their known good ACA-Py version? if someone is getting weird behaviour at least we could see if they've installed a 3rd party plugin or an ill advised version.

can we tag this repo with ACA-Py versions too? I'm assuming that at some point we will be publishing individual plugins to pypi but these can get installed from github, so if we tag the repo with an ACA-Py version then you could install using that tag (ex. 0.10.3) like:

FROM ghcr.io/hyperledger/aries-cloudagent-python:py3.9-0.10.3
...
RUN pip install git+https://github.com/hyperledger/aries-acapy-plugins@0.10.3#subdirectory=basicmessage_storage
RUN pip install git+https://github.com/hyperledger/aries-acapy-plugins@0.10.3#subdirectory=connection_update
...
usingtechnology commented 11 months ago

could we investigate having the ACA-Py (and specific version) in the plugin's build-system? i haven't done it, but maybe that will work to ensure that the plugin (and dependencies) are only installed if that specific requirement is met?

plugin pyproject.toml

[build-system]
requires = [aries-cloudagent=0.10.3]
...
dbluhm commented 11 months ago

Another thought ... Is there a way to wire custom code into the install/setup process with Poetry (like setup.py allowed)? If so add some code that checks to ensure a compatible version of ACA-Py is installed and fail the install if not compatible. This code could be generic and simply included in each plug-in.

Perhaps using the [tool.poetry.scripts] feature?

Part of the Python community's motivation to using pyproject.toml instead of setup.py was actually to explicitly prevent this, running custom code on install. We could define a script but I think we'd still have to specifically call it.

can we tag this repo with ACA-Py versions too? I'm assuming that at some point we will be publishing individual plugins to pypi but these can get installed from github, so if we tag the repo with an ACA-Py version then you could install using that tag (ex. 0.10.3) like:

Not all plugins in the repo will necessarily support the same version of ACA-Py at the same time so tagging according to ACA-Py version doesn't seem like it will always work.

dbluhm commented 11 months ago

could we investigate having the ACA-Py (and specific version) in the plugin's build-system? i haven't done it, but maybe that will work to ensure that the plugin (and dependencies) are only installed if that specific requirement is met?

plugin pyproject.toml

[build-system]
requires = [aries-cloudagent=0.10.3]
...

https://python-poetry.org/docs/1.6/pyproject#poetry-and-pep-517

Docs suggest this wouldn't behave as intended; if we're using poetry, the build-system needs to point to poetry-core.

dbluhm commented 11 months ago

I think the best solution I can think of is to not use the ACA-Py image as a base image -- or, more specifically, don't rely on the version of ACA-Py included in a base image --and define all dependencies in a pyproject.toml + poetry.lock specific to that image.

usingtechnology commented 11 months ago

thanks for those notes about the scripts and the build-system. sigh... i'm all out of (simple) ideas. other than our due diligence. although I do like the idea of making the /plugins API a little more informative.

usingtechnology commented 11 months ago

I think the best solution I can think of is to not use the ACA-Py image as a base image -- or, more specifically, don't rely on the version of ACA-Py included in a base image --and define all dependencies in a pyproject.toml + poetry.lock specific to that image.

so we would leave aries-cloudagent as a main dependency for each plugin? or do we have pyproject/poetry.lock for a complete "official" image, where it would have aries-cloudagent as a dependency and all the plugins?

[tool.poetry]
name = "acapy_plus"
version = "0.10.3"
...
[tool.poetry.dependencies]
aries-cloudagent = "0.10.3"
basicmessage_storage = "0.1.0"
connection_update = "0.1.0"
...
WadeBarnes commented 11 months ago

I think the best solution I can think of is to not use the ACA-Py image as a base image -- or, more specifically, don't rely on the version of ACA-Py included in a base image --and define all dependencies in a pyproject.toml + poetry.lock specific to that image.

Defeats the purpose of publishing an official base image. 😞

WadeBarnes commented 11 months ago

What about using this feature; https://python-poetry.org/docs/master/plugins/

esune commented 11 months ago

I think the best solution I can think of is to not use the ACA-Py image as a base image -- or, more specifically, don't rely on the version of ACA-Py included in a base image --and define all dependencies in a pyproject.toml + poetry.lock specific to that image.

Defeats the purpose of publishing an official base image. 😞

I'm with @WadeBarnes on this one. I think all we can do is limiting the amount of havoc a mismatched dependency can cause, but we won't be able to make it completely fool proof.

swcurran commented 11 months ago

Comment from a clueless person. Do we have two use cases here?

One is getting the dependencies right for a run of a plugins integration tests. It should run with the latest version of ACA-Py for which it has been explicitly updated. Since there is a docker-compose.yaml file for each, this should be relatively easy to do.

The second is for a deployment where a deployer must run a specific version of ACA-Py and all of the plugins that they need. For that, the plugins are brought in with their “above and beyond ACA-Py” requirements. The deployer may need to update a plugin (yay!) to bring it in line with the newer version of ACA-Py they want to use. This would serve to show what plugins are valuable — the ones that are upgraded to support later versions. Even a PR that only says “Hey, this plugin also supports version x.y.z” would be helpful in that case.

A nice-to-have might be creating a tag with a plugin name and ACA-Py version at such times, to help others that are OK with not running the latest version of ACA-Py. They pull out a plugin that is specific to that version of ACA-Py.

I think the overhead of that would be two pyproject.toml files — one for the integration tests for the plugin, and one (perhaps a snippet of a toml file?) for deployers to use.

Is that way off?

esune commented 11 months ago

@swcurran I think the use case is the same, and relates to how plugins might affect dependencies that were installed before the plugin installation was executed, including - but not limited to - a previously installed version of ACA-Py (like in the official images).

The two ways of viewing this based on how you want to build your service:

There are challenges in both approaches, and they all are related to how dependencies will be resolved when installing a plugin in a non-empty environment regardless of whether it is a pre-build ACA-Py image, or adding a plugin to an environment where other plugins with potentially different requirements already exist.

swcurran commented 11 months ago

Hmmm…I don’t see why the plugins repo would cater to the latter. Any given ACA-Py instance, is made up of ACA-Py plus 0 or more plugins, and it’s the responsibility of the deployer to make sure that any required plugins all work at the chosen level of ACA-Py. If they have to update (or request an update to…) the plugin to get what they want — all the better for the rest of community.

The only place I can see where we should assume that the plugin always runs alone (and can therefore dictate the dependencies) is in the integration tests for the plugin.

jamshale commented 11 months ago

My thinking was similar to @swcurran. We keep this repo updated and tested every time aca-py updates and then create a tag so the main aca-py application can easily update it's plugins (or use latest, probably not a good idea).

It would be work, for sure, to keep it updated all the time. Especially when there is breaking changes. That's why having that repo management script will be handy and making sure all the plugins are tested automatically.

I just think that would be the simplest solution. It would be up to the people managing the application to also update their plugins. You could at least log a warning if the main applications agent and plugin versions differed.

dbluhm commented 11 months ago

I think I've come back around to agreeing with @esune's original proposal of specifying ACA-Py outside of the main project dependencies. I have a suspicion that there may be some hidden gotchas in that approach but I think it's worth a go.

However, a suggested amendment to @esune's original proposal: rather than specifying it as a dev dependency, can we do it as an extra? That way, if the plugin is being installed in a standalone image, the image builder can tack on the acapy extra for the plugin which will install the plugin's "preferred" version of ACA-Py from PyPI.

esune commented 11 months ago

I think I've come back around to agreeing with @esune's original proposal of specifying ACA-Py outside of the main project dependencies. I have a suspicion that there may be some hidden gotchas in that approach but I think it's worth a go.

However, a suggested amendment to @esune's original proposal: rather than specifying it as a dev dependency, can we do it as an extra? That way, if the plugin is being installed in a standalone image, the image builder can tack on the acapy extra for the plugin which will install the plugin's "preferred" version of ACA-Py from PyPI.

Yep, that seems reasonable and better than using a dev ependency - good suggestion! I also agree on the "hidden gotchas" piece, but it's one of those scenarios where we will only know if we try 😅