Closed matthewfeickert closed 2 years ago
@audrium @mvidalgarcia @tiborsimko, as you were all involved in the initial pinning of these dependencies in PR #264 and PR #270 and https://github.com/reanahub/reana-workflow-engine-yadage/issues/192, do you have any objections to the lower bound proposal? If not, I have a branch on my fork with the above change that is passing CI and is ready to PR.
@matthewfeickert The reason for pinning was to have the very same Yadage version on the REANA client side and on REANA the cluster side. The reana-workflow-engine-yadage
component obviously works with a precise yadage/adage version combination only, and we wanted to be sure to have the very same versions on the client side as well, e.g. for workflow validation.
That said, there are several possible solutions:
If adage/yadage packages use semantic versioning strictly, so that we can rely on a fact that 0.20.N will have the same API for any N, then we can perhaps pin in r-commons
in a manner like ">=0.20,<0.21" and then have a strict pin "==0.20.1" in r-w-e-yadage
only. Will this do?
Otherwise, using two different Python environments, one for RECAST and one for REANA, should solve this nicely as well, albeit in a manner that is not very convenient to users.
Otherwise another solution would be to release reana-client
as an AppImage (see https://github.com/reanahub/reana-client/issues/390) so that there would be no Python dependency clash either. This would obviously work if users aim to use reana-client
CLI API and not Python API.
The reason for pinning was to have the very same Yadage version on the REANA client side and on REANA the cluster side. The
reana-workflow-engine-yadage
component obviously works with a precise yadage/adage version combination only, and we wanted to be sure to have the very same versions on the client side as well, e.g. for workflow validation.
Thanks for the super fast reply, @tiborsimko! (Though we can of course follow up more when it isn't the weekend for you — I wasn't expecting a reply now, just getting thoughts out of my head.)
Maybe we should back up a bit first so that I understand where the REANA team is coming from. Are reana-client
and reana-commons
libraries or applications? I was under the impression they were libraries, but the above makes it sound like you're not deploying them in a custom environment with a lock file or something so I'm now unsure. It is probably best to understand how the REANA team sees them so that I don't make wrong assumptions.
That being said, with regards to the proposed solutions:
- If adage/yadage packages use semantic versioning strictly, so that we can rely on a fact that 0.20.N will have the same API for any N, then we can perhaps pin in
r-commons
in a manner like ">=0.20,<0.21" and then have a strict pin "==0.20.1" inr-w-e-yadage
only.
I can pretty confidently say that the API won't be changing in any patch releases. No one is perfect with SemVer, but @lukasheinrich and I take it pretty seriously. However, this discussion of placing upper bounds that are effectively pins and then actually pinning is again confusing if these are supposed to be libraries. So I would like to avoid any solution that imposes these sorts of restrictions in setup.py
or setup.cfg
.
- Otherwise, using two different Python environments, one for RECAST and one for REANA, should solve this nicely as well, albeit in a manner that is not very convenient to users.
This would mean that recast-atlas
would have to drop support for the reana-client
backend/integration as reana-client
would be uninstallable in the same environment (as it is now). I'd like to avoid this as much as possible.
- Otherwise another solution would be to release
reana-client
as an AppImage (see installation: non-python-venv based installation way reana-client#390) so that there would be no Python dependency clash either. This would obviously work if users aim to usereana-client
CLI API and not Python API.
I think this again comes down to library vs. application (seeming to imply application). I don't have super strong feelings here, and I also admittedly have very limited use with AppImage so some of the following questions might not make sense (apologies if so), but:
recast-atlas
is just using the reana-client
CLI API, so we don't currently need to have the Python API, but in my mind it seems nice to still have access to it (though this might be out of scope in general). @lukasheinrich probably has more intelligent/informed things to say here than I do.In general, I apologize if I'm rehashing any discussions that have been had already in the past or if I'm missing anything super obvious. My main goal is just to make sure that recast-atlas
can work with REANA, so anything that moves that forward is great. :+1:
Are reana-client and reana-commons libraries or applications?
I'd say more the latter, as far as workflow engine versions such as CWL or Snakemake or Yadage are concerned. This is because we don't support more than one Yadage version on the cluster side right now. Example: if REANA supports Yadage 0.20, but there is a lot of development in Yadage and say Yadage 0.21, 0.22 and 0.23 are born and use different standard of feature set or API each, then surely we won't be able to run such workflows on the REANA cluster. Hence it wouldn't make sense to allow these "incompatible" Yadage versions to be used in the client environment either. We cannot "promise" that such a combination would work. Hence more an app than a library.
Ideally, if we would like to turn reana-client
into a full-scale library, then the REANA cluster would have to have a way of running Yadage 0.20 workflows, Yadage 0.21 workflows, Yadage 0.22 workflows, Yadage 0.23 workflows, e.g. via different engine images loaded on the cluster side for the execution -- and then we could allow users to use any Yadage version in the client they wish and reana-client
detect which Yadage version users would like to use and instruct the cluster to "load" the appropriate version accordingly. But this is something that would require quite some development. Right now, the REANA cluster support only one CWL version, only one Snakemake version, and only one Yadage version, that are all pinned in reana-commons
.
I can pretty confidently say that the API won't be changing in any patch releases.
Great, then I think if we go for ">=0.20,<0.21" then this might be the fastest solution to the problem at hand!
As for your concern:
[...] placing upper bounds that are effectively pins
Yes, but I see it as a good thing. E.g. we did not pin click
dependency in REANA 0.7.x release series in the main repository that is having Helm charts in themaint-0.7
branches; we only used "click>=7" so the pin was without upper boundary. Now that there is Click version 8 out, and is not compatible with version 7, our CI actually does not pass anymore in maint-0.7
branches out-of-the-box. Granted, this is easy to fix should we need to do some more serious REANA 0.7.x release series work still, but for the best "automatic reproducibility of builds" in the future, having upper pins helps to keep things green. This can be important with current REANA 0.8, upcoming REANA 0.9, etc. We might have several "release series" to work with, and having them all "buildable" out of the box in the future simplifies the maintenance as well. (Alongside the "promise" reasons mentioned above which workflow versions are supported in each REANA release series.)
In my eyes, by declaring that REANA 0.8.x release series depends on Yadage 0.20.y release series, we promise we won't break anything mutually, and our builds will "just work" even when launched many years later. Then if Yadage 0.21 comes and is compatible with REANA 0.8, we can bump the upper pin as well. And if 0.22 comes and is not compatible anymore, we can release REANA 0.9. In this way the matrix which REANA version and which Yadage version are compatible may be nicely evolving together, for the benefit of both users (see the "promise" story above) and developers (see the "CI" story above).
So if we opt for semantic versioning of pins and we liberate the current ==0.20.1
hard pin into a looser >=0.20,0.21
pin, I would rather see it as a glass-half-full rather than glass-half-empty kind of situation.
AppImage [...] Is this something that the REANA team was already planning to do?
Yes, we were thinking about it, because some users had a wish to run reana-client
is some pretty old pre-existing Python environments, such as CMS open data old VM, or other users were confused when mixing with Rucio virtual environment... Having a statically linked binary would help to use reana-client
in any Python environment.
I have actually experimented with creating an AppImage for reana-client
in the past, and it was working well and gave a usable image. However I have not polished it for the public release. But we could revive this plan now... It would solve nicely the reana-client
CLI API issue, but not the Python API issue though.
Hi. :wave: For
reana-client
v0.8.0
theinstall_requires
requiresreana-commons[yadage,snakemake]>=0.8.0
and
reana-commons
v0.8.0
has all the dependencies for the'yadage'
extra pinned.https://github.com/reanahub/reana-commons/blob/352562345a68c2c391b786ec6ebae973fe64f65d/setup.py#L36
So there is no way to install
reana-client>=0.8.0
at the moment without also requiring these pinned versions.This is problematic as
adage
v0.10.2
, andyadage-schemas
v0.10.7
were made as part of a collection of patch releases so that a patch releasev0.1.9
ofrecast-atlas
could be made. However, these pinned versions means that there is no compatible way forpip
to solve an install that asks for both a newrecast-atlas
andreana-client
— c.f. https://github.com/recast-hep/recast-atlas/pull/86.Can the dependencies for the
yadage
extra please be changed to lower bounds only?(cc @lukasheinrich)