Closed nsoranzo closed 5 years ago
Right now this is going to break a small but significant portion of tools. IMO we should not make an announcement before we have a route for generating command lines for legacy tools in a Python 2 sandbox (or other workarounds).
+1 though for announcing a target release that should do this.
On a similar note, the latest release of numpy (1.16) dropped support for Python 3.4. We should do the same for release_19.05, and also announce this in the 19.01 release notes.
Also msgpack dropped support for Python 3.4 some time ago.
Right now this is going to break a small but significant portion of tools. IMO we should not make an announcement before we have a route for generating command lines for legacy tools in a Python 2 sandbox (or other workarounds).
👍. I don't know that we can announce a deprecation time before we recommend people start upgrading and I don't think Nate and James would be happy recommending Python 3 until we have sandboxed Python 2 evaluation of tool command lines.
Can you name these tools? It's difficult to reason about this otherwise.
bowtie2 for starters, see https://github.com/galaxyproject/tools-iuc/pull/2168
The thing is that we don't know for sure until we start testing. As far as I remember I only had to update the history import/export tool of all the tools shipped by Galaxy, but the IUC tools are on average more complex than what we ship with Galaxy, so I think the percentage might be pretty high.
Thanks @mvdbeek. The problem with pointing to a tool is that you will fix it 😄. It is a good problem to have for sure, but I don't think simply fixing new versions of all the public tools will be considered sufficient. It probably would be for me, but I tend to be more pragmatic on these issues. I think the two people I mentioned, plus say Dan for instance, will want private tools and older versions of tools to still work properly. Next team meeting I can try to pin people down and see if we can find some consensus though.
Totally agree there, if you need to check an old analysis you need to check an old analysis. But it's also a question of time passed. The older the analysis, the less likely I need to come back, so the sooner we verify Python 3 works for widely used tools, the sooner we can drop Python 2 support (or whatever workaround is necessary).
From https://github.com/galaxyproject/galaxy/pull/7308#issuecomment-464768519 :
I think this is too late to include such a commitment in our release notes without a broad consensus amongst the committers.
@martenson Agreed that we have to find a consensus. On the other hand, https://pythonclock.org/ reminds us that we are 10 months and 13 days away from Python 2.7 retirement, so I'd expect a greater engagement on the issue, especially from people supposedly against the deprecation notice.
I see 3 options:
I think option 2 is a reasonable compromise that will give deployers the advance notice many will need.
@nsoranzo does your timeline mean that we add now (19.01) and fix before next release (19.05)?
Yes, hopefully. We would now announce that Galaxy 20.01 will be the last release supporting Python 2.7, or something like that.
- tools may have issues when running Galaxy under Python3, which we will fix before next release (assign a Core Team member to this).
Just to clarify, the core team member would implement a Python 2 sandbox? That is a pretty big effort, I think. Just fixing the tools isn't enough IMO.
Yes, hopefully. We would now announce that Galaxy 20.01 will be the last release supporting Python 2.7, or something like that.
As a developer I would of course love this, but it doesn't seem realistic. We wouldn't do this to admins, I assume, and if we're going to, we need to put a lot of resources into updating infrastructure (ansible, docker, etc.) that I don't see us doing right now. We started a conversation on dropping 2.6 support 3 years after EOL (https://github.com/galaxyproject/galaxy/issues/1596). I would hope we can drop support for 2.7 quicker, but I'm not sure I see Jan 1st 2020 as a terribly pressing or realistic deadline.
https://utcc.utoronto.ca/~cks/space/blog/python/Python2AndLTSLinuxes
because there are other long term Linux distributions that are already committed to supporting systems with Python 2 past 2020 (for example, Red Hat Enterprise Linux 7, which will be supported through 2024, and then there's Ubuntu 18.04 itself, which will be supported through 2023).
How about we announce that deprecation (and thus preferred use of 3.5+) begins in 19.05 and dropping support in 20.01? As long as we have a Python 2 tool sandbox by 19.05, I think this is ok. We should be able to get the Ansible stuff updated by 19.05. I see no reason not to try to hit/beat the 2.7 EOL date.
How about we announce that deprecation (and thus preferred use of 3.5+) begins in 19.05 and dropping support in 20.01? As long as we have a Python 2 tool sandbox by 19.05, I think this is ok.
Fine, this is good news but I do think we should have the sandbox implemented before announcing.
I see no reason not to try to hit/beat the 2.7 EOL date.
+1
I do think we should have the sandbox implemented before announcing.
That's why I suggested a "preannouncement" of "it will be deprecated next release, at which point we expect everything to work." Just so people are aware now that they should be thinking about transitioning.
We shouldn't promise a sandbox... we should implement a sandbox and then send a note about it to -dev or whatever and announce in the following release. I don't like promising things pretty much ever but especially when we are unsure of the scope, feasibility, performance, and we don't even have resources assigned for the task.
Just to clarify, the core team member would implement a Python 2 sandbox? That is a pretty big effort, I think. Just fixing the tools isn't enough IMO.
Yes, that's why I think it's something only the Core Team can afford to develop.
I would hope we can drop support for 2.7 quicker, but I'm not sure I see Jan 1st 2020 as a terribly pressing or realistic deadline.
Unfortunately it's not just Python, but an important part of the ecosystem, as mentioned in the issue description: https://python3statement.org/#sections40-timeline
I don't like promising things pretty much ever but especially when we are unsure of the scope, feasibility, performance, and we don't even have resources assigned for the task.
Fair 'nuff.
BTW, for my purposes, conda deps of old tools requiring Python 2 will continue to work, and even older pre-conda versions can just use my hand-installed Python 2, I just need a way to configure that.
Unfortunately it's not just Python, but an important part of the ecosystem, as mentioned in the issue description: https://python3statement.org/#sections40-timeline
Certainly this will become a serious problem for us over time, but realistically I think we will be okay for some time. Our dependency situation is much better than it was 3 years ago: we're much more up to date with the latest versions of all our dependencies and we still haven't hit this in a serious way. This will become an increasingly strong counter pressure against continuing to maintain 2.7 support, but my thinking based on our Python 2.6 experience is we should wait until we feel the pressure rather than assume it is coming and going to hit us hard. I'm more excited to write Python 3-only code than I'm worried we will be staring at a Python 3-only dependency that we will need to adopt.
It's the cheetah templating that can make use of HDA methods or the wrapped variant thereof that I think will be difficult to deal with, not the dependencies per se.
This will become an increasingly strong counter pressure against continuing to maintain 2.7 support, but my thinking based on our Python 2.6 experience is we should wait until we feel the pressure rather than assume it is coming and going to hit us hard.
I don't think 2.6 and 2.7 deprecations can be fairly compared, 2.7 being the last remaining release of the 2.x series.
Would like to add a +1 to committing to as aggressive a timeline as possible for python 2.7's departure. Mainly because many of the packages we're involved in maintaining support python 2.7 only because Galaxy has not made a decisive move yet. For example, cloudbridge's tests alone would save 1 hour sans py27. I'm sure there would be similar ripple effects throughout the ecosystem, including on ansible playbooks etc. This would save us a lot of time indirectly, hence why personally, I'd like to see a more aggressive timeline.
Using my nearly infinite powers as release manager to push this off on to the 19.05 release manager. Sorry for the people who are disappointed by this.
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
We discussed with @natefoo a bit which spawned this idea:
Could we capture the most common operations/constructs that are used in tools' Py2 template that are incompatible with Py3 and create a pre-compile step for the template that will patch them on the fly?
And then avoid the sandbox, which looks like a much bigger solution for something that we want to only have in place temporarily. New instances of Galaxy do not even want to have the sandbox or the on-the-fly patching. They just want new patched tools.
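As a rough illustration of the pre-compile idea above, here is a minimal sketch that rewrites two well-known Python 2 idioms in template text before it is compiled. The rule list, the `patch_template` name and the sample template are assumptions made up for this example, not something taken from Galaxy or from real tools, and plain regular expressions like these would certainly miss or mangle some cases.

```python
import re

# Hypothetical rewrite rules: (compiled pattern, replacement).
# They cover only two Python 2 idioms and are illustrative assumptions.
PY2_TO_PY3_RULES = [
    # d.has_key(k)  ->  (k in d)   (dict.has_key was removed in Python 3)
    (re.compile(r"(\$?[\w.]+)\.has_key\(([^()]+)\)"), r"(\2 in \1)"),
    # d.iteritems() / d.itervalues() / d.iterkeys()  ->  d.items() / ...
    (re.compile(r"\.iter(items|values|keys)\(\)"), r".\1()"),
]


def patch_template(source):
    """Return the template text with every rewrite rule applied in order."""
    for pattern, replacement in PY2_TO_PY3_RULES:
        source = pattern.sub(replacement, source)
    return source


# A made-up legacy template fragment using both idioms.
legacy = "#if $options.has_key('x')\n#for $k, $v in $params.iteritems()\n"
patched = patch_template(legacy)
print(patched)
```

A real implementation would need the list of constructs to come from actually testing tools, which is the open question in this thread.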
natefoo says:
Martin and I just had a big discussion about the Cheetah sandboxing and I'm unsure of whether it's actually less work than 1. developing a way to fix old versions of tools in the TS and 2. making all versions of supported IUC and devteam tools 2+3 compatible. Because the sandbox seems to me like a massive amount of work considering we'd need to serialize stuff like DatasetFilenameWrapper and anything that its methods might need, etc. And $__app__ we can't possibly support in the sandbox beyond maybe a few commonly used attributes. Alternatively we could make accessing objects in the sandbox interface with Galaxy via some kind of private API, maybe?
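To make the serialization concern concrete, here is a minimal sketch of passing a wrapper-like object across a process boundary as JSON. `FakeDatasetWrapper`, `SandboxDataset`, the attribute whitelist and the sample values are all hypothetical stand-ins invented for this example; Galaxy's real DatasetFilenameWrapper has far more behaviour than this.

```python
import json


class FakeDatasetWrapper(object):
    """Hypothetical stand-in for a dataset wrapper; not Galaxy's real class."""

    def __init__(self, file_name, ext, metadata):
        self.file_name = file_name
        self.ext = ext
        self.metadata = metadata


class SandboxDataset(object):
    """Plain attribute container rebuilt on the sandbox side (valid on 2 and 3)."""

    def __init__(self, **attributes):
        self.__dict__.update(attributes)


# Assumed whitelist of attributes a template would be allowed to read.
EXPOSED_ATTRIBUTES = ("file_name", "ext", "metadata")


def serialize_wrapper(wrapper):
    """Flatten the whitelisted attributes into a JSON string."""
    return json.dumps({name: getattr(wrapper, name) for name in EXPOSED_ATTRIBUTES})


def load_in_sandbox(payload):
    """Rebuild a read-only-by-convention copy from the JSON string."""
    return SandboxDataset(**json.loads(payload))


wrapper = FakeDatasetWrapper("/data/dataset_1.dat", "fastq", {"sequences": 42})
restored = load_in_sandbox(serialize_wrapper(wrapper))
print(restored.file_name, restored.ext, restored.metadata["sequences"])
```

Only plain data survives the round trip. Any method a template calls on the original object would have to be reimplemented or proxied on the sandbox side, which is where the effort described above comes from.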
we want to only have in place temporarily
I think it's worth noting that our solution actually needs to be semi-permanent. New tools should be written with syntax that is valid for both 2 and 3 (and maybe a tool profile version that is 3-only). But the problem here is old versions of tools. At present we have no proper way to fix them.
I don't expect every tool that I installed back in 2010 to still be working in 2025. And at some point we have to limit our efforts to keep ancient versions of tools working. But especially since the conda era, most tools can potentially work for a long time, we don't want to intentionally do something to break them if we can help it.
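For the point about writing new tools with syntax that is valid for both 2 and 3, a few portable constructs are sketched below in plain Python. The `params` dictionary and variable names are invented for illustration. The same expressions would apply inside a template, but that is an assumption rather than something tested against Cheetah here.

```python
# Made-up parameters standing in for tool inputs.
params = {"threads": 4, "mode": "fast"}

# Use "in" instead of dict.has_key(), which was removed in Python 3.
has_threads = "threads" in params

# Use items() instead of iteritems(); sorting gives a stable order on both.
pairs = sorted(params.items())

# Use // for integer division: "/" floors ints on Python 2 but not on Python 3.
half = params["threads"] // 2

# str.format with explicit indices works on both major versions.
label = "{0}:{1}".format("threads", params["threads"])

# A single-argument print(...) call behaves the same under both.
print(label)
```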
One of the first things to do is to assess what portion of tools is relying on py2. We could start an instance of py3 Galaxy, install all of IUC & devteam tools (maybe even all revisions?) and run tool tests.
I could just update test.galaxyproject.org to use Python 3. It doesn't have the exact same list of tools at the exact same versions as Main, but it'd be a start.
That is what we discussed last month on the committer call, right? I do think the sandbox is implementable though, especially since @jmchilton put in some work to make some parts of the model serializable. We'd probably need to be able to serialize tool parameters as well, but again that seems doable. In any case, ephemeris has seen some nice updates to remote tool testing, so if we get test.galaxyproject.org running on Python 3 we can surely check the status of the current tools. We'll have to do this anyway.
and maybe a tool profile version that is 3-only
💯
That is what we discussed last month on the committer call, right?
I think so. We put it on hold for test.* because we did not want to add tasks for Nate, but now that he volunteered...😎
Test is now running on Python 3.6, but handlers won't start (follow along here: galaxyproject/usegalaxy-playbook#206).
Wondering if there's any update here?
@dhalperi What would you like to know exactly? Galaxy now works fine under Python3, but some old tools may break.
Any plan to deprecate/remove support for Python 2?
@dhalperi have you read this issue?
Yes I read this issue. It was last updated 4 months ago with "we plan to do some stuff and mark it deprecated in the (jan, may, sept) releases". Additionally, it's clear from the prior comments that much of the communication happens out of band and may not be reflected here.
Since it's been four months, asking if there's an update on the plans.
@dhalperi I think the consensus is currently to deprecate soon, probably this release (19.09).
My take on the future of py2 support is that we will probably support it until the very end, i.e. until the py2 version of some of our dependencies becomes a security risk or unstable. This is still under debate though.
In the meantime, a lot of effort has been put into supporting py3. If I were to set up a production Galaxy instance now, I would use py3.
Thank you very much :)
A lot of Galaxy dependencies are in the process of dropping support for Python 2, see e.g. https://python3statement.org/
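Even pip now (since release 19.0) displays the following warning:
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
If we keep supporting Python 2.7, we will have to pin dependencies to old and potentially broken/insecure versions.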
I suggest we add a deprecation notice to the release notes (for 19.01 if possible), specifying which Galaxy release is going to be the last supporting Python 2.7.
We should also start moving our test servers (e.g. https://test.galaxyproject.org/ ) to Python 3, to iron out the last bugs. This may require updating some Ansible playbooks and roles.
xref. #1715