galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Add deprecation notice for Python 2.7 support to release notes #7260

Closed nsoranzo closed 5 years ago

nsoranzo commented 5 years ago

A lot of Galaxy dependencies are in the process of dropping support for Python 2, see e.g. https://python3statement.org/

Even pip now (since release 19.0) displays the following warning:

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.

If we keep supporting Python 2.7, we will have to pin dependencies to old and potentially broken/insecure versions.

I suggest we add a deprecation notice to the release notes (for 19.01 if possible), specifying which Galaxy release is going to be the last to support Python 2.7.

We should also start moving our test servers (e.g. https://test.galaxyproject.org/ ) to Python 3, to iron out the last bugs. This may require updating some Ansible playbooks and roles.

xref. #1715

mvdbeek commented 5 years ago

Right now this is going to break a small but significant portion of tools. IMO we should not make an announcement before we have a route for generating command lines for legacy tools in a Python 2 sandbox (or other workarounds).

mvdbeek commented 5 years ago

+1 though for announcing a target release that should do this.

nsoranzo commented 5 years ago

On a similar note, the latest release of numpy (1.16) dropped support for Python 3.4. We should do the same for release_19.05, and also announce this in the 19.01 release notes.

Also msgpack dropped support for Python 3.4 some time ago.

jmchilton commented 5 years ago

Right now this is going to break a small but significant portion of tools. IMO we should not make an announcement before we have a route for generating command lines for legacy tools in a Python 2 sandbox (or other workarounds).

👍. I don't know that we can announce a deprecation time before we recommend people start upgrading and I don't think Nate and James would be happy recommending Python 3 until we have sandboxed Python 2 evaluation of tool command lines.

nsoranzo commented 5 years ago

Can you name these tools? It's difficult to reason about this otherwise.

mvdbeek commented 5 years ago

bowtie2 for starters, see https://github.com/galaxyproject/tools-iuc/pull/2168

mvdbeek commented 5 years ago

The thing is that we don't know for sure until we start testing. As far as I remember I only had to update the history import/export tool of all the tools shipped by Galaxy, but the IUC tools are on average more complex than what we ship with Galaxy, so I think the percentage might be pretty high.

jmchilton commented 5 years ago

Thanks @mvdbeek. The problem with pointing to a tool is that you will fix it 😄. It is a good problem to have for sure, but I don't think simply fixing new versions of all the public tools will be considered sufficient. It probably would be for me, but I tend to be more pragmatic on these issues. I think the two people I mentioned, plus say Dan for instance, will want private tools and older versions of tools to still work properly. Next team meeting I can try to pin people down and see if we can find some consensus though.

mvdbeek commented 5 years ago

Totally agree there, if you need to check an old analysis you need to check an old analysis. But it's also a question of time passed. The older the analysis, the less likely I need to come back to it, so the sooner we verify that Python 3 works for widely used tools, the sooner we can drop Python 2 support (or whatever workaround is necessary).

nsoranzo commented 5 years ago

From https://github.com/galaxyproject/galaxy/pull/7308#issuecomment-464768519 :

I think this is too late to include such a commitment in our release notes without a broad consensus amongst the committers.

@martenson Agreed that we have to find a consensus. On the other hand, https://pythonclock.org/ reminds us that we are 10 months and 13 days away from Python 2.7 retirement, so I'd expect a greater engagement on the issue, especially from people supposedly against the deprecation notice.

I see 3 options:

I think option 2 is a reasonable compromise that will give deployers the advance notice many will need.

martenson commented 5 years ago

@nsoranzo does your timeline mean that we add now (19.01) and fix before next release (19.05)?

nsoranzo commented 5 years ago

Yes, hopefully. We would now announce that Galaxy 20.01 will be the last release supporting Python 2.7, or something like that.

mvdbeek commented 5 years ago

  • tools may have issues when running Galaxy under Python3, which we will fix before next release (assign a Core Team member to this).

Just to clarify, the core team member would implement a Python 2 sandbox? That is a pretty big effort, I think. Just fixing the tools isn't enough IMO.

jmchilton commented 5 years ago

Yes, hopefully. We would now announce that Galaxy 20.01 will be the last release supporting Python 2.7, or something like that.

As a developer I would of course love this, but it doesn't seem realistic. We wouldn't do this to admins I assume - and if we're going to we need to put a lot of resources into updating infrastructure (ansible, docker, etc..) that I don't see us doing right now? We started a conversation on dropping 2.6 support 3 years after EOL (https://github.com/galaxyproject/galaxy/issues/1596). I would hope we can drop support for 2.7 quicker, but I'm not sure I see Jan 1st 2020 as a terribly pressing or realistic deadline.

https://utcc.utoronto.ca/~cks/space/blog/python/Python2AndLTSLinuxes

because there are other long term Linux distributions that are already committed to supporting systems with Python 2 past 2020 (for example, Red Hat Enterprise Linux 7, which will be supported through 2024, and then there's Ubuntu 18.04 itself, which will be supported through 2023).

natefoo commented 5 years ago

How about we announce that deprecation (and thus preferred use of 3.5+) begins in 19.05 and dropping support in 20.01? As long as we have a Python 2 tool sandbox by 19.05, I think this is ok. We should be able to get the Ansible stuff updated by 19.05. I see no reason not to try to hit/beat the 2.7 EOL date.

jmchilton commented 5 years ago

How about we announce that deprecation (and thus preferred use of 3.5+) begins in 19.05 and dropping support in 20.01? As long as we have a Python 2 tool sandbox by 19.05, I think this is ok.

Fine, this is good news but I do think we should have the sandbox implemented before announcing.

martenson commented 5 years ago

I see no reason not to try to hit/beat the 2.7 EOL date.

+1

natefoo commented 5 years ago

I do think we should have the sandbox implemented before announcing.

That's why I suggested a "preannouncement" of "it will be deprecated next release, at which point we expect everything to work." Just so people are aware now that they should be thinking about transitioning.

jmchilton commented 5 years ago

We shouldn't promise a sandbox... we should implement a sandbox and then send a note about it to -dev or whatever and announce in the following release. I don't like promising things pretty much ever but especially when we are unsure of the scope, feasibility, performance, and we don't even have resources assigned for the task.

nsoranzo commented 5 years ago

Just to clarify, the core team member would implement a Python 2 sandbox? That is a pretty big effort, I think. Just fixing the tools isn't enough IMO.

Yes, that's why I think it's something only the Core Team can afford to develop.

I would hope we can drop support for 2.7 quicker, but I'm not sure I see Jan 1st 2020 as a terribly pressing or realistic deadline.

Unfortunately it's not just Python, but an important part of the ecosystem, as mentioned in the issue description: https://python3statement.org/#sections40-timeline

natefoo commented 5 years ago

I don't like promising things pretty much ever but especially when we are unsure of the scope, feasibility, performance, and we don't even have resources assigned for the task.

Fair 'nuff.

natefoo commented 5 years ago

BTW, for my purposes, conda deps of old tools requiring Python 2 will continue to work, and even older pre-conda versions can just use my hand-installed Python 2, I just need a way to configure that.

jmchilton commented 5 years ago

Unfortunately it's not just Python, but an important part of the ecosystem, as mentioned in the issue description: https://python3statement.org/#sections40-timeline

Certainly this will become a serious problem for us over time, but realistically I think we will be okay for some time. Our dependency situation is much better than it was 3 years ago: we're much more up to date with the latest versions of all our dependencies and we still haven't hit this in a serious way. This will become an increasingly strong counter pressure against continuing to maintain 2.7 support, but my thinking based on our Python 2.6 experience is that we should wait until we feel the pressure rather than assume it is coming and going to hit us hard. I'm more excited to write Python 3-only code than I'm worried that we will be staring at a Python 3-only dependency that we will need to adopt.

mvdbeek commented 5 years ago

It's the cheetah templating that can make use of HDA methods or the wrapped variant thereof that I think will be difficult to deal with, not the dependencies per se.

nsoranzo commented 5 years ago

This will become an increasingly strong counter pressure against continuing to maintain 2.7 support, but my thinking based on our Python 2.6 experience is we should wait until we feel the pressure rather than assume it is coming and going to hit us hard.

I don't think 2.6 and 2.7 deprecations can be fairly compared, 2.7 being the last remaining release of the 2.x series.

nuwang commented 5 years ago

Would like to add a +1 to committing to as aggressive a timeline as possible for python 2.7's departure. Mainly because many of the packages we're involved in maintaining support python 2.7 only because Galaxy has not made a decisive move yet. For example, cloudbridge's tests alone would save 1 hour sans py27. I'm sure there would be similar ripple effects throughout the ecosystem, including on ansible playbooks etc. This would save us a lot of time indirectly, hence why personally, I'd like to see a more aggressive timeline.

jmchilton commented 5 years ago

Using my nearly infinite powers as release manager to push this off on to the 19.05 release manager. Sorry for the people who are disappointed by this.

snahil commented 5 years ago

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.

martenson commented 5 years ago

We discussed with @natefoo a bit which spawned this idea:

Could we capture the most common operations/constructs that are used in tools' Py2 template that are incompatible with Py3 and create a pre-compile step for the template that will patch them on the fly?

And then avoid the sandbox, which looks like a much bigger solution for something that we want to only have in place temporarily. New instances of Galaxy do not even want to have the sandbox or the on-the-fly patching. They just want new patched tools.
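
For illustration only, here is a minimal sketch of what such a pre-compile step might look like. Everything in it is an assumption: `patch_template` and the rule list are hypothetical names, not Galaxy code, and the two rules shown are just well-known Python 2-only idioms rather than a survey of what tool templates actually contain.

```python
import re

# Hypothetical rewrite rules: (compiled pattern, replacement).
# Illustrative only; a real list would have to come from testing actual tools.
PY2_TO_PY3_RULES = [
    # $d.has_key(k)  ->  (k in $d)
    (re.compile(r"(\$?[\w.]+)\.has_key\(([^()]+)\)"), r"(\2 in \1)"),
    # .iteritems() / .itervalues() / .iterkeys()  ->  .items() / .values() / .keys()
    (re.compile(r"\.iter(items|values|keys)\(\)"), r".\1()"),
]


def patch_template(source):
    """Apply every rewrite rule to the template text.

    Returns a tuple (patched_source, number_of_substitutions) so that a
    caller could log which templates were touched.
    """
    total = 0
    for pattern, replacement in PY2_TO_PY3_RULES:
        source, count = pattern.subn(replacement, source)
        total += count
    return source, total


if __name__ == "__main__":
    template = "#if $opts.has_key('x')\n#for $k, $v in $opts.iteritems()\n$k\n#end for\n#end if\n"
    patched, n = patch_template(template)
    print(patched)
    print("substitutions:", n)
```

A purely textual rewrite like this cannot tell template code from literal command-line text, so it could only ever cover simple, unambiguous constructs; anything depending on runtime behaviour (integer division, bytes versus text) would be out of its reach.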

natefoo says:

Martin and I just had a big discussion about the Cheetah sandboxing and I'm unsure whether it's actually less work than 1. developing a way to fix old versions of tools in the TS and 2. making all versions of supported IUC and devteam tools 2+3 compatible. The sandbox seems to me like a massive amount of work, considering we'd need to serialize stuff like DatasetFilenameWrapper and anything that its methods might need, etc. And $__app__ we can't possibly support in the sandbox beyond maybe a few commonly used attributes.

Alternatively we could make accessing objects in the sandbox interface with Galaxy via some kind of private API, maybe?
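
To make the serialization concern concrete, a rough sketch under stated assumptions: `DatasetSnapshot` is a hypothetical class, not part of Galaxy, and the three fields (`file_name`, `ext`, `name`) and the idea that `str()` on the object yields the file path are assumptions chosen for illustration. The sketch only shows the round trip of plain data across a process boundary; it says nothing about methods that need live Galaxy state, which is exactly the hard part raised above.

```python
import json


class DatasetSnapshot(object):
    """Hypothetical plain-data stand-in for a dataset wrapper in a sandbox."""

    # Assumed attribute names, for illustration only.
    FIELDS = ("file_name", "ext", "name")

    def __init__(self, file_name, ext, name):
        self.file_name = file_name
        self.ext = ext
        self.name = name

    def __str__(self):
        # Assumption: templates that write $input expect the file path.
        return self.file_name

    def to_json(self):
        """Serialize only the whitelisted fields."""
        return json.dumps(dict((f, getattr(self, f)) for f in self.FIELDS), sort_keys=True)

    @classmethod
    def from_json(cls, payload):
        """Rebuild the snapshot on the other side of the process boundary."""
        data = json.loads(payload)
        return cls(**dict((f, data[f]) for f in cls.FIELDS))


if __name__ == "__main__":
    snap = DatasetSnapshot("/data/000/dataset_1.dat", "fastq", "reads")
    restored = DatasetSnapshot.from_json(snap.to_json())
    print(str(restored), restored.ext)
```

Anything not captured in the whitelisted fields would simply be unavailable in the sandbox, which is why attributes reachable through `$__app__` look so difficult to support this way.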

natefoo commented 5 years ago

we want to only have in place temporarily

I think it's worth noting that our solution actually needs to be semi-permanent. New tools should be written with syntax that is valid for both 2 and 3 (and maybe a tool profile version that is 3-only). But the problem here is old versions of tools. At present we have no proper way to fix them.

I don't expect every tool that I installed back in 2010 to still be working in 2025. And at some point we have to limit our efforts to keep ancient versions of tools working. But especially since the conda era, most tools can potentially work for a long time, we don't want to intentionally do something to break them if we can help it.

martenson commented 5 years ago

One of the first things to do is to assess what portion of tools is relying on py2. We could start an instance of py3 Galaxy, install all of IUC & devteam tools (maybe even all revisions?) and run tool tests.

natefoo commented 5 years ago

I could just update test.galaxyproject.org to use Python 3. It doesn't have the exact same list of tools at the exact same versions as Main, but it'd be a start.

mvdbeek commented 5 years ago

That is what we discussed last month on the committer call, right? I do think the sandbox is implementable though, especially since @jmchilton put in some work to make some parts of the model serializable. We'd probably need to be able to serialize tool parameters as well, but again that seems doable. In any case, ephemeris has seen some nice updates to remote tool testing, so if we get test.galaxyproject.org running on Python 3 we can surely check the status of the current tools. We'll have to do this anyway.

mvdbeek commented 5 years ago

and maybe a tool profile version that is 3-only

💯

martenson commented 5 years ago

That is what we discussed last month on the committer call, right?

I think so. We put it on hold for test.* because we did not want to add tasks for Nate, but now that he has volunteered... 😎

natefoo commented 5 years ago

Test is now running on Python 3.6, but handlers won't start (follow along here: galaxyproject/usegalaxy-playbook#206).

dhalperi commented 5 years ago

Wondering if there's any update here?

nsoranzo commented 5 years ago

@dhalperi What would you like to know exactly? Galaxy now works fine under Python3, but some old tools may break.

dhalperi commented 5 years ago

Any plan to deprecate/remove support for Python 2?

martenson commented 5 years ago

@dhalperi have you read this issue?

dhalperi commented 5 years ago

Yes I read this issue. It was last updated 4 months ago with "we plan to do some stuff and mark it deprecated in the (jan, may, sept) releases". Additionally, it's clear from the prior comments that much of the communication happens out of band and may not be reflected here.

Since it's been four months, asking if there's an update on the plans.

martenson commented 5 years ago

@dhalperi I think the consensus is currently to deprecate soon, probably this release (19.09).

My take on the future of py2 support is that we will probably support it until the very end, i.e. until the py2 version of one of our dependencies becomes a security risk or unstable. This is still under debate, though.

In the meantime a lot of effort has been put into supporting py3; if I were to set up a production Galaxy instance now, I would use py3.

dhalperi commented 5 years ago

Thank you very much :)

martenson commented 5 years ago

closed via https://github.com/galaxyproject/galaxy/pull/8763/commits/7efbd043ede6e5fbb72bb445aa5d8a2cc6841fb5