Closed hjoliver closed 5 years ago
Once we get on to this can we go drop Python-2 support, or will we need to support both for a while?
At the moment, I think most distros include both python2
and python3
commands these days. When we port cylc to Python 3, we can modify our #!
lines to run python3
, then we should no longer have to support both.
Most new distros...
new should no longer be that new by the time we tackle this issue.
[meeting]
I've just started trying to use Cylc and while I appreciate that not everywhere is ready to move to Python 3 completely, not supporting it at all is a bit of a pain. Would it be much work to make it version agnostic, using six where necessary?
It would also make the transition smoother for those still on 2...
@duncanwp - six looks useful. We're very open to contributions if you want to take a crack at this yourself! Otherwise we'll definitely get to it in due course, but it has not become a top priority yet for the reasons above.
Hi @duncanwp Can you explain why it is a pain, please? What platform are you using cylc on?
We do want to port cylc to Python 3 eventually, but it is not a priority at the moment, as our main user bases don't have proper access to Python 3 yet.
@matthewrmshin I'm looking to use it on a Centos 6 cluster (which is Python 2 based) but I run my own conda (Python 3) environment to install a number of packages which aren't pre-installed. I can work around it, I've just started getting used to the Python 3 niceness ;-)
@hjoliver I'll have a look at what's involved and maybe put together a pull-request.
@matthewrmshin Which actually reminds me of another question: Are you likely to create a conda package for Cylc, possibly through conda-forge?
Hi @duncanwp As a quick workaround, you may try:
Only use your conda environment in applications that require the special Python 3 environment. This will ensure that any other applications implemented in Python 2 (like cylc) do not break.
Alternatively, you can try modifying all #!/usr/bin/env python
lines in your local installation of Cylc to use #!/usr/bin/python
to ensure that it is not breakable by the version of Python in your current environment.
Are you likely to create a conda package for Cylc, possibly through conda-forge?
Yes, but not a priority at the moment.
Hi all, I'm more than happy to work on this a bit. Anyone who is having trouble getting Python 3 into a legacy environment should consider using the Anaconda distribution to get in on there -- that's what we're currently planning to do. I have found not all Linux environments are necessarily packaging PyGTK for Python 2 any more, so there is an increasing need to at least support Python 3.
The main stumbling block I see for packaging Cylc itself with conda is the PyGTK dependency. It seems to have a very strange status, in that it doesn't actually install properly through pip on Linux at all, despite GTK being a Linux library! It might be more tractable to package a "no gui" Cylc through conda initially, and then circle back on the PyGTK support.
@tleeuwenburg - that'd be good, bearing in mind the comments above on the need to support both Python-2 and -3 for a while from the same evolving Cylc code base - might need to use six
as suggested above
I am moving the milestone to later for this issue - as it will become one of our main priorities in the not-so-distant future. However, it is worth noting that we are still unlikely to look at this until mid-2019 earliest when we are done with #1873.
Had some spare time today, and was a bit tired of Java, so started syncing my local cylc repository. :tada: Lots of interesting things!
All the projects I've migrated from 2 to 3 were smaller, and did no required compatibility with Python 2. So never used 2to3
, six
, etc.
Here's an initial version that works at least for the my.suite
example. Tests fails, but also fail in my environment building from master. Perhaps somebody would have some spare time to review & maybe try it with a better test scenario?
https://github.com/cylc/cylc/compare/master...kinow:p2to3-take1
Here's the contents of a notepad I used for notes:
# quickly fix some files
2to3 -w -n .
# commit #1
# load anaconda python 2 with cylc modules installed
source ~/Development/python/sa2.sh
# test Python 2
find . -name "*.py" -exec python2 -m py_compile {} \;
# issue with print("", file=sys.stderr) (file=...)
# @see http://python-future.org/compatible_idioms.html
# manually patch the files on sublime
grep -r -H -e "print(.*file=" . | awk '{print $1}' | sort | uniq | xargs -I{} subl {}
# commit #2
# test Python 2
find . -name "*.py" -exec python2 -m py_compile {} \;
# issue with class PointBase(object, metaclass=ABCMeta): (metaclass=...)
# s/object, metaclass=ABCMeta):/six.with_metaclass(ABCMeta, object)):/g
# @see http://portingguide.readthedocs.io/en/latest/classes.html
subl ./lib/cylc/cycling/__init__.py
# commit #3
# test Python 2
find . -name "*.py" -exec python2 -m py_compile {} \;
# issue with urllib ... cherrypy's _cpcompat.py has a good try/catch...
# @see http://python3porting.com/stdlib.html
grep -r -H "urllib" . | grep "\.py" | awk '{print $1}' | grep -v "Binary" | grep -v "cherrypy" | grep -v "INSTALL" | sort | uniq | xargs -I{} subl {}
# FIXME bug in _compat.py?
Was
try:
from urllib.parse import quote_from_bytes as url_quote
except ImportError:
from urllib.parse import quote as url_quote
Is now
try:
from urllib.parse import quote_from_bytes as url_quote
except ImportError:
from urllib import quote as url_quote
# issue with izip
grep -r -H "izip" | grep "\.py" | grep -v Binary
# issue with imap
grep -r -H "imap" | grep "\.py" | grep -v Binary
# Undo changes done by 2to3 in libraries
git checkout origin lib/jinja2
git checkout origin lib/cherrypy
# implement missing __nonzero__ abstract method?
# -- OK! --
# load anaconda python 3 with cylc modules installed
source ~/Development/python/sa3.sh
# test Python 3
find . -name "*.py" -exec python3 -m py_compile {} \;
# -- OK! --
# unit tests et static analysis?
pip install pycodestyle pyopenssl
export PATH="$PWD/bin:$PATH"
export CYLC_TEST_DIFF_CMD='diff -I Xlib -u' # I have an X server, but harmless?
export CYLC_TEST_RUN_PLATFORM=false
export CYLC_TEST_DEBUG=true
export PATH=$PATH:./bin
cylc test-battery --state=save -j 5 || (echo -e "\n\nRerunning Failed Tests...\n\n"; cylc test-battery --state=save,failed -j 5)
A few tests started failing, and I realized I had spent too much time on this, so decided to comment here :-)
Hey @kinow - that seems like good news, maybe the port will not be so difficult! FYI we just discussed this yesterday (had not updated the issue yet) and decided we probably will not support Python 2 and 3 at the same time, just make a clean break when the time comes (those stuck with Python 2 - if any - will just have to use older cylc versions for a while). Note the blocker for actually doing this is the old Python 2 PyGTK GUI - first big priority is web GUI, then Python 3.
Hey @hjoliver
I looked at the existing issues, and many of them are related to the current GUI. I think moving to the web GUI would be a great milestone for cylc :tada:
And if we won't need to support both, then it might be even easier. For me the hardest part is making sure cylc is still running fine after the change.
@cylc/core - technically we could port the suite server program and command line clients to Python 3 any time, leaving just the old GUI to run under Python 2. I was thinking it made more sense to go all at once (with the new GUI in place) but if it turned out to be pretty easy (see @kinow's post above) there would be a "reputational" advantage to having the bulk of the code modernized.
leaving just the old GUI to run under Python 2
It is possible but may be a bit tricky. Take #164 for instance where the GUI manipulates the suite configuration to hold visualisation settings.
As an aside we would probably have to package and distribute the old GUI separately. Here is a list of Python modules which the GUI imports from outside of the GUI and standard libraries:
cylc.cfgspec.gcylc cylc.cfgspec.glbl_cfg cylc.cfgspec.gscan cylc.cylc_xdot cylc.dump cylc.flags cylc.graphing cylc.hostuserutil cylc.mkdir_p cylc.network.httpclient cylc.network.port_scan cylc.run_get_stdout cylc.suite_logging cylc.suite_srv_files_mgr cylc.suite_status cylc.task_id cylc.task_job_logs cylc.task_state cylc.task_state_prop cylc.version cylc.wallclock isodatetime.data isodatetime.parsers xdot
I don't think #164 matters for this does it? (that's just the complaint that server stores client graph collapse choice - unfortunately).
Still, I had imagined the list of common modules was quite a bit shorter!! Fair enough, not such a good idea :grimacing:
In general I think, Oliver's comment aside, it would be better to stick with the plan to port after we have developed the new GUI (at least to some functional state). Doing so in parallel would be a distraction to the GUI work, which we should focus on to get the best results, & we would have to ensure we have an agreed porting strategy & an accessible master list of Cylc Python modules in the process of being ported by someone, so that effort is not duplicated.
I generally agree, just thought the porting job might actually be quite easy - but some of the common modules listed by @oliver-sanders are a nasty complication for sure.
I also had the impression it would be easier to investigate first making it compatible with Python 2 and 3. Probably with the six
module, and perhaps a compatibility layer (e.g. Pandas compat package).
But I hadn't thought about the GUI until @oliver-sanders ' comment. Normally I use Anaconda environments for Python, except for cylc gui
, which I am still running on my Ubuntu Python, in order to be able to import the gtk
module... will have a look over the next days to understand more about it.
I agree that doing both in parallel would be a distraction. However, i am not sure if implementing a new web layer with Flask/Tornado/etc in Python 2 would work without any hiccups. The documentation for both Flask and Tornado state that Python 2 is still supported, but that it's preferred to use the latest version of Python 3.
Quickly looking at the issues in Flask, it's possible to find a few minor issues regarding issues with Python 2, e.g.
Tornado's current release version is 5.x, but they have already announced 6.x will drop Python 2. Their current Travis CI configuration doesn't even seem to include Python 2 any longer.
I think we could run the risk of having to use an older version, or workaround limitations in the frameworks due to Python 2.
If we measure the test coverage, and include any necessary tests to reach a reasonable coverage, plus some manual tests, then maybe we could (assuming it's possible to run the GUI in Python 3, of course)
print "hello"
to print("hello")
)six
From a previous comment, sounds like there was already a plan with order of activities, and that in that plan Python 3 would come after the web layer. If that's a hard requirement, then all good for me. But otherwise, I would be happy to increase my dosage of caffeine and have a crack at it :man_dancing:
Cheers Bruno
There is no hard requirement for the order of activities. The main issue has always been the lack of developers and that we do want to draw a close to some of the issues that we have promised to solve. And I'm still trying to see all current PRs merged before doing anything further.
I agree with the analysis that writing new web layer code in Python 2 sounds crazy - and this activity will have to start soon!
I'm just thinking that we may end up having to maintain a development branch for a prolonged period of time - with code that can be upgraded ported to Python 3 and web UI development built on top of this branch.
In which case, perhaps feel free to take a crack at this, but expect to have to maintain your branch for a period of time (like one of my PRs) before it gets the full attention by the rest of the team.
Presumably the new reverse proxy can be done in Python 3 from the outset, because it only interacts with the suites servers via the network API. But for the suite server side changes, yes we'll need to figure this out ...
In which case, perhaps feel free to take a crack at this, but expect to have to maintain your branch for a period of time (like one of my PRs) before it gets the full attention by the rest of the team.
That works for me @matthewrmshin . Happy to rebase & fix the code. If things go well, and it's not an epic task to make the code compatible with both versions, then I expect to have a few patterns of code that can be easily migrated/adjusted.
When you do come to do the port, python-modernize and python-future are two tools that can automate making a lot of the easy changes for you (like print 1
-> print(1)
). They're similar to 2to3, but they aim to keep the code working on Python 2 as well. I just scanned through the thread and haven't seen them mentioned, so I thought a link might be useful.
Of those two, modernize is more conservative in its changes, and adds calls to six
where needed. Python-future's futurize
tool aims to make a more idiomatic Python 3 codebase, and inserts its own compatibility layer to make things work on Python 2.
@takluyver thanks for the tips! I have a branch where I did the following (and seems to be going in a good direction):
For each package, first I am running locally and printing to the console, without changing the files. Then I analyzed the output to see if everything made sense... I think there was only one or two parts that I had to open another terminal and play with the code to see why something was done the way it was (so many list(...)
calls).
My only concern right now is that there are a few parts of Cylc not covered by tests, and even though python-future seems to be doing a good job, there's still the risk that we may have issues later I think.
Perhaps modernize would be simpler. I jumped directly from six
and 2to3
to futurize
. Will have a look at modernize as well!
Thanks!
Great, sounds like you're already on top of that. Modernize vs futurize is largely a matter of taste, I think: you'll still have to do similar checking with either.
The only suggestion I might have is to let futurize write its changes, and then examine them using git. Lots of GUI diff tools (like Meld) and IDEs can highlight changes from git for you. I'd find that easier than examining the patches in a terminal, but YMMV. :smiley:
The only suggestion I might have is to let futurize write its changes, and then examine them using git.
Here's my current workflow:
> cd $CYLC_HOME
> futurize lib/markupsafe/
> # here I compare the output, and if I find anything weird or that I don't understand, then I fire up python to play with some code. If everything looks good...
> futurize -n -w lib/markupsafe
> git diff # or gitk to have a more visual output
Then finish with git commit -am "futurize $module_name"
, and push to my branch.
Gotcha. I was suggesting skipping the first call to futurize
, and doing all the checking after it writes the files. Up to you, though!
@takluyver - thanks from me also, for the excellent tips. Note (not sure this is mentioned above) we've made the decision to go full-Python 3 on a long-lived development branch (largely because the old GUI can't be ported, but also some other reasons) for the new cylc web architecture. So we don't need to bother with six
etc to support both 2 and 3. Hopefully that will simplify the process a bit.
`The next question is what will be the new minimum Python version for Cylc, here is a fairly random smattering of some of the more significant features introduced since 3.0 which might be of interest to us:
3.1
collections.OrderedDict
, collections.Counter
str.format
changesunittest
improvements, incl ability to skip tests3.2
argparse
(PEP 389)concurrent.futures
(PEP 3148).pyc
files names extended with the python interpreter (PEP 3147)3.3
virtualenv
(PEP 405)os
and io
exceptions (PEP 3151)raise Exception() from None
(PEP 409)3.4
PYTHONPATH
etc (bpo 16499)ascyncio
(PEP 3156)enum
(PEP 435)3.5
async
and await
syntax (PEP 492)collections.OrderedDict
is now implemented in C3.6
collections.OrderedDict
implementation (CPython)asyncio
++secrets
module for generating cryptographically strong pseudo-random numbers3.7
asyncio
++A lot of the features of the most recent Python versions are of great interest to us, what should be our new minimum requirement. If we go down the conda packaging route requiring a more contemporary Python might not be such a big issue?
Yes, I think we can choose whatever we like if we go down the Conda packaging route - so the answer hopefully is going to be 3.7. (The only thing to look out for is whether there are any libraries we are going to use that may have any issues with cutting edge Python.)
so the answer hopefully is going to be 3.7
+1
For reference, various projects I work on currently have 3.3, 3.4 or 3.5 as the minimum supported version. Ubuntu 18.04 (the latest long term support release) includes Python 3.6.
Conda is popular, so preparing conda packages is a good idea. But expect pushback if you make that the only practical way to install your software: for a variety of reasons, people like to have other installation options.
Conda and pip are reasonably interoperable ... starting with something pip installable is probably the method which will be usable by the largest number of people. Adding conda packaging support at a later stage would be reasonable.
On Thu, 4 Oct 2018 at 16:46, Thomas Kluyver notifications@github.com wrote:
For reference, various projects I work on currently have 3.3, 3.4 or 3.5 as the minimum supported version. Ubuntu 18.04 (the latest long term support release) includes Python 3.6.
Conda is popular, so preparing conda packages is a good idea. But expect pushback if you make that the only practical way to install your software: for a variety of reasons, people like to have other installation options.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cylc/cylc/issues/1874#issuecomment-426903915, or mute the thread https://github.com/notifications/unsubscribe-auth/AADuGi8N89jZqoNWU5T67Xr9Nr9DKi_fks5uha7FgaJpZM4IrDOw .
Tennessee Leeuwenburg http://myownhat.blogspot.com/ "Don't believe everything you think"
Close. The list in the description should now have follow-on issues or else.
~This will eventually be required, for future-proofing cylc.~ Probably quite straightforward,
but depends on #1872 because the current Pyro-3 communications layer is not Python 3 compatible.[that's done]Edited by @matthewrmshin:
There is now a clear road map for this. See https://github.com/cylc/admin/blob/master/cylc-8-roadmap.md for example.
Issues that will hopefully be fixed as part of our Python 3 porting effort:
os.makedirs
pipes.quote
argparse
to improve CLI.cylc.scheduler
main loop to use asyncio.