Closed Infernio closed 2 years ago
zip et al. return iterators in py3 - this can easily be handled by automated tools if we change them to the izip etc
Not needed: automated tools will change those to list(zip(...)), we just need to clear out the unneeded list() calls afterwards
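A minimal sketch of when the list() wrapper that 2to3 adds is actually needed, assuming Python 3 semantics. The plugin names and sizes are made up for illustration:

```python
names = ['Oblivion.esm', 'Knights.esp']
sizes = [1024, 2048]

# zip() returns a one-shot iterator in py3. A single loop over it is fine,
# so a list() wrapper added by 2to3 would be unneeded here.
total = 0
for _name, size in zip(names, sizes):
    total += size

# list() is needed when the result is indexed, measured or reused.
pairs = list(zip(names, sizes))
first_pair = pairs[0]
pair_count = len(pairs)

# A bare iterator is exhausted after one pass.
lazy = zip(names, sizes)
first_pass = list(lazy)
second_pass = list(lazy)  # empty, nothing is left to iterate
```

So the cleanup after 2to3 comes down to checking, per call site, whether the result is only iterated once.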
Unicode strings vs byte strings
Not an issue (happily python3 supports the u prefix) except for ur'strings', which are a syntax error - haven't found a satisfactory solution yet (see https://stackoverflow.com/a/43359624/281545)
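For illustration, two possible spellings that avoid the ur'' prefix. This is only a sketch with a made-up pattern, not necessarily what the linked answer recommends:

```python
import re

# py2-only spelling, a SyntaxError in py3:  ur'\d+\.esp$'

# Option 1: keep the u prefix and double the backslashes.
# Valid in both py2 and py3.
escaped = u'\\d+\\.esp$'

# Option 2: a plain raw string. Already unicode in py3, but a bytestring
# in py2 unless unicode_literals is imported from __future__.
raw = r'\d+\.esp$'

same_pattern = (escaped == raw)
matches = re.search(escaped, u'Patch 12.esp') is not None
```

Option 1 is noisier to read, option 2 changes the type of the literal on py2, so neither is free.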
For a nice guide see: https://medium.com/@boxed/moving-a-large-and-old-codebase-to-python3-33a5a13f8c99
Thanks @Infernio for starting this - was exactly on my mind (and thanks for reading through my technical rants posts in commits and issues as you do)
On Tue, Sep 10, 2019, 00:26 Infernio notifications@github.com wrote:
(Opening this as a central discussion hub, so we can mention it in commits that work toward py3 support and because certain people on Discord keep asking about it 😛)
Good old Python 2 is dead in about 3 months as of the time of writing https://pythonclock.org/.
What follows is a list of roadblocks:
- 307.beta4. This needs to be out ASAP (i.e. after wxPython 3 migration is stable and Enderal support is merged), because the vast majority of people download WB from the Nexus, where they're getting a version that's a year out of date and therefore riddled with long-fixed bugs, but telling people to use a nightly version almost always leads to a 'huh? I thought the nightlies were the unstable ones?' reaction
- wxPython. Neither 2.8 https://sourceforge.net/projects/wxpython/files/wxPython/2.8.12.1/ nor 3.0 https://sourceforge.net/projects/wxpython/files/wxPython/3.0.2.0/ have Python 3 versions available. However, the work put into the wx3 upgrade was not wasted; it will make upgrading to wxPython 4.0 https://pypi.org/project/wxPython/, which does have Python 3 support, much easier.
- loot-api-python. Python 3 versions are available starting from version 4.0 https://github.com/loot/loot-api-python/releases, but see #431 https://github.com/wrye-bash/wrye-bash/issues/431 for the obvious problem there.
- Well, the entire codebase. A decent chunk can probably be handled by automated tools, but we need to consider the output carefully - plus automated tools won't be nearly enough to catch everything. In particular:
- Unicode strings vs byte strings
- Old-style classes need to be switched to new-style classes ahead of time to catch any issues that arise - see the commit message in fa74a1b https://github.com/wrye-bash/wrye-bash/commit/fa74a1b7a61d9b3150f0d2b171145e171f2d27e5, for example
- zip et al. return iterators in py3 - this can easily be handled by automated tools if we change them to the izip etc. equivalents beforehand. I started a commit like that, but it seems to have gotten lost in the wind somewhere...
- True division updating - for each usage, we need to consider if we want floats or integers
- Relative imports - are we using the explicit ones everywhere already? Should mostly be caught by automated tools, but problems can arise here
- cmp and __cmp__ are gone; usage is rare - five in ScriptParser and belt and one each in cint, balt and mods_metadata
- Don't even worry about trivial stuff like print statements or repr backticks - easily handled by automated tools. Although print statements aren't the greatest thing anyways (see #352 https://github.com/wrye-bash/wrye-bash/issues/352)...
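As a sketch of what replacing a cmp-style sort could look like: functools.cmp_to_key wraps an old comparison function unchanged, while a key function is the more idiomatic rewrite. The comparison function and the plugin names below are hypothetical, not taken from the codebase:

```python
import functools

def legacy_cmp(a, b):
    """Hypothetical py2-style comparison: shorter names first, then A-Z."""
    # (x > y) - (x < y) is the usual stand-in for the removed cmp() builtin
    return (len(a) > len(b)) - (len(a) < len(b)) or (a > b) - (a < b)

names = ['Oblivion.esm', 'Bashed Patch.esp', 'Knights.esp']

# Drop-in conversion: keeps the old comparison logic as-is
via_cmp = sorted(names, key=functools.cmp_to_key(legacy_cmp))

# Idiomatic rewrite: express the same ordering as a sort key
via_key = sorted(names, key=lambda n: (len(n), n))
```

Both spellings also work on Python 2.7, so this kind of change could land before the actual switch.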
Things that are not roadblocks:
- comtypes - dropped during wxPython 3.0.2.0 upgrade
- gitpython - dropped in build script rework, see pygit2 below
- py2exe - not entirely sure; 0.9+ only supports Python 3 - I think this is why we host our own wheel at version 0.6.9(?)
- pygit2 - readily supports Python 3
- pywin32 - readily supports Python 3
- scandir - built into Python 3, we'll just drop this once we upgrade
Some resources:
- Supporting Python 3: An in-depth guide http://python3porting.com/
- and by 'in-depth' this does mean in-depth
- Porting Python 2 Code to Python 3 https://docs.python.org/3/howto/pyporting.html - a general overview from python.org
- 2to3 https://docs.python.org/2/library/2to3.html - automated code porting tool
Added to 308, may be addressed sooner - there's lots of factors at play here. Supersedes #194 https://github.com/wrye-bash/wrye-bash/issues/194 and #404 https://github.com/wrye-bash/wrye-bash/issues/404
One thing I have installed but am not using at the moment is autopep8 but have tested it to see how it works.
I would think you are all aware of the official porting docs from python.org, which suggest ways to test Python 3 compatibility and to move toward a more stable 2.7/3 environment to help with the transition.
so i was a bit bored and just as a wip wanted to see what all is broken in python 3: see branch lojack-py3.
Not too bad, but it's going to take some picking through the PBash code with a fine tooth comb to find all the instances where we want bytes instead of str.
Not to mention there's still a lot of cosmetic things that aren't quite right with the updated wxPython.
Interesting! Had a very very quick look - no need to drop the u' prefix from unicode strings, it's still valid in py3 - and it will probably help us single out the bytes. Some of the wx fixes are already present in the 15-wx3 branch, better merge that in first. But all in all, good - we have an initial set of issues to look at. Have a look at https://medium.com/@boxed/moving-a-large-and-old-codebase-to-python3-33a5a13f8c99, seems interesting.
Me and Infernio are working on the patcher - a good opportunity to look at strings vs unicode there.
Oh and if you are very bored could you have a look at porting CBash to x64? That will really help - we don't want to carry the python32 limitation in py3. Infernio started on it in https://github.com/wrye-bash/CBash/issues/23
Ah yeah, the u'' -> '' was all from 2to3, couldn't see an option to disable that fixer in it. Was mostly just a fun opportunity to see how much work would be required to make it work on Python 3.
Might take a look at CBash, but honestly the most interesting thing I think for me will be getting bolt.Path setup for the swapover to pathlib.
See #368. A laudable goal but must come after the dust from moving to py3 settles - bear in mind that I had refactored Path using pathlib source tricks and that gave us a speed up and memory savings. So Path/GPath are good enough for now plus they are everywhere, including pickled settings so this would involve backward compatibility hacks that should be addressed all at once under #178 - yet another post py3
A thing neither me nor Infernio have found time to tackle yet is progress fixups on the 15-wx3-upgrade branch - namely, Bash loses focus after some progress is displayed, including on boot. That's the last (?) wrinkle to iron out for the wx3 merge, which is a blocker for py3
I edited and stuck two commits from lojack-py3 that looked backwards-compatible onto nightly: 2ac1c1b5253bff8eb3172a3d13c218e01531d62f and 47a978daebb8ab8fd3a0f338558802c86bd7c948.
@lojack5 well whatever is the most fun is probably best. :smile:
The project could use any help you have time to provide. Good to see you.
Looks like @Utumno isn't the only one who got this traceback when trying to use the Python version with chardet 3.0+. From the AFKMods thread:
Traceback (most recent call last):
  File "Wrye Bash Launcher.pyw", line 89, in <module>
    bash.main(opts)
  File "bash\bash.py", line 204, in main
    wx_locale = localize.setup_locale(opts.language, _wx)
  File "bash\localize.py", line 89, in setup_locale
    target_name, target_locale.GetCanonicalName()))
  File "bash\bolt.py", line 1666, in deprint
    msg = u'%s %4d %s: ' % (GPath(file_).tail.s, line, function)
  File "bash\bolt.py", line 322, in GPath
    else: norm = os.path.normpath(decode(name))
  File "bash\bolt.py", line 126, in decode
    encoding,confidence = getbestencoding(byte_str)
  File "bash\bolt.py", line 107, in getbestencoding
    result = chardet.detect(bitstream)
AttributeError: 'module' object has no attribute 'detect'
py -2 -m pip install -U -r requirements.txt didn't work for the user either:
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting https://bintray.com/loot/snapshots/download_file?file_path=loot_api_python-4.0.2-7-g580bf87_master-python2.7-win32.zip (from -r requirements.txt (line 8))
Downloading https://bintray.com/loot/snapshots/download_file?file_path=loot_api_python-4.0.2-7-g580bf87_master-python2.7-win32.zip (5.3MB)
|UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU| 5.3MB 1.9MB/s
Requirement already up-to-date: chardet~=3.0 in c:\dev\python27\lib\site-packages (from -r requirements.txt (line 2)) (3.0.4)
Requirement already up-to-date: pywin32<225,>=220 in c:\dev\python27\lib\site-packages (from -r requirements.txt (line 3)) (224)
Collecting wxPython==3.0.2.0
Using cached https://github.com/wrye-bash/dev-tools/raw/master/wheels/wxPython-3.0.2.0-cp27-cp27m-win32.whl
Requirement already up-to-date: scandir~=1.9 in c:\dev\python27\lib\site-packages (from -r requirements.txt (line 6)) (1.10.0)
Requirement already up-to-date: pygit2~=0.28 in c:\dev\python27\lib\site-packages (from -r requirements.txt (line 9)) (0.28.2)
Requirement already up-to-date: pyfiglet~=0.8 in c:\dev\python27\lib\site-packages (from -r requirements.txt (line 10)) (0.8.post1)
Collecting py2exe==0.6.9
Using cached https://github.com/wrye-bash/dev-tools/raw/master/wheels/py2exe-0.6.9-cp27-cp27m-win32.whl
Requirement already satisfied, skipping upgrade: cffi in c:\dev\python27\lib\site-packages (from pygit2~=0.28->-r requirements.txt (line 9)) (1.13.2)
Requirement already satisfied, skipping upgrade: six in c:\dev\python27\lib\site-packages (from pygit2~=0.28->-r requirements.txt (line 9)) (1.13.0)
Requirement already satisfied, skipping upgrade: pycparser in c:\dev\python27\lib\site-packages (from cffi->pygit2~=0.28->-r requirements.txt (line 9)) (2.19)
Installing collected packages: wxPython, py2exe, loot-api
Found existing installation: wxPython 3.0.2.0
Uninstalling wxPython-3.0.2.0:
Successfully uninstalled wxPython-3.0.2.0
Found existing installation: py2exe 0.6.9
Uninstalling py2exe-0.6.9:
Successfully uninstalled py2exe-0.6.9
Found existing installation: loot-api 4.0.2
Uninstalling loot-api-4.0.2:
Successfully uninstalled loot-api-4.0.2
Running setup.py install for loot-api ... done
Successfully installed loot-api-4.0.2 py2exe-0.6.9 wxPython-3.0.2.0
Yep and it's indeed because of leftover pyc files in the Mopy/bash/chardet folder - wouldn't harm to delete pyc files in there at boot, although it's a bit dirty
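A rough sketch of what such a boot-time cleanup could look like. The function name is hypothetical and nothing like it is confirmed to exist in Bash. It simply deletes every .pyc below the given folder:

```python
import os

def remove_stale_pyc(package_dir):
    """Delete all .pyc files below package_dir, return the removed paths.

    Hypothetical helper, assuming the leftover bytecode lives in a folder
    like Mopy/bash/chardet and that removing it is always safe.
    """
    removed = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            if name.endswith('.pyc'):
                path = os.path.join(root, name)
                os.remove(path)
                removed.append(path)
    return removed
```

Calling it with the path of the old bundled chardet folder before the first import of chardet would be the idea. As noted above it is a bit dirty, since it silently touches the user's install.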
The issue raised by lojack on str vs unicode is a concern - did some regexing and we have ~22k strings (not unicode) - this means every fourth line in the code - we should keep an eye on them and convert some to unicode or eliminate them, but still it's a lot...
I got ~22000 with [^u]'\b[^'.]+\b', but I'm not that great with regexes so there's probably a better one.
Biggest targets, ordered by number of non-unicode strings:
- game/*/records.py: All the PBash record attributes (not the signatures, those must remain as bytestrings, and not the format strings, those can't be unicode in py2) could be changed to unicode, since they are only used for setattr/getattr - I've just kept them all as regular strings by convention right now
- game/*/constants.py: The CTDA section strings look to be purely for informational purposes, so those could become unicode. Not sure about gmstEids, it gets compared to pickled stuff, so I'm basically already out of my depth there :P
- cint.py: Same as records.py above? Not sure about this, it's cint after all :P
- {importers,multitweak_*,parsers}.py - as mentioned above, keep an eye out for signatures which will probably have to remain as bytestrings.
Yeah, and unfortunately there are some implicit assumptions on how those bytes vs unicode strings are used in calling code all over the place as well. My branch was mostly just a POC to see if we could get Bash to start and mostly function on python 3 with phoenix. I'm fairly confident that it can't actually make a bashed patch due to the bytes vs. unicode issue.
For future reference, here is a tiny script that helps finding all the uses of division:
import ast
import fnmatch
import os

for root, _, filenames in os.walk('Mopy'):
    for filename in fnmatch.filter(filenames, '*.py'):
        path = os.path.join(root, filename)
        linenos = []
        with open(path, "rt") as fopen:
            content = fopen.read()
        tree = ast.parse(content, filename=filename)
        last_lineno = None
        for node in ast.walk(tree):
            # Not all nodes in the AST have line numbers, remember latest one
            if hasattr(node, "lineno"):
                last_lineno = node.lineno
            # If this is a division expression, then show the latest line number
            if isinstance(node, ast.Div):
                linenos.append(last_lineno)
        if linenos:
            print
            print path
            print ", ".join((str(l) for l in linenos))
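For each hit the script reports, the decision is between floor division and true division. A minimal sketch of the two cases, with hypothetical helper names:

```python
def pages_needed(item_count, per_page):
    """Integer result wanted: ceiling division without any floats."""
    return -(-item_count // per_page)

def percent_done(done, total):
    """Float result wanted: true division."""
    return 100.0 * done / total

# On py3 (and on py2 with `from __future__ import division`):
floor_result = 7 // 2   # integer floor division
true_result = 7 / 2     # true division, always a float
```

On py2 without the future import, 7 / 2 silently floors to 3, which is why every usage has to be looked at individually.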
@wrye-bash/bashers Please mark anything you come across that needs special handling during py3 upgrade, and can't be handled in py2 due to backwards compatibility issues, with a PY3: comment:
# PY3: Drop unicode=True, removed in py3
trans.install(unicode=True)
Basically, similar to how we use HACK right now.
@lojack5 interesting work on Path - not sure about using the / operator though - it's not searchable and as the issue with division shows, we might better leave / for numbers. joinpath is perfectly searchable on the other hand, I would go for that in all cases
I like / personally, consistent with other languages that have path libraries, but either way is fine. I'll probably keep working for now though, see if this approach is even worthwhile (morphing bolt.Path to look like pathlib.Path slowly). If it is, I can split off the / stuff specifically.
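To make the two spellings concrete, a toy sketch of a path class that offers both joinpath and /. MiniPath is a made-up stand-in, not bolt.Path, and it ignores everything else bolt.Path does:

```python
import os

class MiniPath(object):
    """Toy stand-in for a path class, only here to compare the two APIs."""

    def __init__(self, s):
        self.s = os.path.normpath(s)

    def joinpath(self, *parts):
        # Accept both plain strings and other MiniPath instances
        return MiniPath(os.path.join(
            self.s, *[getattr(p, 's', p) for p in parts]))

    def __truediv__(self, other):
        # The / operator just delegates to the searchable method
        return self.joinpath(other)

    # py2 looks up __div__ unless true division is enabled via __future__
    __div__ = __truediv__

by_operator = MiniPath('Mopy') / 'bash' / 'bolt.py'
by_method = MiniPath('Mopy').joinpath('bash', 'bolt.py')
```

Since the operator only delegates, supporting both costs little. The searchability concern is about which one calling code ends up using.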
Pushed a lojack-py3-py3-dev1, went on till I got tired of resolving conflicts - some of it is py2 compatible and should land on nightly, like 9a5133f2811f090478e14dfaf8e8ca2316e92cbd, fcde88c5a8e24e7c9189a3e02f8eefc6c51a6e54, 0c6e1da3ef2fe724c59f112ac5e603f9af344e59, as is some work by @GandaG on his 460 branch - the sooner we rebase merge those (especially the unicode string prefixing) the better, as we will have quite a few conflicts otherwise
I'm pretty sure Ganda's 460 branch is entirely backwards compatible, that was one of his goals (correct me if I'm wrong @GandaG :P)
Edit: pushed all (I think it's all of them) backwards-compatible changes from the lojack-py3 branches to nightly in c6d0f98e78d6a2fb747afbc8484b61eec1ccb472. The ArtProvider stuff is obsoleted by wx-begone.
Other than other folks' Python builds of Wrye Bash, the biggest issue is going to be wxPython 'Phoenix' as far as the GUI goes. The jump from 3.0 to Phoenix won't be as hard as the one from 2.8 to 3.0.
The wxPython Project Phoenix development documentation is here: This is built from the latest snapshot build. Probably better than what you're using, depending on what version you are looking for. https://wxpython.org/Phoenix/docs/html/index.html
The regular stable wxPython Project Phoenix documentation(The PyPi Version) is here: https://docs.wxpython.org/
NOTE: Mind the differences in the URLs. Otherwise see the site https://wxpython.org/ or ask Robin.
@GandaG I rebased your ganda-460 branch (see the creatively named ganda-460-rebased), fixing all conflicts, updating it to prefix recently introduced strings too (I hope I caught them all :P) and adding a few more WIP commits on top of it.
Here's a script to find unprefixed strings in a file (or whole directory): https://gist.github.com/Infernio/813b24ddb1dd616c3533c1220cf4a4ab
Edit: I would suggest not reading the code, it's extremely hacky :P Just blindly trust it, it'll be fine.
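For readers who do not want to open the gist, here is a small sketch of one way to find unprefixed literals with the tokenize module. This is an assumption about a workable approach, not a description of what the gist does, and it is written for py3:

```python
import io
import tokenize

def find_unprefixed_strings(source):
    """Return (line, literal) for string tokens lacking a b or u prefix."""
    hits = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type != tokenize.STRING:
            continue
        # Everything before the opening quote is the prefix (r, u, b, ...)
        body = tok.string.lstrip('rRbBuUfF')
        prefix = tok.string[:len(tok.string) - len(body)].lower()
        if 'b' not in prefix and 'u' not in prefix:
            hits.append((tok.start[0], tok.string))
    return hits
```

Docstrings are string tokens too, so they would show up in the count unless filtered out separately.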
Alright, that script seems to be far more accurate than the regexes. Before ganda-460-rebased:
==> Total unprefixed strings found: 39454
After ganda-460-rebased:
==> Total unprefixed strings found: 36022
Yay, still 36000 to go... still, shaving off 3000+ strings is pretty impressive 😄
It's also not quite as bad as it seems, the vast majority are in game/*/records and game/*/constants, and almost all of those could be handled through search and replace with regexes (e.g. to handle all record signatures, '([A-Z]{4})' -> b'$1' would do).
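The same search and replace, sketched with Python's re module instead of an editor. The lookbehind is an addition that is not part of the regex quoted above. It skips literals that already carry a prefix:

```python
import re

# Four uppercase letters in single quotes, not preceded by a prefix letter
SIG_PATTERN = re.compile(r"(?<![A-Za-z0-9_])'([A-Z]{4})'")

def prefix_signatures(line):
    """Turn 'WEAP' into b'WEAP', leave u'WEAP' and b'WEAP' alone."""
    return SIG_PATTERN.sub(r"b'\1'", line)

converted = prefix_signatures("rec_sig = 'WEAP'")
untouched = prefix_signatures("label = u'WEAP'")
missed = prefix_signatures("rec_sig = 'NPC_'")
```

Note that [A-Z]{4} does not match signatures containing an underscore or digit such as NPC_, and it will also hit any other four-letter uppercase string, so the result would still need a manual pass.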
If you want a full dump, here they are:
Good job! Of course the issue is not really prefixing the strings, it is deciding which ones need to be bytes - even the record signatures could stay unicode if we decode them - not sure if performance will be affected, but if not (really) we could even go down that path
Would probably be more work to find all the places where the signatures are written out than just regexing for them and making them b''-strings.
But yes, there are a lot of places where we do stuff like:
s = read_some_string()
if s == 'whatever':
    do_something()
And all those string constants can be unicode, because UTF-8 is backwards-compatible with ASCII. Same with e.g. arguments to open, getattr, etc.
My biggest problem right now is bass.settings. Those are the only strings I've left completely untouched, because of the pickling involved. I think we might need to add some backwards-compat code that turns the stored settings into unicode and drop that in 308?
I think we might need to add some backwards-compat code that turns the stored settings into unicode and drop that in 308?
and some code that prints out pickled classes, so we finally only pickle builtins. In py2 we can compare strings with unicode, so just starting to pickle unicode probably won't break anything
even the record signatures could stay unicode and decode them
Thinking about this again, that might actually make sense - I think it'll be easier for the py3 transition to not do that so the record signatures stick out, but it's something to think about once we're on py3. Working with unicode internally and encoding/decoding at IO boundaries is generally a better idea than passing bytes around, after all.
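A tiny sketch of what "decode at the IO boundary" could mean for signatures, assuming they are always four ASCII bytes. The helper names are hypothetical and unrelated to the actual brec API:

```python
import io

def read_signature(stream):
    """Read 4 raw bytes and hand unicode to the rest of the program."""
    return stream.read(4).decode('ascii')

def write_signature(stream, sig):
    """Encode the unicode signature again right before writing."""
    stream.write(sig.encode('ascii'))

in_buf = io.BytesIO(b'NPC_')
sig = read_signature(in_buf)

out_buf = io.BytesIO()
write_signature(out_buf, sig)
```

Everything between the two helpers would then only ever see unicode. Whether the extra decode/encode per record is measurable is the open performance question mentioned above.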
I finished off cint (would not recommend looking at that commit, it's a fairly unreadable and highly repetitive diff :P) and a few hundred more random strings in other files. Now down to:
==> Total unprefixed strings found: 26849
Note that ganda-460-rebased doesn't launch anymore - that's on purpose, I added a ton of asserts to brec to catch strings that should be unicode / should remain as bytes. But I haven't gotten around to changing any of them yet, so it will crash as soon as you select a game.
Rebased ganda-460-rebased on nightly - omg. Do not put any more work on it for now, we should think of a better way - probably decode the record headers? If the performance impact is minimal then it should work out of the box - other cases are kwargs (in wx they must be bytes, but that's probably fixed in wx4) and windows.py stuff (needs testing)
Yeah, that is going to be ugly :P I don't know the best way to go about it either... maybe we should just stick them on dev unchecked after de-wx is merged? And then deal with any resulting fallout as it arrives
I would probably do bunches of them like settings, dirs, __slots__ etc
Another thing to watch out for:
PS C:\Users\Infernio> py -2
Python 2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 21:01:17) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> struct.pack('I', 1.5)
'\x01\x00\x00\x00'
>>> exit()
PS C:\Users\Infernio> py -3
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> struct.pack('I', 1.5)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: required argument is not an integer
In general, py3 seems to be much more picky when it comes to struct types (seems faster in exchange though).
Edit: thankfully, integer->whatever seems to work fine in py3, so I don't have to devise an algorithm to scan our MelStruct format strings and derive defaults:
PS C:\Users\Infernio> py -3
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> struct.pack('f', 1)
b'\x00\x00\x80?'
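One possible way to deal with the float-into-integer case is to coerce explicitly before packing. A sketch with a hypothetical helper, using an explicit little-endian format so the output does not depend on the platform:

```python
import struct

def pack_uint32(value):
    """Pack value as an unsigned 32-bit int, truncating floats like py2 did."""
    # int() truncates toward zero, matching the py2 result shown above
    return struct.pack('<I', int(value))

from_float = pack_uint32(1.5)
from_int = struct.pack('<f', 1)  # int -> float field works without help
```

Whether silent truncation is actually the desired behavior for a given record field is a separate question. It only reproduces what py2 did.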
And another fun gotcha:
PS C:\Users\Infernio> py -2
Python 2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 21:01:17) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class foo:
... a = 10
...
>>> f = foo()
>>> getattr(f, u'a')
10
>>> getattr(f, b'a')
10
>>> exit()
PS C:\Users\Infernio> py -3
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class foo:
... a = 10
...
>>> f = foo()
>>> getattr(f, u'a')
10
>>> getattr(f, b'a')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: getattr(): attribute name must be string
Especially fun in parsers, a few of the prefixes were wrong and would have passed bytestrings to getattr...
Edit: this will probably affect a whole bunch of patchers actually - may have to rethink all those getattr(modFile, type).getActiveRecords() calls we do
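One way those calls could be kept working is to normalize the signature before it reaches getattr. A sketch with a hypothetical helper and a stand-in class, not the real ModFile:

```python
def attr_name(sig):
    """Return sig as text so it is a valid attribute name on py3."""
    if isinstance(sig, bytes):
        return sig.decode('ascii')
    return sig

class FakeModFile(object):
    """Stand-in for illustration only, with one record block attribute."""
    def __init__(self):
        self.NPC_ = ['record one', 'record two']

mod_file = FakeModFile()
block = getattr(mod_file, attr_name(b'NPC_'))
same_block = getattr(mod_file, attr_name(u'NPC_'))
```

This keeps signatures as bytes everywhere else, at the cost of a decode on every lookup.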
Pushed yet another rebase of 460-pre-py3 and will start leaking on nightly cause maintenance becomes harder and harder, plus it must hit nightly anyway. I omitted the brec commit as it conflicts heavily with the splitting - not all of it may be needed anyways (especially in newer branches)
Quick hack I came up with to emulate the getattr/setattr behavior of py3 in py2: https://gist.github.com/Infernio/2baa914ad6cf4bb4b3b2c170463a6615
Obviously doesn't work with __setattr__, __getattr__ and __getattribute__, so not sure how useful this would really be.
Some info on opening a py2 pickle in py3: https://stackoverflow.com/questions/28218466/unpickling-a-python-2-object-with-python-3
TL;DR is that it will try to decode all strings as ASCII, unless you tell it to leave them as bytestrings. This shouldn't affect the settings in any way (since they're all ASCII), but we'll have to check that we aren't storing anything non-ASCII as py2 strings in our pickles.
Edit: and the problem with just setting encoding='bytes' and calling it a day is of course this:
PS C:\Users\Infernio> py -2
Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:25:05) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> b'foo' == u'foo'
True
PS C:\Users\Infernio> py -3
Python 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:57:54) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> b'foo' == u'foo'
False
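A sketch of what loading with encoding='bytes' and converting afterwards could look like. The pickle bytes below are a hand-written protocol 2 pickle of the py2 dict {'foo': 1}, and to_text is a hypothetical helper that assumes every bytestring in the settings is ASCII text:

```python
import pickle

# py2 pickle (protocol 2) of {'foo': 1}, the key is a py2 str
PY2_PICKLE = b'\x80\x02}q\x00U\x03fooq\x01K\x01s.'

def to_text(obj, encoding='ascii'):
    """Recursively decode bytes inside dicts, lists and tuples."""
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return {to_text(k, encoding): to_text(v, encoding)
                for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(to_text(x, encoding) for x in obj)
    return obj

raw = pickle.loads(PY2_PICKLE, encoding='bytes')
settings = to_text(raw)
```

With raw, a lookup like raw['foo'] fails on py3 because the key is b'foo'. After the conversion it works again. Blindly decoding everything would be wrong for values that are meant to stay bytes, so a real version would need to be more selective.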
In my app I save my settings as pickle and ini so when I jump to a different OS or Py2/3 I can retain them. Usually when the error occurs it is pickle protocol, so you should prob check that and if fails resort to ini if possible or other way to workaround.
The Installers.dat (don't remember the exact filename) has pickled bolt.Paths in it. Probably easiest to just rebuild the installers data on a failure, since in most cases the filenames will decode just fine.
Those should already be unicode, IIRC we decode when constructing bolt.Paths.
See Path.__init__:
https://github.com/wrye-bash/wrye-bash/blob/604ebd31d2f20d26c06ab2d043f114f7e0abc26b/Mopy/bash/bolt.py#L477-L482
That calls Path.__setstate__, which decodes:
https://github.com/wrye-bash/wrye-bash/blob/604ebd31d2f20d26c06ab2d043f114f7e0abc26b/Mopy/bash/bolt.py#L488-L493
Pretty sure I've already noted this somewhere in a commit, but just for reference: getattr beats everything else in py3, including storing the bound __getattribute__ method in a variable: https://gist.github.com/Infernio/f3a2cb592afe805c5ff691f79f8e8e5d
Just an FYI from some stuff I've been working on that's led to issues: for now I'd suggest targeting Python 3.8. Python 3.9 brought some ABI changes so a lot of compiled libraries are still catching up. Might not be an issue for Wrye Bash, because wxPython at least has nightlies for 3.9, though no stable releases yet.
We definitely want to stay on wxPython 4.0.7.post2, the 4.1.x releases are unstable ones (wxWidgets and wxPython use a versioning system of x.y.z, where y being even indicates stable releases and y being odd indicates unstable releases). wxPython 4.2.x will be the next stable series of releases.
May be useful, found this randomly: https://pypi.org/project/future/ Fills in some of the gaps from python 2's future imports, aliases some things to what they will be when you update to python 3.
That is quite nice, might be helpful to cut down on 2to3 noise.
And another thing to watch out for:
PS C:\Users\Infernio> py -3
Python 3.9.0 (tags/v3.9.0:9cf6752, Oct 5 2020, 15:34:40) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 'Scanning: %s' % b'NPC_'
"Scanning: b'NPC_'"
>>> exit()
PS C:\Users\Infernio> py -2
Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:25:05) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 'Scanning: %s' % b'NPC_'
'Scanning: NPC_'
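A sketch of how such a message could be built so that it reads the same on both versions, by decoding the signature before formatting. The function name is made up:

```python
def scanning_message(sig):
    """Format a progress message from a bytes record signature."""
    # Decode first, otherwise py3 formats the repr of the bytes object
    return u'Scanning: %s' % sig.decode('ascii')

message = scanning_message(b'NPC_')
py3_pitfall = 'Scanning: %s' % b'NPC_'
```

Running py3 with the -b flag emits a BytesWarning whenever str() is called on a bytes object, which covers this kind of formatting and may help track such spots down.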
Quick question: how disruptive would it be to go ahead and prefix all the strings in brec and games? Dunno how much conflict resolution that would cause.
Good old Python 2 is dead in about 3 months as of the time of writing.
What follows is a list of roadblocks:
- chardet. We bundle version 1.0.1, from 2008. May just happen to work on py3, but I wouldn't hold my breath - done in 8e201c49bbc809da89b1bda1d269f4cb7619dfc0.
- wxPython. Neither 2.8 nor 3.0 have Python 3 versions available. However, the work put into the wx3 upgrade was not wasted; it will make upgrading to wxPython 4.0, which does have Python 3 support, much easier - see 050391ca22d7c8451390cbff6fd150ab0b9bcabd, done in 22de7ff9e804b2bcaa8a819922f4b5100b86d80f.
- loot-api-python. Python 3 versions are available starting from version 4.0, but see #431 for the obvious problem there - done in 934f3527e59ac8b0f923c2f6dc6908ad902a1c46.
- ur'' is gone - done in 660ecbb81f49d478bdc8e4a8905e328d1daf9dca and d43ad244170e2110a6daca7d5febed4020550247
- b'' or u'' - done in way too many commits to list
- str needs to be replaced with bytes or unicode depending on how it's used - done in 04835ecce0f82d59f02faba38099ebaa74633922
- basestring is problematic - it's gone in py3, but 2to3 will change it to str. Most usages should be replaced with unicode or bytes, but a couple really do need to handle both - done in 42b879faa847df37bd75d5cc8d4d1d38a7e7dfc4
- open uses, check if opening in bytes mode is needed - done in c120e18a43c2f5abb31d579c99f211e7aee6b322
- getattr calls that pass in bytes, most notably due to ModFile.__getattr__ - #312/#480, (hopefully) done in 0848872b18c6fb6554cc757a3372d7ed86e613c9
- zip/map/filter/range need to get replaced with either the itertools/xrange variant or an equivalent comprehension - plus add list() calls if we modify the iterables in the loop - worked on in 709e146b3d968af527bf1d055fa86e1ca4eaaa3d and e8c4d4388d1c3747c48473975c4293c2cfe3f314, done in a75dbc638a349e7dc893aabd7823cb25d2dfb2ea
- keys and iterkeys calls can go -> done in 567f1539ed2e6708554cc47ef4656febd13d92b3 and 09e1633a617aeec6ebb5139e6099dc848b825a25
- items and values: Change to iteritems/itervalues, then run 2to3's dict fixer. If it changes them to iter(...), change them to viewitems/viewvalues instead - done in d22469bc235ab67b14e14dd45e4e3ce268bd936d
- sys.maxint is gone - depending on the use case, could be replaced by sys.maxsize - done in 659e5b696be5083b9bef0d39356acc30ab46b5a4
- cmp and __cmp__ are gone; usage is rare - five in ScriptParser and belt and one each in cint, balt and mods_metadata - done in 5a98eb4c025651f4e9366db2a7d488ec2068f1fc
Things that are not roadblocks:
- comtypes - dropped during wxPython 3.0.2.0 upgrade
- gitpython - dropped in build script rework, see pygit2 below
- py2exe - readily supports Python 3 - we should however consider our options once we get there, see #491
- pyfiglet - readily supports Python 3
- pygit2 - readily supports Python 3
- pymupdf - readily supports Python 3
- pytest - readily supports Python 3
- python-lz4 - readily supports Python 3
- pywin32 - readily supports Python 3
- pyyaml - readily supports Python 3
- scandir - built into Python 3, we'll just drop this once we upgrade
- toml - readily supports Python 3
Useful tracking regexes (make sure to enable *.py mask):
ur prefix: ur('|")[^,)]
class [^:(]+:
(?<!self\))\.__init__
[A-Z]\w+\.\w+\(self(,|\))
^ *import (?!os|sys|re|errno|time|subprocess|io|threading|wx|wx\.adv|cPickle|argparse|atexit|codecs|ctypes|platform|shutil|traceback|codecs|chardet|lz4|win32api|win32com|yaml|tempfile|Tkinter|collections|copy|csv|copy|datetime|stat|string|struct|textwrap|scandir|inspect|pkgutil|math|gettext|locale|msgfmt|pygettext|operator|webbrowser|array|zlib|binascii|_winreg|win32gui|importlib|random|toml|pytest|glob|logging|zipfile|pygit2|py2exe)
open usages without false positives: (?<!webbrowser\.)(?<!io\.)(?<!def )\bopen\b
Some resources:
Added to 308, may be addressed sooner - there's lots of factors at play here. There is also no way that I caught every issue we'll encounter on the road to py3. Supersedes: #194 and #404 Follow-ups: #55