Closed tclick closed 7 years ago
Thanks for reporting this. We currently consider python3 support experimental. There will be a bigger effort in summer to get MDAnalysis python 3 compatible.
@tclick Which version of Python did you use? Using Python 3.6 and MDA 0.16.2 the problem is less severe / clear with the following code:
import MDAnalysis
print('MDAnalysis.__version__:', MDAnalysis.__version__)
from MDAnalysisTests.datafiles import TPR
u = MDAnalysis.Universe(TPR)
print('binary u:', u)
sel = u.select_atoms('all')
print('binary sel:', sel)
print('binary sel.n_atoms:', sel.n_atoms)
producing this output:
MDAnalysis.__version__: 0.16.2-dev0
binary u: <Universe with 47681 atoms>
binary sel: <AtomGroup [<Atom 1: b'N' of type b'opls_287' of resname b'MET', resid 0 and segid seg_0_b'AKeco'>, <Atom 2: b'H1' of type b'opls_290' of resname b'MET', resid 0 and segid seg_0_b'AKeco'>, <Atom 3: b'H2' of type b'opls_290' of resname b'MET', resid 0 and segid seg_0_b'AKeco'>, ..., <Atom 47679: b'NA' of type b'opls_407' of resname b'NA+', resid 11299 and segid seg_2_b'NA+'>, <Atom 47680: b'NA' of type b'opls_407' of resname b'NA+', resid 11300 and segid seg_2_b'NA+'>, <Atom 47681: b'NA' of type b'opls_407' of resname b'NA+', resid 11301 and segid seg_2_b'NA+'>]>
binary sel.n_atoms: 47681
@tylerjereddy, you used the 'all' selection. Try something like I did 'name CA'. In fact, I even tried 'backbone', and received the following exception:
TypeError: '>' not supported between instances of 'str' and 'bytes'
This is tested using Python 3.6.1 and MDA 0.16.1.
@tclick Thanks, I can reproduce this now, and this extra information will likely be diagnostically relevant / useful for writing unit tests, etc.
You're welcome, and I'm glad I can be of service.
I can't reproduce the issue with name CA
or backbone
selections in MDAnalysis 0.17.0-dev
. Has anything changed that might impact the internals for this? I assume not intentionally since I don't see a reference to this issue.
@jbarnoud did #1435 and #1436 recently, might have accidentally fixed this
Ok, we should probably wait for confirmation on that. If true, the other question for @jbarnoud might be if we need additional unit tests here of if the previous fixes were sufficient?
I think there's probably a lot of things wrt to py3 (eg this string handling) that currently "accidentally" work, and we should shore them up with additional tests.
There should be a test that the parsers populate the topology with strings and not bytes. I do not know if we should have a test to make sure that we can do selections with each parsers, this will be a lot of tests for something that should be caught at a lower level.
I think, as pointed out by @richardjgowers in #1444, that pytest will make that quite straightforward to implement. And if we use fixtures that won't necessarily become very expensive.
Yeah, it would be cool to have fixture which loops over all known Universes/Readers/Topologies. Then we could write a test like:
def test_str_types(all_universes):
assert isinstance(all_universes.atoms[0].name, str)
Where all_universes
is a magic fixture which handles all possible cases
Expected behaviour
When using select_atoms, one should be able to input a selection string and return an AtomGroup regardless of the universe input files.
Actual behaviour
If a binary topology file (e.g., Gromacs TPR file) is used, all string information in Python 3 is considered a byte string instead of a unicode string. Therefore a simple selection yields an empty AtomGroup. It works if the topology file is a text file (e.g., CHARMM PSF).
Code to reproduce the behaviour
Currently version of MDAnalysis:
0.16.0