MDAnalysis / GridDataFormats

GridDataFormats is a pure Python library to handle data on a regular grid using commonly used file formats in molecular simulations.
https://mdanalysis.org/GridDataFormats
GNU Lesser General Public License v3.0
29 stars 18 forks source link

use mrcfile library for MRC/CCP4 file parsing #100

Closed orbeckst closed 2 years ago

orbeckst commented 2 years ago
pep8speaks commented 2 years ago

Hello @orbeckst! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 169:80: E501 line too long (82 > 79 characters) Line 206:80: E501 line too long (83 > 79 characters)

Line 118:80: E501 line too long (81 > 79 characters)

Line 96:63: E262 inline comment should start with '# ' Line 98:80: E501 line too long (80 > 79 characters) Line 106:80: E501 line too long (81 > 79 characters) Line 107:80: E501 line too long (84 > 79 characters) Line 109:80: E501 line too long (89 > 79 characters) Line 110:80: E501 line too long (87 > 79 characters) Line 111:80: E501 line too long (84 > 79 characters) Line 112:80: E501 line too long (89 > 79 characters) Line 120:80: E501 line too long (98 > 79 characters) Line 123:80: E501 line too long (102 > 79 characters) Line 125:80: E501 line too long (83 > 79 characters) Line 149:1: W391 blank line at end of file

Line 13:80: E501 line too long (84 > 79 characters)

Line 16:80: E501 line too long (90 > 79 characters) Line 33:80: E501 line too long (90 > 79 characters)

Line 13:1: E302 expected 2 blank lines, found 1 Line 17:1: E302 expected 2 blank lines, found 1 Line 24:1: E302 expected 2 blank lines, found 1 Line 27:1: E302 expected 2 blank lines, found 1 Line 30:1: E302 expected 2 blank lines, found 1 Line 39:1: E303 too many blank lines (3) Line 43:1: E302 expected 2 blank lines, found 1 Line 56:25: E128 continuation line under-indented for visual indent Line 58:80: E501 line too long (89 > 79 characters) Line 75:80: E501 line too long (92 > 79 characters) Line 76:80: E501 line too long (92 > 79 characters) Line 77:80: E501 line too long (92 > 79 characters) Line 78:80: E501 line too long (92 > 79 characters) Line 79:80: E501 line too long (92 > 79 characters) Line 80:80: E501 line too long (92 > 79 characters) Line 81:80: E501 line too long (92 > 79 characters) Line 82:80: E501 line too long (92 > 79 characters) Line 83:80: E501 line too long (92 > 79 characters) Line 84:80: E501 line too long (93 > 79 characters) Line 93:1: E302 expected 2 blank lines, found 1 Line 99:1: E302 expected 2 blank lines, found 1 Line 101:30: E203 whitespace before ',' Line 101:41: E202 whitespace before ']' Line 102:19: E203 whitespace before ',' Line 102:41: E202 whitespace before ']' Line 103:19: E203 whitespace before ',' Line 103:30: E203 whitespace before ',' Line 105:1: E302 expected 2 blank lines, found 1 Line 110:75: E202 whitespace before ']' Line 112:1: E302 expected 2 blank lines, found 1 Line 118:1: E302 expected 2 blank lines, found 1

Comment last updated at 2022-02-20 21:01:37 UTC
codecov[bot] commented 2 years ago

Codecov Report

Merging #100 (3129570) into master (01dfa7d) will increase coverage by 0.37%. The diff coverage is 97.43%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #100      +/-   ##
==========================================
+ Coverage   87.69%   88.07%   +0.37%     
==========================================
  Files           5        6       +1     
  Lines         821      855      +34     
  Branches      141      146       +5     
==========================================
+ Hits          720      753      +33     
  Misses         60       60              
- Partials       41       42       +1     
Impacted Files Coverage Δ
gridData/__init__.py 100.00% <ø> (ø)
gridData/mrc.py 96.96% <96.96%> (ø)
gridData/CCP4.py 86.66% <100.00%> (+0.14%) :arrow_up:
gridData/core.py 94.40% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 01dfa7d...3129570. Read the comment docs.

orbeckst commented 2 years ago

For reviewers: I intend to ignore the PEP8 formatting warnings... if you feel strongly about it then I suggest we run everything through black from here on out. I don't have the time to spare to manually prettify code that started out pretty ugly anyway.

orbeckst commented 2 years ago

@MDAnalysis/coredevs if you want to have a quick look then that would be much appreciated. If you think it's good enough then you can approve it — I am mainly looking for a second pair of eyes to avoid really stupid mistakes.

If nobody has time I'll merge it by the end of the week latest in order to get a release out.

IAlibay commented 2 years ago

If nobody has time I'll merge it by the end of the week latest in order to get a release out.

I've penciled time in to review on Thursday if that's ok? (assuming no one else gets to review first)

tluchko commented 2 years ago

@orbeckst, thank you for implementing this. Doing so has been on my to-do list for a long time but never managed to get to it.

The code looks good to me but I have two questions about functionality.

  1. Why are non-orthorhombic unit cells not supported? It seems they are supported for the DX format, as I get no complaints when I load such a file.
  2. Can we add support for writing MRC files? I played around with this and I think it will be straightforward. However, I don't know how to recover the unit cell angles for non-orthorhombic from a Grid object.

I can add writing MRC files for orthorhombic cells, if you like. Once I understand the issues for non-orthorhombic cells, I should be able to add read/write support for these as well.

orbeckst commented 2 years ago

Thanks for review and comments. This week I definitely won't have time to look at anything, sorry.

orbeckst commented 2 years ago

@tluchko can we try to get the simple reading code in and then add the enhancements, namely writing? (EDIT: I'd ❤️ a code contribution from you for the writing)

For the non-orthorhombic cells I can't quite remember why I restricted it to orthorhombic. There might be something in the Grid class that's not working with triclinic cells. I'd try to get this PR done with the limitation and then raise a new issue for triclinic cells.

orbeckst commented 2 years ago

The code with sorted mapc/r/s transposed axes does not fix #76 . In ChimeraX

1cbs.ccp4 :

This looks as if mrcfile does indeed always reorder the data into z, x, y order.

orbeckst commented 2 years ago

@tluchko and @IAlibay I addressed most of your comments and after a long time thinking about what mrcfile does and what the mapc/r/s parameter mean I managed to get the conversion of MRC data to Grid to work (and fixed the super-irritating #76 along the way). This should be good for a final review.

If you can find a nicer way to write the double-transposition

            axes_order = np.hstack([h.mapc, h.mapr, h.maps])
            axes_c_order = np.argsort(axes_order)
            self.array = np.transpose(np.transpose(mrc.data), axes=axes_c_order)

then let me know. However, just reversing axes_c_order does not work as far as I can tell... there must be a way to permute the axes_c_order directly but I am too tired to figure it out. (The first transpose is always (210) (rearrange zyx to xyz), the second one depends on the mapc/r/s values and is in the test (102) (x is really y, y is really x, and z is z). The end result needs to be (201) ((zyx) -> (yxz)).

orbeckst commented 2 years ago

Honestly, I don't know why reversing axes_order_c did not work for me because (102) reversed is (201) and that's also what you get when applying (210) to (102). But

self.array = np.transpose(mrc.data, axes=axes_c_order[::-1])

gives the wrong shape.

orbeckst commented 2 years ago

I rewrote the history so that this can be merged as is.

orbeckst commented 2 years ago

Thank you @tluchko and @IAlibay !

orbeckst commented 2 years ago

Following up on https://github.com/MDAnalysis/GridDataFormats/pull/100#issuecomment-1009487765, @tluchko please open issues for

Thanks!

tluchko commented 2 years ago

Thank you @orbeckst!