SBC-Utrecht / pytom-match-pick

GPU-accelerated template matching for cryo-ET
https://sbc-utrecht.github.io/pytom-match-pick/
GNU General Public License v2.0

Overflow error with tomogram #195

Closed frozenfas closed 2 months ago

frozenfas commented 3 months ago

Hello, I am trying to run pytom_match_template.py using the following command:

/opt/miniconda3/envs/pytom_tm/bin/pytom_match_template.py --template External/job700/templates/emd_16139_ts-L1T3.mrc --mask External/job700/templates/mask_ts-L1T3.mrc --tomogram Tomograms/job014/tomograms/rec_ts-L1T3.mrc --destination External/job700/results --particle-diameter 280.0 --tilt-angles External/job700/datasets/ts-L1T3.rawtlt --low-pass 30.0 --defocus 5e+03 --amplitude-contrast 0.1 --spherical-aberration 2.7 --voltage 300.0 --gpu-ids 0 1 2 3 --tomogram-ctf-model phase-flip --volume-split 2 2 1

and get this overflow error:

Traceback (most recent call last):
  File "/opt/miniconda3/envs/pytom_tm/bin/pytom_match_template.py", line 8, in <module>
    sys.exit(match_template())
             ^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/entry_points.py", line 853, in match_template
    job = TMJob(
          ^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/tmjob.py", line 300, in __init__
    meta_data_tomo = read_mrc_meta_data(self.tomogram)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/io.py", line 264, in read_mrc_meta_data
    with _wrap_mrcfile_readers(mrcfile.mmap, file_name) as mrc:
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/io.py", line 225, in _wrap_mrcfile_readers
    mrc = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/load_functions.py", line 268, in mmap
    return MrcMemmap(name, mode=mode, permissive=permissive)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 115, in __init__
    self._read(header_only)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 131, in _read
    super(MrcFile, self)._read(header_only)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcinterpreter.py", line 173, in _read
    self._read_data()
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcmemmap.py", line 112, in _read_data
    self._open_memmap(dtype, shape)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcmemmap.py", line 121, in _open_memmap
    self._data = np.memmap(self._iostream,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/numpy/_core/memmap.py", line 277, in __new__
    bytes -= start
OverflowError: Python integer 3713717360 out of bounds for int32

The tomogram in question was made in Relion 5 (binned to 10 A/pix). Pytom runs fine on the same tomogram generated in Relion 5 with binning to 15 A/pix, so I don't think the problem is that it was made by Relion 5.
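A rough arithmetic check of why the 10 A/pix tomogram fails while the 15 A/pix one does not. This is only a sketch under two assumptions: that 3713717360 in the error message is (approximately) the file size in bytes, and that both tomograms cover the same volume with the same data type, so the byte count scales with the cube of the binning ratio.

```python
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1  # 2147483647

# Value taken from the OverflowError message in the traceback.
reported_bytes = 3713717360

# Assumption: going from 10 to 15 A/pix shrinks each axis by a factor 1.5,
# so the number of voxels (and bytes) shrinks by 1.5**3.
linear_factor = 15.0 / 10.0
estimated_bytes_15A = reported_bytes / linear_factor**3

print("10 A/pix exceeds int32:", reported_bytes > INT32_MAX)
print("15 A/pix estimate (bytes):", round(estimated_bytes_15A))
print("15 A/pix fits in int32:", estimated_bytes_15A < INT32_MAX)
```

Under these assumptions the 15 A/pix file comes out at roughly 1.1e9 bytes, below the int32 limit, which would be consistent with only the larger tomogram triggering the error.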

Also, if I try to open the tomogram in the python interpreter (from the python conda env directory) using mrcfile.mmap, it gives the same error:

/opt/miniconda3/envs/pytom_tm/bin/python3
Python 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mrcfile
>>> f = 'Tomograms/job014/tomograms/rec_ts-L1T3.mrc'
>>> mrc = mrcfile.mmap(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/load_functions.py", line 268, in mmap
    return MrcMemmap(name, mode=mode, permissive=permissive)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 115, in __init__
    self._read(header_only)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 131, in _read
    super(MrcFile, self)._read(header_only)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcinterpreter.py", line 173, in _read
    self._read_data()
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcmemmap.py", line 112, in _read_data
    self._open_memmap(dtype, shape)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcmemmap.py", line 121, in _open_memmap
    self._data = np.memmap(self._iostream,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/numpy/_core/memmap.py", line 277, in __new__
    bytes -= start
OverflowError: Python integer 3713717360 out of bounds for int32

But if I use mrcfile in a different conda env, it opens without errors:

/miniconda3/envs/team_tomo/bin/python
Python 3.12.3 | packaged by Anaconda, Inc. | (main, Apr 19 2024, 16:50:38) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mrcfile
>>> f = 'Tomograms/job014/tomograms/rec_ts-L1T3.mrc'
>>> mrc = mrcfile.mmap(f)
>>> mrc.voxel_size.x
array(10., dtype=float32)

I only get the error when trying to open in pyTom conda env. Do you have any idea what the problem could be, or can you suggest something I can try to fix it?

Sorry if I am doing something dumb :)

McHaillet commented 3 months ago

Hi @frozenfas! Thanks for reaching out!

File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcmemmap.py", line 121, in _open_memmap
    self._data = np.memmap(self._iostream,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/numpy/_core/memmap.py", line 277, in __new__
    bytes -= start
OverflowError: Python integer 3713717360 out of bounds for int32

Seeing that it errors in np.memmap, I suspect it might be related to NumPy 2.0. Could you run the following two commands in both environments and let me know the results:

python -c "import numpy; print(numpy.__version__)"
python -c "import mrcfile; print(mrcfile.__version__)"

I just want to know if the version change might be the cause, because that will make debugging easier!
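The same check can be done from a single script. The following is a minimal sketch, not part of pytom-match-pick: the helper names `installed_version` and `major_version` are made up for illustration, and it only relies on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Extract the leading major version number, e.g. '2.0.0' -> 2."""
    return int(version_string.split(".")[0])


# Report the two packages discussed in this thread.
for name in ("numpy", "mrcfile"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")

# Flag environments that run NumPy 2.x, where the overflow was observed.
numpy_version = installed_version("numpy")
if numpy_version is not None and major_version(numpy_version) >= 2:
    print("NumPy 2.x detected; integer promotion rules differ from 1.x.")
```

Running it once in each conda environment gives the same information as the two one-liners above.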

Best, Marten

frozenfas commented 3 months ago

Thanks for the quick response:

in pytom env:

(pytom_tm) $ python -c "import numpy; print(numpy.__version__)"
2.0.0
(pytom_tm) $ python -c "import mrcfile; print(mrcfile.__version__)"
1.5.0

in the independent env:

(team_tomo) $ python -c "import numpy; print(numpy.__version__)"
1.26.4
(team_tomo) $ python -c "import mrcfile; print(mrcfile.__version__)"
1.5.0

As you suspected, the numpy version is different.

McHaillet commented 3 months ago

That is very helpful, thanks for checking!

There seem to be some unit test errors in mrcfile with numpy 2.0: https://github.com/ccpem/mrcfile/issues/60, although I could not find this specific error in that list.

For you, the easiest solution right now is to install pytom-match-pick without numpy 2.0, by modifying the install command like this:

python -m pip install pytom-match-pick[plotting] 'numpy<2.0'

I am not sure whether running this in your current environment will be enough to fix the installation. You may need to recreate your conda environment.

frozenfas commented 3 months ago

Thanks so much. I rebuilt the entire conda environment, and that workaround fixed the situation. It's running now; if I run into another issue I will update.

McHaillet commented 3 months ago

To help solve this issue, I added a comment describing the error to the mrcfile issue thread mentioned above. Perhaps they know a solution.

McHaillet commented 3 months ago

I didn't hear anything from mrcfile, and I don't have time to fix this myself right now, so we could consider forcing numpy<2.0 in the dependencies.

sroet commented 2 months ago

@frozenfas I might have some time to look into this issue over summer. However, we haven't found a way to reproduce your error (yet).

Would it be possible for you to share the broken Tomograms/job014/tomograms/rec_ts-L1T3.mrc ? (I understand if you're not comfortable with that). Feel free to send an email to the email-address mentioned here to arrange for the data transfer or if you want to discuss more

sroet commented 2 months ago

@frozenfas, another thing you might try: mrcfile just released 1.5.1, which should be compatible with numpy 2.

Do you mind rerunning the install to see if that release also fixes your issue? (They mentioned not changing that part of the code, but they also have no failing tests.)

McHaillet commented 2 months ago

Someone came across (I believe) the same issue in mrcfile and made a fix, see: https://github.com/ccpem/mrcfile/pull/61.

@frozenfas Could you try updating to mrcfile 1.5.2 and see if that fixes it? Doing a clean install would automatically update to this version together with numpy 2

frozenfas commented 2 months ago

Hi. Sorry I have not responded for so long. I sent the problematic tomo this morning to the email you referenced earlier. mrcfile 1.5.2 with numpy2 did not solve the issue. I am getting this error:

Process Process-2:
Traceback (most recent call last):
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/parallel.py", line 51, in gpu_runner
    result_queue.put_nowait(job.start_job(gpu_id, return_volumes=False))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/tmjob.py", line 742, in start_job
    read_mrc(self.tomogram)[
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/io.py", line 345, in read_mrc
    with _wrap_mrcfile_readers(mrcfile.open, file_name) as mrc:
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/io.py", line 225, in _wrap_mrcfile_readers
    mrc = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/load_functions.py", line 143, in open
    return NewMrc(name, mode=mode, permissive=permissive,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 115, in __init__
    self._read(header_only)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 131, in _read
    super(MrcFile, self)._read(header_only)
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcinterpreter.py", line 173, in _read
    self._read_data()
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 137, in _read_data
    remaining_file_size = file_size - header_size
OverflowError: Python integer 3713717360 out of bounds for int32
Traceback (most recent call last):
  File "/opt/miniconda3/envs/pytom_tm/bin/pytom_match_template.py", line 8, in <module>
    sys.exit(match_template())
             ^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/entry_points.py", line 883, in match_template
    score_volume, angle_volume = run_job_parallel(
                                 ^^^^^^^^^^^^^^^^^
  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/pytom_tm/parallel.py", line 157, in run_job_parallel
    raise RuntimeError(
RuntimeError: One or more of the processes stopped unexpectedly.

This part:

  File "/opt/miniconda3/envs/pytom_tm/lib/python3.12/site-packages/mrcfile/mrcfile.py", line 137, in _read_data
    remaining_file_size = file_size - header_size
                          ~~~~~~~~~~^~~~~~~~~~~~~
OverflowError: Python integer 3713717360 out of bounds for int32

seems similar to the overflow referenced in the mrcfile issue you linked to.

sroet commented 2 months ago

@frozenfas Thanks for sending me the files. I can reproduce the error and found a fix for it. (It is in the mrcfile package and is very similar to https://github.com/ccpem/mrcfile/pull/61.)

Will make a PR to mrcfile and see if they want the fix as well
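The kind of failure seen in the tracebacks can be sketched as follows. This is an illustration only and not the actual mrcfile patch: it assumes the header-derived value is a NumPy int32 scalar (here the 1024-byte MRC main header) and that the file size is a plain Python integer. Under NumPy 2 promotion rules, mixing the two makes NumPy try to fit the Python integer into int32, whereas NumPy 1.x would silently widen the result. The helper name `remaining_bytes` is hypothetical.

```python
import numpy as np

# Value from the error message; assumed here to be the file size in bytes.
file_size = 3713717360

# Assumption: a header-derived size stored as a NumPy int32 scalar.
header_size = np.int32(1024)


def remaining_bytes(file_size, header_size):
    """Subtract using plain Python ints, which never overflow."""
    return int(file_size) - int(header_size)


# Mixed arithmetic: behaviour depends on the installed NumPy major version.
try:
    mixed = file_size - header_size
    print("mixed arithmetic succeeded:", mixed, type(mixed).__name__)
except OverflowError as exc:
    print("mixed arithmetic failed:", exc)

# Casting both operands to int works regardless of the NumPy version.
print("cast-based result:", remaining_bytes(file_size, header_size))
```

Converting NumPy scalars to Python integers before doing byte-offset arithmetic sidesteps the fixed-width limit, at the cost of leaving NumPy's type system for that one calculation.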

sroet commented 2 months ago

This should be fixed once https://github.com/ccpem/mrcfile/pull/62 gets merged and mrcfile does a new release.

sroet commented 2 months ago

This should be fixed with the new release of mrcfile 1.5.3. @frozenfas, do you mind trying to run it again?

frozenfas commented 2 months ago

Yes, it is currently running, so I believe it is fine now. If I run into more problems I will update you. To confirm I reinstalled pytom and have mrcfile 1.5.3 and numpy 2.0.1.

sroet commented 2 months ago

Perfect, I will close this issue for now, feel free to reopen if you run into any further issues with this