czimaginginstitute / MotionCor3

Anisotropic correction of beam induced sample motion for cryo-electron microscopy and tomography
BSD 3-Clause "New" or "Revised" License

Seg fault when using batch processing #8

Open MTclement1 opened 11 months ago

MTclement1 commented 11 months ago

Hi, MotionCor3 works perfectly on a single file (although, like others, I needed the -no-pie option to compile it), but in batch mode the program segfaults after loading the images.

The command for my tests is (motionCor is my alias for the binary):

motionCor -InMrc ./data/ -OutMrc ./MotionCor/ -Gpu 0 -Patch 5 5 -Iter 10 -Serial 1

The output shows that the files are found:

added: ./data/test_fram_004.mrc
added: ./data/test_fram_006.mrc
added: ./data/test_fram_005.mrc
added: ./data/test_fram_009.mrc
added: ./data/test_fram_000.mrc
added: ./data/test_fram_010.mrc
added: ./data/test_fram_003.mrc
added: ./data/test_fram_001.mrc
added: ./data/test_fram_008.mrc
added: ./data/test_fram_007.mrc
added: ./data/test_fram_002.mrc

But after seemingly loading 5 files it segfaults. Here is the last part of the output:

MRC file size mode: 4096 4096 20 1
Rendered size mode: 4096 4096 20 1
MRC file size mode: 4096 4096 20 1
Rendered size mode: 4096 4096 20 1
GPU 0 Allocation time: GPU ( 0.06 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 0.25 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 0.31 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 1.25 GB) 0.01 s, CPU ( 0.00 GB) 0.00 s
Create buffers: total memory allocation 0.17 GB
Create buffers: 0.04 seconds

MRC file size mode: 4096 4096 20 1
Rendered size mode: 4096 4096 20 1
Segmentation fault (core dumped)

Then I reduced the number of files to 4, and the program just finishes without writing any output:

MRC file size mode: 4096 4096 20 1
Rendered size mode: 4096 4096 20 1
MRC file size mode: 4096 4096 20 1
Rendered size mode: 4096 4096 20 1
GPU 0 Allocation time: GPU ( 0.06 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 0.25 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 0.31 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 0.01 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 1.25 GB) 0.02 s, CPU ( 0.00 GB) 0.00 s
Create buffers: total memory allocation 0.17 GB
Create buffers: 0.04 seconds

Total time: 0.088412 sec

I use an RTX 3080, which SerialEM reports as having 9987 MB of VRAM free, which sounds right.

Does anyone have an idea of what I can do? Thanks.
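
As a rough sanity check on whether VRAM could be the limit, the GPU allocation lines in the log can be summed per GPU. The sketch below is only illustrative: the function name `gpu_totals` is hypothetical, the regular expression is based solely on the log lines quoted in this post, and it assumes the logged allocations are simply additive, which nothing in this thread confirms.

```python
import re
from collections import defaultdict

# Pattern derived only from the log lines quoted in this thread, e.g.
# "GPU 0 Allocation time: GPU ( 0.06 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s"
ALLOC_RE = re.compile(
    r"GPU\s+(\d+)\s+Allocation time:\s+GPU\s+\(\s*([\d.]+)\s+GB\)"
)


def gpu_totals(log_text):
    """Sum the logged GPU allocations (in GB) per GPU index.

    Assumption: the allocations are additive and all live at once.
    """
    totals = defaultdict(float)
    for gpu_id, gigabytes in ALLOC_RE.findall(log_text):
        totals[int(gpu_id)] += float(gigabytes)
    return dict(totals)


# The four allocation lines from the run that segfaulted.
SAMPLE_LOG = """
GPU 0 Allocation time: GPU ( 0.06 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 0.25 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 0.31 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 0 Allocation time: GPU ( 1.25 GB) 0.01 s, CPU ( 0.00 GB) 0.00 s
"""

for gpu, total in sorted(gpu_totals(SAMPLE_LOG).items()):
    print(f"GPU {gpu}: {total:.2f} GB logged")
```

For the run that crashed this comes to about 1.87 GB on GPU 0, far below the 9987 MB reported as free, so under that assumption running out of VRAM looks unlikely to be the cause.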

leetleyang commented 11 months ago

We're seeing a similar issue with batch processing:

MotionCor3 -InMrc /data/stack_folder/ -OutMrc /data/corrected/ -LogDir /data/log/ -Patch 5 5 10 -FmDose 1.1 -Kv 200 -PixSize 0.96 -SumRange 0 0 -InFmMotion 1 -Gpu 0 1 2 3 -Serial 1 -OutStar 1

It seems to add the input files (~5000 gain-normalized MRC stacks) but then breezes through the GPU allocation messages (e.g. above) without writing any outputs.

It processes a single movie fine, but -OutStar 1 does not lead to a corresponding star file being written.

Something amiss during our compilation, perhaps?

szhengczii commented 11 months ago

@MTclement1: Sorry for the tardy response. I will look into this and let you know. @leetleyang: I will also look into this and let you know.

jonathanrd commented 9 months ago

I also have this issue. Have there been any updates or workarounds?
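
Since single-file runs are reported to work in this thread, one possible stopgap is to skip -Serial 1 and call the program once per movie. The following is a minimal sketch, not a confirmed fix: `build_commands` and `run_all` are hypothetical helper names, the `_cor.mrc` output suffix is arbitrary, the flags are copied from the first command in this issue, and it assumes that in single-file mode -OutMrc takes an output file path.

```python
import subprocess
from pathlib import Path


def build_commands(input_dir, output_dir, binary="MotionCor3", gpu="0"):
    """Return one single-file command (as an argument list) per .mrc movie."""
    commands = []
    for movie in sorted(Path(input_dir).glob("*.mrc")):
        # Assumption: without -Serial, -OutMrc names the output file itself.
        out_file = Path(output_dir) / f"{movie.stem}_cor.mrc"
        commands.append([
            binary,
            "-InMrc", str(movie),
            "-OutMrc", str(out_file),
            "-Gpu", gpu,
            "-Patch", "5", "5",
            "-Iter", "10",
        ])
    return commands


def run_all(input_dir, output_dir, **kwargs):
    """Run the commands one after another, stopping at the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(input_dir, output_dir, **kwargs):
        # check=True raises CalledProcessError on a non-zero exit status,
        # which includes a crash of the child process.
        subprocess.run(cmd, check=True)
```

Calling `run_all("./data", "./MotionCor")` would then process the movies sequentially. This gives up whatever overlap of loading and processing batch mode provides, so it is likely slower, but a crash on one movie is at least easy to attribute.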

Fengyun0101 commented 6 months ago

I also have this issue when I process .mrc files of tomo datasets with -Serial 1. The run ends with the following output:

MRC file size mode: 5760 4092 6 0
Rendered size mode: 5760 4092 6 0
MRC file size mode: 5760 4092 6 0
Rendered size mode: 5760 4092 6 0
GPU 2 Allocation time: GPU ( 0.09 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 3 Allocation time: GPU ( 0.09 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 2 Allocation time: GPU ( 0.35 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 3 Allocation time: GPU ( 0.35 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 2 Allocation time: GPU ( 0.07 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 3 Allocation time: GPU ( 0.07 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 2 Allocation time: GPU ( 0.00 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 3 Allocation time: GPU ( 0.00 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 2 Allocation time: GPU ( 0.26 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
GPU 3 Allocation time: GPU ( 0.26 GB) 0.00 s, CPU ( 0.00 GB) 0.00 s
Create buffers: total memory allocation 0.37 GB
Create buffers: 0.07 seconds

MRC file size mode: 5760 4092 6 0
Rendered size mode: 5760 4092 6 0
Segmentation fault (core dumped)

By the way, I successfully processed the .eer files with MotionCor3 1.1.1. Are there any updates? Many thanks.

szhengczii commented 6 months ago

Hi Fengyun,

I need two movies for debugging. Is it OK to share some with me? Thanks.

Best, Shawn


Fengyun0101 commented 6 months ago

Hi Shawn,

Thank you so much for the reply. I uploaded 5 files as attached. The mrc files were acquired on a Krios with a K3 and are already gain-normalized. The pixel size is 2.62 Å. The total dose is 0.45 e/Å².

Best, Feng

mrc_fractions_Feng_0314.zip https://drive.google.com/file/d/1LUbWyPAdmsa-ceajJ8h_wm-WOrRVIHIz/view?usp=drive_web


Poko18 commented 4 months ago

I have a similar problem. The job gets killed during batch processing. Command:

/usr/local/MotionCor3 -InTiff dataset/ -InSuffix .tif -OutMrc 3/sum/corrected -Patch 5 5 -Gain dataset/GLP-1_gain.mrc -Gpu 0 -Kv 300 -PixSize 0.83 -FmDose 0.8 -Serial 1 -OutStar 1 -LogDir 3/logdir

Output:

Gain reference has been loaded.
DarkReference not found.

TIFF file size mode: 5760  4092  75  0
Rendered size mode: 5760  4092  75  0

GPU 0 Allocation time: GPU (  0.26 GB)   0.00 s, CPU (  0.00 GB)   0.00 s
GPU 0 Allocation time: GPU (  0.35 GB)   0.00 s, CPU (  0.00 GB)   0.00 s
GPU 0 Allocation time: GPU (  1.65 GB)   0.00 s, CPU (  0.00 GB)   0.00 s
GPU 0 Allocation time: GPU (  0.07 GB)   0.00 s, CPU (  0.00 GB)   0.00 s
GPU 0 Allocation time: GPU (  6.59 GB)   0.00 s, CPU (  0.00 GB)   0.00 s
Create buffers: total memory allocation 0.28 GB
Create buffers: 0.02 seconds

Killed

I'm running on an A10 GPU with 23 GB of memory. Does anyone have an idea what is going on?
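
A quick back-of-the-envelope check may help narrow this down. The arithmetic below reproduces two of the logged sizes from the stack dimensions alone; what each buffer actually holds is an assumption here, not something the log states. A bare "Killed" normally means the process received SIGKILL, and on Linux a common source is the kernel's out-of-memory killer acting on host RAM rather than GPU memory, so the kernel log (for example via dmesg) may be worth checking. That is a guess, not a diagnosis.

```python
def stack_gib(width, height, frames, bytes_per_pixel):
    """Size of a movie stack in GiB (1 GiB = 1024**3 bytes)."""
    return width * height * frames * bytes_per_pixel / 1024**3


# Dimensions taken from the log above: 5760 x 4092 pixels, 75 frames.
# Assumption: the 1.65 GB line is the stack at 1 byte per pixel and the
# 6.59 GB line is the same stack as 32-bit floats.
as_bytes = stack_gib(5760, 4092, 75, 1)
as_float32 = stack_gib(5760, 4092, 75, 4)

print(f"1 byte/pixel:  {as_bytes:.2f} GiB")    # 1.65
print(f"4 bytes/pixel: {as_float32:.2f} GiB")  # 6.59
```

Both values match the log when "GB" is read as GiB, and together they are well under the 23 GB on the card. If the full stack is also held in host memory, a machine with limited RAM could plausibly hit the out-of-memory killer at this point.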