3dem / relion

Image-processing software for cryo-electron microscopy
https://relion.readthedocs.io/en/latest/
GNU General Public License v2.0

BUG in RELION 5: Denoise fails with an assert command #1159

Closed Amy1809 closed 1 month ago

Amy1809 commented 1 month ago

I am following the RELION 5 pipeline from the very beginning. When I proceed to the first step (the training step) in denoising, RELION 5 gives me an error saying that it could not find the training model in my denoise job output. I have cryoCARE installed, and I have set 'Generate tomograms for denoising' to Yes in both the motion correction and tomogram reconstruction stages, so I don't know why this happens.

Environment:

Job options:

Error message:


/bin/sh: 1: cryoCARE_train.py: Permission denied
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│ in cryoCARE_train:2                                                          │
│ ╭───────────────────────────────── locals ─────────────────────────────────╮ │
│ │                        gpu = [0]                                         │ │
│ │ number_training_subvolumes = 1200                                        │ │
│ │           output_directory = PosixPath('Denoise/job039')                 │ │
│ │           pipeline_control = PosixPath('Denoise/job039')                 │ │
│ │       subvolume_sidelength = 72                                          │ │
│ │         tomogram_star_file = PosixPath('Tomograms/job038/tomograms.star… │ │
│ │         training_tomograms = 'Position_1:Position_1_2:Position_2_2:Posi… │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
│                                                                              │
│ /home/anaconda3/envs/relion-5.0/lib/python3.10/site-packages/tomogra │
│ phy_python_programs/_utils/relion.py:75 in pipeline_job                      │
│                                                                              │
│   72 │   │   │   job_directory.mkdir(parents=True, exist_ok=True)            │
│   73 │   │   try:                                                            │
│   74 │   │   │   pipeline_directory = kwargs.pop(PIPELINE_CONTROL_KEYWORD_AR │
│ ❱ 75 │   │   │   func(*args, **kwargs)                                       │
│   76 │   │   │   if job_directory is not None and pipeline_directory is not  │
│   77 │   │   │   │   write_job_success_file(job_directory)                   │
│   78 │   │   except BaseException:                                           │
│                                                                              │
│ ╭───────────────────────────────── locals ─────────────────────────────────╮ │
│ │               args = ()                                                  │ │
│ │               func = <function cryoCARE_train at 0x7ea90d26c8b0>         │ │
│ │      job_directory = PosixPath('Denoise/job039')                         │ │
│ │             kwargs = {                                                   │ │
│ │                      │   'tomogram_star_file':                           │ │
│ │                      PosixPath('Tomograms/job038/tomograms.star'),       │ │
│ │                      │   'output_directory':                             │ │
│ │                      PosixPath('Denoise/job039'),                        │ │
│ │                      │   'training_tomograms':                           │ │
│ │                      'Position_1:Position_1_2:Position_2_2:Position_3_3… │ │
│ │                      │   'number_training_subvolumes': 1200,             │ │
│ │                      │   'subvolume_sidelength': 72,                     │ │
│ │                      │   'gpu': [0]                                      │ │
│ │                      }                                                   │ │
│ │ pipeline_directory = PosixPath('Denoise/job039')                         │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
│                                                                              │
│ /home/anaconda3/envs/relion-5.0/lib/python3.10/site-packages/tomogra │
│ phy_python_programs/denoise/cryoCARE/train.py:147 in cryoCARE_train          │
│                                                                              │
│   144 │   │   console.log(f'Denoising model can be found in {output_director │
│   145 │   else:                                                              │
│   146 │   │   e = f'Could not find denoise model ({MODEL_NAME}.tar.gz) in {o │
│ ❱ 147 │   │   raise RuntimeError(e)                                          │
│   148 │                                                                      │
│   149 │   console.save_html(str(output_directory / 'log.html'), clear=False) │
│   150 │   console.save_text(str(output_directory / 'log.txt'), clear=False)  │
│                                                                              │
│ ╭───────────────────────────────── locals ─────────────────────────────────╮ │
│ │                        cmd = 'cryoCARE_train.py --conf                   │ │
│ │                              Denoise/job039/external/training/train_con… │ │
│ │                          e = 'Could not find denoise model               │ │
│ │                              (denoising_model.tar.gz) in Denoise/job039. │ │
│ │                              Trainin'+20                                 │ │
│ │                 even_tomos = [                                           │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_2… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_3… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_6… │ │
│ │                              ]                                           │ │
│ │                global_star = │   rlnTomoName  ...                        │ │
│ │                              rlnTomogramHProjectionalf2                  │ │
│ │                              0    Position_1  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              1  Position_1_2  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              2  Position_2_2  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              3  Position_3_3  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              4    Position_6  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                                                                          │ │
│ │                              [5 rows x 17 columns]                       │ │
│ │                        gpu = [0]                                         │ │
│ │ number_training_subvolumes = 1200                                        │ │
│ │                  odd_tomos = [                                           │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_2… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_3… │ │
│ │                              │                                           │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_6… │ │
│ │                              ]                                           │ │
│ │           output_directory = PosixPath('Denoise/job039')                 │ │
│ │       subvolume_sidelength = 72                                          │ │
│ │            tilt_series_dir = PosixPath('Denoise/job039/tilt_series')     │ │
│ │               tomogram_dir = PosixPath('Denoise/job039/tomograms')       │ │
│ │         tomogram_star_file = PosixPath('Tomograms/job038/tomograms.star… │ │
│ │          train_config_json = {                                           │ │
│ │                              │   'train_data':                           │ │
│ │                              'Denoise/job039/external/training',         │ │
│ │                              │   'epochs': 100,                          │ │
│ │                              │   'steps_per_epoch': 200,                 │ │
│ │                              │   'batch_size': 16,                       │ │
│ │                              │   'unet_kern_size': 3,                    │ │
│ │                              │   'unet_n_depth': 3,                      │ │
│ │                              │   'unet_n_first': 16,                     │ │
│ │                              │   'learning_rate': 0.0004,                │ │
│ │                              │   'model_name': 'denoising_model',        │ │
│ │                              │   'path': 'Denoise/job039',               │ │
│ │                              │   ... +2                                  │ │
│ │                              }                                           │ │
│ │     train_data_config_json = {                                           │ │
│ │                              │   'even': [                               │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_2… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_3… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_6… │ │
│ │                              │   ],                                      │ │
│ │                              │   'odd': [                                │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_1… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_2… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_3… │ │
│ │                              │   │                                       │ │
│ │                              'Tomograms/job038/tomograms/rec_Position_6… │ │
│ │                              │   ],                                      │ │
│ │                              │   'patch_shape': [72, 72, 72],            │ │
│ │                              │   'num_slices': 1200,                     │ │
│ │                              │   'split': 0.9,                           │ │
│ │                              │   'tilt_axis': 'Y',                       │ │
│ │                              │   'n_normalization_samples': 120,         │ │
│ │                              │   'path':                                 │ │
│ │                              'Denoise/job039/external/training',         │ │
│ │                              │   'overwrite': 'True'                     │ │
│ │                              }                                           │ │
│ │               training_dir = PosixPath('Denoise/job039/external/trainin… │ │
│ │         training_tomograms = [                                           │ │
│ │                              │   'Position_1',                           │ │
│ │                              │   'Position_1_2',                         │ │
│ │                              │   'Position_2_2',                         │ │
│ │                              │   'Position_3_3',                         │ │
│ │                              │   'Position_6'                            │ │
│ │                              ]                                           │ │
│ │    training_tomograms_star = │   rlnTomoName  ...                        │ │
│ │                              rlnTomogramHProjectionalf2                  │ │
│ │                              0    Position_1  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              1  Position_1_2  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              2  Position_2_2  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              3  Position_3_3  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                              4    Position_6  ...                        │ │
│ │                              Tomograms/job038/projections/rec_Position_… │ │
│ │                                                                          │ │
│ │                              [5 rows x 17 columns]                       │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Could not find denoise model (denoising_model.tar.gz) in 
Denoise/job039. Training has likely failed.

And here is the .out file

 Beginning to train denoise model.                         train.py:97
/bin/sh: 1: cryoCARE_extract_train_data.py: Permission denied

Any suggestions or advice will be helpful.

EuanPyle commented 1 month ago

Looks like a permissions issue ('/bin/sh: 1: cryoCARE_train.py: Permission denied' being the clue). The 'could not find denoise model' error message usually just means cryoCARE has failed to run. Try running cryoCARE outside of RELION and see what happens:

cryoCARE_train.py --conf Denoise/jobXYZ/external/train_config.json

(or wherever the train_config.json is)

If it still won't work, it's an issue with your cryoCARE installation/permissions (what I suspect)
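To narrow down the 'Permission denied' clue, here is a minimal sketch that lists every file with the script's name on PATH and reports whether the current user may execute it. The helper name `find_on_path` is hypothetical (it is not part of RELION or cryoCARE); the two script names are the ones that appear in the error output above.

```python
import os
from pathlib import Path


def find_on_path(name, path_var=None):
    """Return (candidate, is_executable) for every file called `name` on PATH.

    Hypothetical helper for illustration only. Unlike shutil.which, it also
    reports files that exist but lack execute permission, which is the case
    that makes /bin/sh print 'Permission denied'.
    """
    if path_var is None:
        path_var = os.environ.get("PATH", "")
    results = []
    for directory in path_var.split(os.pathsep):
        if not directory:
            continue
        candidate = Path(directory) / name
        if candidate.is_file():
            results.append((str(candidate), os.access(candidate, os.X_OK)))
    return results


if __name__ == "__main__":
    # Script names taken from the error messages in this issue.
    for script in ("cryoCARE_extract_train_data.py", "cryoCARE_train.py"):
        hits = find_on_path(script)
        if not hits:
            print(f"{script}: not found on PATH")
        for path, executable in hits:
            state = "executable" if executable else "NOT executable"
            print(f"{script}: {path} ({state})")
```

Run it from the same shell and conda environment that launches RELION. A hit reported as NOT executable would be consistent with the permissions problem suspected above.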

Amy1809 commented 1 month ago

Thank you for your reply. I have tried running cryoCARE outside RELION and it worked. I carried on and used the trained model to run cryoCARE predict in RELION, which worked, but I noticed something in the output file written by RELION:

[07:56:44] Generating denoised tomograms                          predict.py:108
2024-07-11 07:56:47.623123: W tensorflow/stream_executor/gpu/asm_compiler.cc:63] Running ptxas --version returned 256
2024-07-11 07:56:47.696499: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: ptxas exited with non-zero error code 256, output: 
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
Looking for GPU with ID: 0
GPU 0 successfully found
Loading network weights from 'weights_best.h5'.
(168, 648, 648, 1)
100%|██████████| 8/8 [00:08<00:00,  1.05s/it]
['' '' '' '' '' '' '' ''
 'cryoCARE                                                11-Jul-24  07:57:12     '
 '']
Loading network weights from 'weights_best.h5'.
(168, 648, 648, 1)
100%|██████████| 8/8 [00:08<00:00,  1.04s/it]
['' '' '' '' '' '' '' ''
 'cryoCARE                                                11-Jul-24  07:57:22     '
 '']
Loading network weights from 'weights_best.h5'.
(168, 648, 648, 1)
100%|██████████| 8/8 [00:08<00:00,  1.03s/it]
['' '' '' '' '' '' '' ''
 'cryoCARE                                                11-Jul-24  07:57:32     '
 '']
Loading network weights from 'weights_best.h5'.
(168, 648, 648, 1)
100%|██████████| 8/8 [00:08<00:00,  1.04s/it]
['' '' '' '' '' '' '' ''
 'cryoCARE                                                11-Jul-24  07:57:43     '
 '']
Loading network weights from 'weights_best.h5'.
(168, 648, 648, 1)
100%|██████████| 8/8 [00:08<00:00,  1.05s/it]
['' '' '' '' '' '' '' ''
 'cryoCARE                                                11-Jul-24  07:57:53     '
 '']
[07:57:54] Denoised tomograms successfully generated, finalising  predict.py:116
           metadata             

There is no error message. However, near the beginning of the output file there is a warning about a non-zero error code 256, and I don't know what it means. I used these denoised tomograms for particle picking; the contrast has improved, but it is still not very good.

EuanPyle commented 1 month ago

Good that it works now. Don't worry about that error message. TensorFlow always tends to throw out loads of messages of that type; as long as it has worked, it should be fine.

As an alternative, maybe try IsoNet; I probably like this denoising software more than cryoCARE.
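On the 'returned 256' warning in the log above, here is a small sketch of how such a number could be read. It assumes that the 256 is a raw POSIX wait status of the kind returned by `system()`, which the log itself does not confirm. Under that assumption the value decodes to exit code 1, meaning `ptxas --version` ran but did not succeed. The helper name `decode_wait_status` is hypothetical.

```python
import os
import shutil


def decode_wait_status(status):
    """Convert a raw wait status into a plain exit code (Python 3.9+).

    Assumption: the number in the log is a raw wait status, where the
    exit code is stored in the high byte.
    """
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print("exit code for status 256:", decode_wait_status(256))
    # Report which ptxas, if any, is visible on PATH in this environment.
    print("ptxas on PATH:", shutil.which("ptxas"))
```

As the log says, TensorFlow falls back to the driver for PTX compilation in this case, which fits the observation that prediction completed normally.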

Amy1809 commented 1 month ago

I will try that, thanks so much for the help! :)