ZauggGroup / DeePiCt

Pipeline for the automatic detection and segmentation of particles and cellular structures in 3D Cryo-ET data, based on deep learning (convolutional neural networks).
Apache License 2.0

Assertion error in DeePiCt_predict3d.ipynb Colab #13

Closed dmichalak closed 1 year ago

dmichalak commented 1 year ago

I've run into an assertion error while running Post-processing.

Processing tomo KAS_tomo30_bin8
Traceback (most recent call last):
  File "/content/DeePiCt/3d_cnn/scripts/clustering_and_cleaning.py", line 63, in <module>
    assert os.path.isfile(output_path)
AssertionError

It looks like the output_path that clustering_and_cleaning.py expects does not exist, so the path returned on line 55 points to a file that is not there:

tomo_output_dir, output_path = get_probability_map_path(config.output_dir, model_name, tomo_name, config.pred_class)
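The bare assert does not say which file is missing or which step should have written it. A minimal sketch of a more informative check, assuming a hypothetical helper `require_file` (it is not part of DeePiCt) that could be called on `output_path` before clustering starts:

```python
import os


def require_file(path, produced_by):
    """Return path if it is an existing file, otherwise raise a descriptive error.

    Hypothetical helper for illustration only; not part of DeePiCt.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Expected file not found: {path!r}. "
            f"It should have been written by {produced_by}; "
            "check that this step finished without being interrupted."
        )
    return path
```

With a check like this, the failure message would point back to the step that produced no output instead of stopping at an anonymous AssertionError.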

One odd thing I did notice: the output of Step 3.3 ("Assembling the patches") ends with ^C, apparently after the patches were assembled. I'm not sure whether this means something went wrong.
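A trailing ^C may indicate that the process was interrupted rather than finishing normally. One way to tell the two apart is to look at the exit status of the step. The sketch below assumes the step can be launched as a subprocess; `run_step` is a hypothetical helper, and the thread does not confirm how the notebook itself launches its scripts. On POSIX systems, `subprocess` reports a negative return code when the child was terminated by a signal.

```python
import signal
import subprocess
import sys


def run_step(cmd):
    """Run a command and report whether it succeeded, failed, or was killed by a signal."""
    code = subprocess.run(cmd).returncode
    if code == 0:
        return "ok"
    if code < 0:
        # On POSIX, a return code of -N means the process was terminated by signal N.
        return f"killed by {signal.Signals(-code).name}"
    return f"failed with exit code {code}"


# Stand-in for a notebook step: a child process that simply exits cleanly.
status = run_step([sys.executable, "-c", "print('assembling patches')"])
print(status)
```

A result such as "killed by SIGKILL" would be consistent with the system stopping the process, for example when memory runs out, although the signal alone does not prove the cause.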

Here is the output from each cell:

Step 1.1

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).

Step 1.2

fatal: destination path 'DeePiCt' already exists and is not an empty directory.
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: mrcfile in /usr/local/lib/python3.8/dist-packages (1.4.3)
Requirement already satisfied: numpy>=1.16.0 in /usr/local/lib/python3.8/dist-packages (from mrcfile) (1.22.4)
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: tensorboardX in /usr/local/lib/python3.8/dist-packages (2.6)
Requirement already satisfied: numpy in /usr/local/lib/python3.8/dist-packages (from tensorboardX) (1.22.4)
Requirement already satisfied: packaging in /usr/local/lib/python3.8/dist-packages (from tensorboardX) (23.0)
Requirement already satisfied: protobuf<4,>=3.8.0 in /usr/local/lib/python3.8/dist-packages (from tensorboardX) (3.19.6)

Step 2.1

predict_type: membrane `--2023-03-01 20:41:22-- https://www.dropbox.com/sh/oavbtcvusi07xbh/AADm29QsXAHenTSTkASMcCk0a/3d_cnn/full_vpp_memb_model_IF4_D2_BN.pth?dl=0 Resolving www.dropbox.com (www.dropbox.com)... 162.125.81.18, 2620:100:6031:18::a27d:5112 Connecting to www.dropbox.com (www.dropbox.com)|162.125.81.18|:443... connected. HTTP request sent, awaiting response... 302 Found Location: /sh/raw/oavbtcvusi07xbh/AADm29QsXAHenTSTkASMcCk0a/3d_cnn/full_vpp_memb_model_IF4_D2_BN.pth [following] --2023-03-01 20:41:22-- https://www.dropbox.com/sh/raw/oavbtcvusi07xbh/AADm29QsXAHenTSTkASMcCk0a/3d_cnn/full_vpp_memb_model_IF4_D2_BN.pth Reusing existing connection to www.dropbox.com:443. HTTP request sent, awaiting response... 302 Found Location: https://uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com/cd/0/inline/B3f_ojD0FuC6GCOOHDltgFM5_WRrISg3B5AmPT4kXkwqU-y3b34CxjjWO5Ud3nm8AVtkYWgrepvreZHf0t3eklArAzNJlyZTRLgw5O0kR6rm5X4Gnml7uwf_YtS1iHC0AxtuuQXY1HijutUuwvSgr6slleHwTCPRpubaVm4Q9kH4Cg/file# [following] --2023-03-01 20:41:23-- https://uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com/cd/0/inline/B3f_ojD0FuC6GCOOHDltgFM5_WRrISg3B5AmPT4kXkwqU-y3b34CxjjWO5Ud3nm8AVtkYWgrepvreZHf0t3eklArAzNJlyZTRLgw5O0kR6rm5X4Gnml7uwf_YtS1iHC0AxtuuQXY1HijutUuwvSgr6slleHwTCPRpubaVm4Q9kH4Cg/file Resolving uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com (uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com)... 162.125.67.15, 2620:100:6016:15::a27d:10f Connecting to uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com (uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com)|162.125.67.15|:443... connected. HTTP request sent, awaiting response... 
302 Found Location: /cd/0/inline2/B3fYBlYk24LRcxZTgXLBlVOcZvFruNoVJXMoT7q0ZRbst5re0SZT9MkjuLXwmeI7UB7o87SY1zPu1R7SmQOm0EyaZ_9jg80ul4dobL2jQB9-kJK4TU-P7oHUPOJhyWMSkkbL4MNjZVr7izXRP6CKVnkHFDGkxd56kD47HtZkL9TjFPwb_RMASyZ6TMX0B3S7zEfZlWS0PlV_AEJ9fWotBekJ9zQ26aLUThH7Iz2WfB1rGi564gj3d6nIKStstl9HSeP1FsSTHjiJpILKezeF8lTFY5hVFnTBTxL5iqo3Yvt24yH_e8ue6SyFuQ_bztqhnl3nfofYCpk3KznArW9LVxzEOmULW01uVHsCEzF5fz3RRloElifBmzvW8I-9uGLzyQ2r6JasKzKae7VMuK2d8vi2H5nWs3xqL1MqKrwnjoLX8g/file [following] --2023-03-01 20:41:24-- https://uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com/cd/0/inline2/B3fYBlYk24LRcxZTgXLBlVOcZvFruNoVJXMoT7q0ZRbst5re0SZT9MkjuLXwmeI7UB7o87SY1zPu1R7SmQOm0EyaZ_9jg80ul4dobL2jQB9-kJK4TU-P7oHUPOJhyWMSkkbL4MNjZVr7izXRP6CKVnkHFDGkxd56kD47HtZkL9TjFPwb_RMASyZ6TMX0B3S7zEfZlWS0PlV_AEJ9fWotBekJ9zQ26aLUThH7Iz2WfB1rGi564gj3d6nIKStstl9HSeP1FsSTHjiJpILKezeF8lTFY5hVFnTBTxL5iqo3Yvt24yH_e8ue6SyFuQ_bztqhnl3nfofYCpk3KznArW9LVxzEOmULW01uVHsCEzF5fz3RRloElifBmzvW8I-9uGLzyQ2r6JasKzKae7VMuK2d8vi2H5nWs3xqL1MqKrwnjoLX8g/file Reusing existing connection to uc5102227bad90654a9163bd4a0e.dl.dropboxusercontent.com:443. HTTP request sent, awaiting response... 200 OK Length: 305637 (298K) [application/octet-stream] Saving to: 'model_weights.pth'

model_weights.pth 100 %[===================>] 298.47K 319KB/s in 0.9s

2023-03-01 20:41:25 (319 KB/s) - ‘model_weights.pth’ saved [305637/305637]`

Step 2.2

Define the following information in the given variables:
ID/name for the tomogram: tomo_name: KAS_tomo30_bin8
Path to the tomogram .mrc file: tomogram_path: /content/gdrive/MyDrive/AAA1_tilt030_alignbin8_full_rx.mrc
Path to the mask .mrc file used for processing (if there is no mask leave it empty): mask_path: Insert text here

You don't need to change the following variables:
Path where the config .yaml file will be saved (you can leave the default option): user_config_file: /content/gdrive/MyDrive/DeePiCt_3d/config.yaml
Path where the data .csv file will be saved (you can leave the default option): user_data_file: /content/gdrive/MyDrive/DeePiCt_3d/data.csv
Path to folder where the prediction files will be saved (you can leave the default option): user_prediction_folder: /content/gdrive/MyDrive/DeePiCt_3d/
Path to folder where the intermediate files will be saved (you can leave the default option): user_work_folder: /content/work/

Step 2.3

No output

Step 3.1

tomo_name KAS_tomo30_bin8
partition_path = /content/work/testing_data/KAS_tomo30_bin8/partition.h5
Exiting, path exists.
Creating snakemake pattern

Step 3.2

GPU is available
Model trained under the following original settings: ModelDescriptor(batch_norm=True, box_size=64, training_date=None, decoder_dropout=0, encoder_dropout=0, depth=2, initial_features=4, log_path=None, model_name='model_weights', model_path='/content/model_weights.pth', epochs=150, old_model=None, output_classes=1, overlap=12, partition_name='train_partition', processing_tomo='filtered_tomo', retrain=False, semantic_classes=['memb'], train_split=0.8, training_set=None, testing_set=None, total_folds=None, fold=None, da_rounds=0, da_rot_angle=0, da_elastic_alpha=0, da_sigma_gauss=0, da_salt_pepper_p=0, da_salt_pepper_ampl=0, loss='Dice')
file in .mrc format
Segmenting tomo KAS_tomo30_bin8
GPU is available
data_mean = 37.1691280434271, data_std = 428.4135638936406
The segmentation model_weights exists already.
100% 14040/14040 [00:00<00:00, 14487.11it/s]
The segmentation has finished!
Creating snakemake pattern .done_patterns/model_weights.KAS_tomo30_bin8.None.segmentation.done

Step 3.3

config.processing_tomo filtered_tomo
file in .mrc format
Assembling data from /content/work/testing_data/KAS_tomo30_bin8/partition.h5 : 100% 14040/14040 [01:17<00:00, 182.29it/s]
^C

Step 4.1

If you don't want to use the default parameters, unclick the button for default_options and define the parameters. Otherwise, the default options will be used.

default_options:
threshold: 0.5
min_cluster_size: 500
max_cluster_size: 0
clustering_connectivity: 1
calculate_motl:
contact_mode: intersection
contact_distance: 0

Processing tomo KAS_tomo30_bin8
Traceback (most recent call last):
  File "/content/DeePiCt/3d_cnn/scripts/clustering_and_cleaning.py", line 63, in <module>
    assert os.path.isfile(output_path)
AssertionError

JunMa11 commented 1 year ago

Hi, @dmichalak

I'm sorry to bother you. Do you know how to download the 3D sample data?

https://github.com/ZauggGroup/DeePiCt/issues/14

frosinastojanovska commented 1 year ago

Hi @dmichalak, it seems that step 3.3 failed because of a memory issue. You should run the alternative version of step 3.3 (the cell below), which is slower but requires less memory. The file will then be assembled in step 3.3, and the error in Step 4.1 should disappear because the required file will no longer be missing :)

dmichalak commented 1 year ago

Just to clarify, Step 3.3 is for assembling the patches and, for me, the output of this step ends with '^C'. I'm not sure if this is intended. I don't see an alternative cell for this step.

The Assertion Error is occurring for both cells of Step 4.1.

I just tried the test data and Step 3.3 completed without '^C'. It looks like my tomogram file was too big and it is indeed a memory issue that occurred in Step 3.3.
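A rough estimate shows why tomogram size matters here. The sketch below computes the in-memory size of a dense volume and how it changes with binning. The dimensions are hypothetical (the thread does not state the real ones), and the figure of 4 bytes per voxel assumes float32 data. It ignores any extra copies the pipeline may hold, so it is a lower bound at best.

```python
from math import prod


def volume_bytes(shape, bytes_per_voxel=4):
    """Approximate in-memory size of a dense 3D volume (4 bytes per voxel = float32)."""
    return prod(shape) * bytes_per_voxel


def binned_shape(shape, factor):
    """Shape after binning each axis by an integer factor (leftover edge voxels dropped)."""
    return tuple(n // factor for n in shape)


# Hypothetical tomogram dimensions (z, y, x); not taken from the thread.
shape = (500, 1440, 1024)
full = volume_bytes(shape)
binned = volume_bytes(binned_shape(shape, 2))
print(f"full: {full / 1e9:.2f} GB, binned by 2: {binned / 1e9:.2f} GB")
```

Binning by 2 along each axis cuts the voxel count, and therefore the memory, by a factor of 8 when all dimensions are even.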

Thanks!

frosinastojanovska commented 1 year ago

Hi @dmichalak, what is the voxel size of your tomogram? Apologies, I confused the steps: there is indeed no alternative cell for step 3.3. As far as I remember, this wasn't a problem with the tomograms used for testing, so I just want to confirm that your data has approximately the same pixel size as the data the models were trained on.

dmichalak commented 1 year ago

8.66 Å/px. I binned it to 17.33 Å/px, and the notebook was able to process everything. I actually completely missed the requirement that the tomograms should be at the same pixel size as the trained models!
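The arithmetic behind that fix can be sketched as follows. Both helpers are hypothetical, and treating 17.33 as the target is an assumption: it is the value that worked in this thread, not a documented pixel size of the trained models. The 10% tolerance is likewise an arbitrary illustrative choice.

```python
def binning_factor(pixel_size, target_pixel_size):
    """Nearest integer binning factor that brings pixel_size close to target_pixel_size."""
    if pixel_size <= 0 or target_pixel_size <= 0:
        raise ValueError("pixel sizes must be positive")
    return max(1, round(target_pixel_size / pixel_size))


def matches(pixel_size, target_pixel_size, rel_tol=0.1):
    """True if pixel_size is within rel_tol (as a fraction) of target_pixel_size."""
    return abs(pixel_size - target_pixel_size) <= rel_tol * target_pixel_size


# Values from the thread, in angstrom per pixel: 8.66 before binning, 17.33 after.
# Using 17.33 as the target is an assumption (see the note above).
factor = binning_factor(8.66, 17.33)
print(factor, matches(8.66 * factor, 17.33))
```

Checking the pixel size up front like this would flag a mismatch before any of the slower notebook steps run.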