Closed KaterinaKrejci231054 closed 1 week ago
Okay. Based on our in-person discussion today, I thought you were modifying the patch size only along the S-I axis (to ensure that the model always has context about all the rootlet levels). But based on the screenshot, you're actually changing all the axes. So the problem might indeed be related to a memory issue.
Agree with Jan's comment about the memory issue. The patch size might be too big! AND, more importantly, the patch size you chose is not divisible by `2**x`, where x = 3, 4, or 5. During training, nnUNet usually halves the patch size several times, depending on the number of layers (maybe 4 or 5), so it's usually good to ensure that the patch size you choose is divisible by `2**4` (= 16) or `2**5` (= 32).
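To make the divisibility rule concrete, here is a minimal sketch in plain Python. The helper names `is_divisible` and `round_down` are made up for illustration and are not part of nnUNet, and the sketch assumes the same number of halvings on every axis, which may not match what a given plan actually does.

```python
def is_divisible(patch_size, num_halvings):
    """Return True if every axis of patch_size is a multiple of 2**num_halvings."""
    factor = 2 ** num_halvings
    return all(dim % factor == 0 for dim in patch_size)


def round_down(patch_size, num_halvings):
    """Round each axis down to the nearest multiple of 2**num_halvings.

    Each axis is kept at least one factor wide so it never drops to zero.
    """
    factor = 2 ** num_halvings
    return [max(factor, (dim // factor) * factor) for dim in patch_size]


# Original patch size from this issue: every axis is a multiple of 32.
original = [128, 192, 96]
print(is_divisible(original, 5))

# Hypothetical patch size that is not a multiple of 16 on every axis.
candidate = [130, 200, 100]
print(is_divisible(candidate, 4))
print(round_down(candidate, 4))
```

With this check, a candidate patch size can be validated against both 16 and 32 before `nnUNetPlans.json` is touched.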
FYI, I manually modified the `patch_size` for the lumbar model training, and training has started; details: https://github.com/ivadomed/model-spinal-rootlets/issues/67#issuecomment-2252641123
Thanks for the suggestions and for the help, @valosekj and @naga-karthik. I tried modifying only the SI patch size. With a value of 368 (23 × 16) in SI, it crashed again because of memory, so I tried a smaller multiple, 352 (22 × 16), and with that the training started correctly.
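For anyone who wants to script this change instead of editing the file by hand, here is a rough sketch of replacing a single axis of `patch_size`. It rests on several assumptions: that the plans file stores `patch_size` under `configurations` and then the configuration name, that the configuration is `3d_fullres`, and that SI is axis index 0. None of these are confirmed in this thread, so check your own `nnUNetPlans.json` first. The helper `set_patch_size_axis` is made up for illustration and is not an nnUNet function.

```python
import copy
import json


def set_patch_size_axis(plans, configuration, axis, new_size):
    """Return a copy of the plans dict with one axis of patch_size replaced."""
    updated = copy.deepcopy(plans)  # leave the caller's dict untouched
    patch_size = updated["configurations"][configuration]["patch_size"]
    patch_size[axis] = new_size
    return updated


# Minimal stand-in for the relevant part of nnUNetPlans.json (assumed layout).
plans = {"configurations": {"3d_fullres": {"patch_size": [128, 192, 96]}}}

# Assumption: SI is axis 0 and the configuration is called "3d_fullres".
updated = set_patch_size_axis(plans, "3d_fullres", axis=0, new_size=352)
print(json.dumps(updated["configurations"]["3d_fullres"]["patch_size"]))
```

To apply this to a real plans file, load it with `json.load`, pass the resulting dict through the helper, and write it back with `json.dump`, keeping a backup of the original file.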
nnUNetv2 problem with changing patch_size
Based on the information from the Ivadomed meeting, I took the following steps with the hc-leipzig-7t-mp2rage dataset:
I ran `nnUNetv2_plan_and_preprocess` and then tried to set the `patch_size` parameter (original: [128, 192, 96]) in the `nnUNetPlans.json` file to the `median_image_size_in_voxels` value.
Then I tried to run `nnUNetv2_train` with the modified `nnUNetPlans.json` file. This caused the error (see below). Then I changed it back to the original `patch_size` and the training started correctly, so the problem is probably the changed `patch_size`.
@naga-karthik and @valosekj, have you had a similar experience with nnUNet training? Do you have any suggestions for how to handle this error?
error
```python `Traceback (most recent call last): File "/home/ge.polymtl.ca/p120942/.conda/envs/nnunet/bin/nnUNetv2_train", line 8, in