OriKovacsiKatz opened this issue 1 month ago
I see. Thanks for reporting this @OriKovacsiKatz
It failed because PyTorch didn't raise an OOM error but instead raised `RuntimeError: Expected output.numel() <= std::numeric_limits<int32_t>::max() to be true, but got false.` (Could this error message be improved? If so, please file an enhancement request with PyTorch.)
while I decided to only allow OOM errors.

I do not understand why it raised something else. For now, yes, changing the list to a maximum of 8 would work. But if no OOM actually happened and the error message is correct, the real solution would be to remove lines 53, 54, 56, and 57. If what really happened was an OOM that PyTorch reports with this message, then we should file an enhancement request with PyTorch, as suggested.
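The "only allow OOM errors" policy could be sketched like this (a hypothetical helper, not PlantSeg's actual code; PyTorch >= 1.13 also exposes `torch.cuda.OutOfMemoryError`, a `RuntimeError` subclass, but a message check covers older versions too):

```python
def is_oom_error(err: Exception) -> bool:
    """Heuristic: treat an exception as CUDA OOM only if its message says so.

    PyTorch's OOM messages contain "out of memory"; the output.numel()
    int32 overflow error from this issue does NOT match, so a filter
    like this would re-raise it instead of shrinking the batch size.
    """
    return "out of memory" in str(err).lower()


# The numel() overflow seen in this issue is not classified as OOM:
numel_err = RuntimeError(
    "Expected output.numel() <= std::numeric_limits<int32_t>::max() "
    "to be true, but got false."
)
print(is_oom_error(numel_err))                           # False
print(is_oom_error(RuntimeError("CUDA out of memory")))  # True
```

This is why the reported error slips past an OOM-only retry loop: its message simply doesn't look like an OOM.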
I'll keep this issue open until we figure this out.
"Legacy" tag because Napari GUI has workaround (single patch mode).
Just formatted the issue for readability.
Just in case I didn't sound encouraging: @OriKovacsiKatz, you are very welcome to check whether an OOM really happens on your device and then open a PR for PlantSeg and/or an issue for PyTorch. The easy way is to watch the PlantSeg terminal and `nvidia-smi` side by side.
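For that check, a minimal monitoring recipe (assuming NVIDIA drivers and the `nvidia-smi` CLI are installed; these are standard `nvidia-smi` flags):

```shell
# Refresh nvidia-smi output every second in a second terminal
# while PlantSeg runs in the first one:
watch -n 1 nvidia-smi

# Or log GPU memory over time to a CSV for later inspection:
nvidia-smi --query-gpu=timestamp,memory.used,memory.total --format=csv -l 1 > gpu_mem.csv
```

If memory.used climbs to the card's limit right before the crash, it really was an OOM regardless of what the error message says.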
Running the PlantSeg example col-0_20161116, I get a crash:

I modified the code to print details:

It crashed at batch_size=16. After changing the sizes to a maximum of 8, it didn't crash.

How can I fix plant-seg/plantseg/predictions/functional/array_predictor.py line 51 so that PlantSeg execution does not crash for any batch_size?

Thanks, Ori
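One possible shape for such a fix (a sketch only, under the assumption that the failure really is an OOM): retry with a halved batch size whenever an OOM-like `RuntimeError` occurs, and re-raise anything else. `run_batch` below is a hypothetical stand-in for the model forward pass around line 51 of `array_predictor.py`:

```python
def run_with_backoff(run_batch, items, start_batch_size=16, min_batch_size=1):
    """Run `run_batch` over `items`, halving the batch size on OOM and retrying.

    Non-OOM RuntimeErrors are re-raised unchanged, matching the policy of
    only tolerating OOM errors. In real code you would also call
    torch.cuda.empty_cache() before retrying.
    """
    batch_size = start_batch_size
    while True:
        try:
            results = []
            for i in range(0, len(items), batch_size):
                results.extend(run_batch(items[i:i + batch_size]))
            return results, batch_size
        except RuntimeError as err:
            if "out of memory" not in str(err).lower() or batch_size <= min_batch_size:
                raise  # not an OOM, or nothing left to shrink: surface the error
            batch_size //= 2


# Demo with a fake forward pass that "OOMs" above batch size 8:
def fake_forward(batch):
    if len(batch) > 8:
        raise RuntimeError("CUDA out of memory")
    return [x * 2 for x in batch]

results, final_bs = run_with_backoff(fake_forward, list(range(20)))
print(final_bs)  # 8
```

This reproduces the observed behavior (16 crashes, 8 works) automatically, but note the maintainer's caveat above: if the real error is the `numel()` int32 overflow rather than an OOM, this loop would correctly re-raise it, and the fix belongs elsewhere.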