aigzhusmart / Slim-UNETR

An Efficient, High-Quality 3D Segmentation for Medical Image Analysis with Constrained Computational Resources
MIT License

training failure with multiple GPUs #3

Open carsonpurnell opened 5 months ago

carsonpurnell commented 5 months ago

Training on a pair of GPUs in a single machine consistently throws an error. I am using the hepatic vessel dataset linked in the readme, and I altered train.py to make both devices visible and run both processes. The error log is attached. Running on one GPU completes its epochs before a different critical error is thrown.
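For context, here is a minimal sketch of one common way to make two GPUs visible to a Python training script. It is an assumption about the kind of change described above, not the actual edit made to train.py, and the helper name `make_gpus_visible` is hypothetical. It only sets the `CUDA_VISIBLE_DEVICES` environment variable, which CUDA-based libraries read when they first initialize.

```python
import os


def make_gpus_visible(gpu_ids):
    """Expose only the listed GPU indices to CUDA-based libraries.

    Hypothetical helper for illustration. It has to run before CUDA is
    first initialized, which in practice means before importing torch.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


if __name__ == "__main__":
    # Assumed device indices 0 and 1 for a two-GPU machine.
    print(make_gpus_visible([0, 1]))
```

Setting the variable only controls which devices are visible. Starting one process per GPU is a separate step that depends on the launcher the script uses.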

I don't see a reference to a file from Slim-UNETR itself in the log, so I can't tell where the error is coming from. Full log attached: multiGPU error.txt