Open QianMuXiao opened 8 months ago
Are you using TotalSegmentator for 3D segmentation? PyTorch's MPS backend doesn't support 3D ops yet, so even if TotalSegmentator is doing everything right, your computation would still be slow on a Mac.
@francescopisu Yes, I used CT image data with dimensions of about (512, 512, 54). I hadn't realized that PyTorch's MPS device doesn't support 3D operations; I'll re-check the docs on that. Thanks a lot!
I checked the PyTorch documentation in detail and, after some testing, found that the current nightly version of PyTorch supports MPS-accelerated Conv3d but not ConvTranspose3d.
It seems that PyTorch quite recently added Conv3d support to the nightly version. I pushed a commit to master to allow "mps" as the device argument. I have not had a chance to test it, since I am not running the newest macOS, which is required for this to work.
It seems that nnU-Net also uses ConvTranspose3d, which is not yet supported. So mps is not working for now.
Yes, I enabled the mps device by modifying the source code in the TotalSegmentator package, and with the latest PyTorch nightly it reports that ConvTranspose3d is not supported on MPS.
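For anyone who wants to check which of these 3D ops their own PyTorch build can run on MPS, here is a minimal sketch. The function name probe_mps_3d_ops is made up for illustration and the 8x8x8 input is arbitrary. It only attempts a single forward pass, so it says nothing about speed or about any other op nnU-Net may need.

```python
def probe_mps_3d_ops():
    """Try Conv3d and ConvTranspose3d on the MPS device.

    Returns a dict mapping op name to True (forward pass worked),
    False (PyTorch raised an error), or None (could not be tried,
    e.g. PyTorch is missing or MPS is unavailable).
    """
    results = {"Conv3d": None, "ConvTranspose3d": None}
    try:
        import torch
    except ImportError:
        return results  # PyTorch is not installed

    # Older PyTorch builds have no torch.backends.mps at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is None or not mps.is_available():
        return results

    device = torch.device("mps")
    for name in results:
        try:
            # Shape is (batch, channels, depth, height, width).
            x = torch.randn(1, 1, 8, 8, 8, device=device)
            layer = getattr(torch.nn, name)(1, 1, kernel_size=3).to(device)
            layer(x)
            results[name] = True
        except RuntimeError:
            # NotImplementedError (unsupported op) is a RuntimeError subclass.
            results[name] = False
    return results


print(probe_mps_3d_ops())
```

On a machine without PyTorch or without an MPS device, both entries stay None rather than False, so "not tested" is not confused with "not supported".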
I ran TotalSegmentator on an M3 Max MBP to segment a single CT image. Regardless of whether I use --fast or --rb, I get the following error: Background workers died. Look for the error message further up! If there is none then your RAM was full and the worker was killed by the OS. Use fewer workers or get more RAM in that case!
@wxc-2020 How many workers are you using for preprocessing (nr_thr_resamp and nr_thr_saving)? My run crashed very recently because I spawned too many workers (31, to be precise) and used up all 64 GB of RAM.
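As a rough illustration of keeping the worker count low, the sketch below builds a TotalSegmentator command line with both thread settings pinned to 1. It assumes the CLI flags are spelled like the parameter names above (--nr_thr_resamp, --nr_thr_saving); the helper name build_totalseg_cmd and the file paths are placeholders, not part of TotalSegmentator.

```python
import shlex


def build_totalseg_cmd(input_path, output_path,
                       nr_thr_resamp=1, nr_thr_saving=1, fast=False):
    """Build (but do not run) a TotalSegmentator command as an argv list."""
    cmd = [
        "TotalSegmentator",
        "-i", str(input_path),
        "-o", str(output_path),
        # Fewer workers means less RAM in use at the same time.
        "--nr_thr_resamp", str(nr_thr_resamp),
        "--nr_thr_saving", str(nr_thr_saving),
    ]
    if fast:
        cmd.append("--fast")
    return cmd


# Placeholder paths; replace with your own CT file and output folder.
cmd = build_totalseg_cmd("ct.nii.gz", "segmentations", fast=True)
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Starting from one worker each and raising the numbers only while memory stays comfortable is a simple way to find out whether the "Background workers died" message is really a RAM problem.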
Mine still runs fine. My Mac has only 48 GB of RAM, and when I run the full-res 3D segmentation on the CPU it's slow, but it doesn't crash and doesn't need to use swap.
@QianMuXiao It also depends on the specifics of your tomographic data
@francescopisu The MSD datasets I've been using lately seem to run segmentation fine
@francescopisu I didn't specifically set these two parameters, so they should be at their defaults. In addition, my CT images are 256x256x300. I still can't find the specific reason for the error.
@wxc-2020 Can you show the entire stack trace you get with the error?
@francescopisu Thanks for your help. I have now uninstalled and reinstalled conda and re-run TotalSegmentator; it no longer reports the error and runs smoothly on the CPU (it takes 3 to 4 minutes for an MBP with 128 GB of memory to segment a CT image in --fast mode). I think the error may have been caused by conflicts in my old conda env.
I made a quick guide for building PyTorch from source with the repository's state of the yet to be merged PR implementing ConvTranspose3d in MPS from mattiaspaul. Got 3d highres cardiac chambers segmentations for a coronary 512x512x224 CTA scan in under a minute.
@francescopisu Your guide works perfectly on my MBP, thanks a lot. It takes about 86 s for task total (512x512x54 CT scan from the MSD spleen dataset) on my M3 Max with 48 GB of RAM.
@francescopisu I tried using your guide on my M3 MBP and when trying "install -r requirements.txt --no-cache-dir" I get an error message saying that there is no "-r" option for install (which install = /usr/bin/install). Is there another "install" program that should be called? Thanks! Gene
@w1ebr add pip before install
@w1ebr My bad, I forgot the "pip" for "pip install". I updated the blog post as well. Thanks
@francescopisu Is it possible to make my 3D Slicer use this version of PyTorch?
@QianMuXiao I'm afraid that's not a trivial thing to do. I may need some support @lassoan.
Thank you!
Is it possible to make my 3D Slicer use this version of PyTorch?
It may be just a matter of weeks until a new official PyTorch version comes out that works out of the box, so it is probably not worth spending a whole lot of time on this. But if that seems like a long time, you can run pip_install('https://example.com/path/to/custompytorch.tar.gz') in Slicer's Python console to install a custom PyTorch build from a URL. If the TotalSegmentator extension finds that PyTorch is installed, it will use that.
Should OpenMP also be installed? My build log says it wasn't found
@francescopisu Maybe the command in your quick guide, ‘conda create --prefix==./venv python=3.10’, should be changed to ‘conda create --prefix = ./venv python=3.10’?
@francescopisu Perfect! Thanks again! It gives a real speed increase!
@francescopisu Maybe the command in your quick guide, ‘conda create --prefix==./venv python=3.10’, should be changed to ‘conda create --prefix = ./venv python=3.10’?
Yes, updated.
Just noticed there was a discussion about this here, I also got Mac GPU working a few weeks ago: https://github.com/wasserth/TotalSegmentator/issues/39#issuecomment-2007890904
Working great for me! Thanks @wasserth et al
FWIW, I installed the commit from the PR mentioned in the prior comment with: pip install git+https://github.com/pytorch/pytorch.git@3c61c525694eca0f895bb01fc67c16793226051a
Then set the device to 'mps' in my totalsegmentator call and it seems to work.
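A minimal sketch of what such a call might look like from Python, for anyone trying the same thing. It assumes the installed TotalSegmentator exposes totalsegmentator in totalsegmentator.python_api and accepts a device keyword; the helper names pick_device and run_segmentation and the paths are placeholders of mine, and this is not necessarily how the call above was made.

```python
def pick_device():
    """Return "mps" when PyTorch reports a usable MPS backend, else "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps attribute.
    mps = getattr(torch.backends, "mps", None)
    return "mps" if mps is not None and mps.is_available() else "cpu"


def run_segmentation(input_path, output_path):
    """Run TotalSegmentator on the device chosen by pick_device()."""
    # Imported lazily so this file loads even without TotalSegmentator installed.
    from totalsegmentator.python_api import totalsegmentator

    # Assumption: this version of TotalSegmentator accepts device="mps".
    return totalsegmentator(input_path, output_path, device=pick_device())


print("Selected device:", pick_device())
# Placeholder paths; uncomment to run on real data:
# run_segmentation("ct.nii.gz", "segmentations")
```

Falling back to "cpu" keeps the same script usable on machines without an MPS-capable PyTorch build, at the cost of the slower CPU runtimes reported earlier in the thread.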
It took about 8min to get 69 ROIs on a CT sim for a prostate case, using an Apple M2 Pro on Sonoma 14.4 and python 3.10.9. This included DICOM output. I was able to import the DICOM output into my viewer and it looks pretty good to me.
I'm running TotalSegmentator on my M3 Max (16-core CPU, 40-core GPU) MacBook Pro and it's taking over 600 seconds to fully segment a 3D CT image of size 512x512x54, whereas it tends to take about 60 seconds on my desktop with a 3070 GPU. Is it possible that the latest version of TotalSegmentator just uses the MacBook's CPU instead of the "MPS" device when running PyTorch?