LalithShiyam closed this issue 1 year ago
Update: I correctly executed the nnUNet example of Intel's Neural Compressor on the BraTS dataset (brain tumor segmentation challenge). Attached are the results on my machine (50% faster inference). I will try it on MOOSE next.
Great @davidiommi, hope you have the moose-models! Also, are there any improvements in memory usage? Meaning, is the memory usage of the pruned models lower than that of the original ones? Just curious.
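As a back-of-envelope sketch of the memory question (the parameter count below is made up, not the real size of any MOOSE model): int8 quantization stores each weight in 1 byte instead of float32's 4 bytes, so weight memory drops roughly 4x, plus a negligible per-tensor scale/zero-point. Pruning, by contrast, only saves memory if the zeroed weights are actually stored in a sparse format.

```python
# Back-of-envelope estimate, not a measurement.
from array import array

n_params = 30_000_000  # hypothetical parameter count for illustration only

fp32_bytes = n_params * array('f', [0.0]).itemsize  # 4 bytes per weight
int8_bytes = n_params * array('b', [0]).itemsize    # 1 byte per weight

print(f"fp32: {fp32_bytes / 2**20:.1f} MiB, int8: {int8_bytes / 2**20:.1f} MiB")
print(f"reduction: {fp32_bytes / int8_bytes:.0f}x")
```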
Is there any update on this? Would be quite useful.
I had a look at the example link (https://github.com/intel/neural-compressor/tree/master/examples/pytorch/image_recognition/3d-unet/quantization/ptq/eager), but I don't fully understand the steps required for tuning. Does the tuning process of the models depend on the data? If not, why does the following command need preprocessed_data_dir as an argument?
python run.py --model_dir=build/result/nnUNet/3d_fullres/Task043_BraTS2019/nnUNetTrainerV2__nnUNetPlansv2.mlperf.1 --backend=pytorch --accuracy --preprocessed_data_dir=build/preprocessed_data/ --mlperf_conf=./mlperf.conf --tune
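To the question of why calibration data is needed: static post-training quantization is data-dependent, because the per-tensor scale and zero-point are derived from activation ranges observed while running representative inputs through the model. That is presumably why run.py wants --preprocessed_data_dir. A minimal, dependency-free sketch of the idea (illustrative only, not Intel Neural Compressor's actual code; all function names are hypothetical):

```python
# Sketch of min-max calibration for affine int8 quantization. Different
# calibration data produces different observed ranges, hence different
# scale/zero-point, hence a (slightly) different quantized model.

def minmax_observer(calib_batches):
    """Collect the global min/max over all calibration batches."""
    lo = min(min(batch) for batch in calib_batches)
    hi = max(max(batch) for batch in calib_batches)
    return lo, hi

def qparams(lo, hi, qmin=-128, qmax=127):
    """Affine (asymmetric) int8 quantization parameters from an observed range."""
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # zero must be exactly representable
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to its int8 code, clamped to the quantized range."""
    return max(qmin, min(qmax, round(x / scale + zero_point)))

# Stand-in for activations recorded on preprocessed BraTS cases.
calib = [[0.1, 2.5, -0.3], [1.7, 0.0, 3.9]]
lo, hi = minmax_observer(calib)
scale, zero_point = qparams(lo, hi)
print(scale, zero_point)
```

The same network calibrated on a different dataset would observe different ranges and end up with different quantization parameters, which is why the preprocessed data directory is part of the tuning command.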
@chris-clem I think we don't need this; @dhaberl implemented an alternative solution for reducing the memory overhead. This should solve most of the problems. I have just now removed it from the main codebase because it was causing some compatibility issues. This should be solved by December 🙏🏼🙏🏽
We don't need this anymore! moosev2 is remarkably faster than the first version, thanks to @Keyn34!
Problem
Inference is a tad bit slow when it comes to large datasets.
Solution
Performance gains can be achieved by using Intel's Neural Compressor: https://github.com/intel/neural-compressor/tree/master/examples/pytorch/image_recognition/3d-unet/quantization/ptq/eager. Intel has already provided an example of how to do this, so we just need to implement it to obtain a lean model (the actual performance gains still need to be verified).
Alternate solution
Bring in a fast resampling function (torch or others...).
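For reference, a minimal dependency-free sketch of what such a resampling function does along one axis (the function name is hypothetical; a real implementation would likely call something like torch.nn.functional.interpolate on the whole volume, on GPU):

```python
def resample_nearest(signal, new_len):
    """Nearest-neighbour resampling of a 1D signal to new_len samples.

    Applied independently per axis, this is the cheapest way to change a
    volume's voxel grid; trilinear interpolation is the smoother variant
    typically used for medical images.
    """
    old_len = len(signal)
    return [signal[min(old_len - 1, int(i * old_len / new_len))]
            for i in range(new_len)]

print(resample_nearest([10, 20, 30, 40], 8))  # upsample 2x
print(resample_nearest([10, 20, 30, 40], 2))  # downsample 2x
```

The speed argument is that this index-mapping runs as one vectorized kernel in torch, whereas per-slice CPU resampling dominates inference time on large datasets.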