Closed. masadcv closed this issue 10 months ago.
There's a quick fix I've been meaning to commit, and then there's optimisation of the PHL message passing, which is a longer job I'm working on in the background. Optimisation has mainly focused on the GPU implementation, with the CPU as a fallback, but I think it is reasonable to expect both to perform well.
Naive first question, but is the C++ code compiled with optimisation on? I can't see anything like -O2 or -O3 in setup.py, but I guess it may come from elsewhere.
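One hedged way to check where such flags might come from: on Unix-like systems, setuptools-based builds generally inherit compiler flags from the Python interpreter's own build configuration unless `extra_compile_args` overrides them. The sketch below only inspects that configuration. The helper name `default_optimisation_flags` is made up for illustration, and the result may be empty on platforms such as Windows where these variables are not set.

```python
import sysconfig


def default_optimisation_flags():
    """Return the -O* flags found in the interpreter's build-time compiler settings."""
    flags = []
    for var in ("OPT", "CFLAGS"):
        # get_config_var returns None when the variable is not defined (e.g. on Windows)
        value = sysconfig.get_config_var(var) or ""
        flags += [tok for tok in value.split() if tok.startswith("-O")]
    return flags


print(default_optimisation_flags())
```

This shows only the inherited defaults. The flags actually passed to the compiler are best confirmed from the verbose build log.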
We compile the C++ extension with torch's setuptools wrapper. I believe this handles these things; off the top of my head, I think it's -O2. PR #2261 implements the quick fix I alluded to earlier. While I have not tested it against SimpleCRF, it now runs at the same order of magnitude as the crf-as-rnn implementation and produces identical (by eye) results. When the JIT system is in, I'll move the PHL over to it and make my optimisations. The main one I've planned is to separate the construction of the lattice from its application. As the CRF iterates over the same PHL filter with the same features, we would then only need to construct it once.
Nice. I guess your comparison against crf-as-rnn is in 2D. I guess crf-as-rnn works on 2D only out of the box, right? It would be worth checking against SimpleCRF in 3D especially.
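The planned split between constructing the filter and applying it can be sketched as follows. This is a minimal illustration, not MONAI's implementation: `PrebuiltFilter` is a hypothetical stand-in that uses a dense Gaussian kernel in place of a real permutohedral lattice, and `mean_field` is a simplified Potts-model update. The point is only that the expensive construction happens once, outside the iteration loop.

```python
import math


class PrebuiltFilter:
    """Hypothetical stand-in for a permutohedral lattice: built once, applied many times."""

    builds = 0  # counts how often the expensive construction step runs

    def __init__(self, features, bandwidth=1.0):
        PrebuiltFilter.builds += 1
        n = len(features)
        # Dense Gaussian kernel with a zero diagonal; a real PHL approximates this more cheaply.
        self.kernel = [
            [
                0.0 if i == j else math.exp(
                    -0.5 * sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                    / bandwidth ** 2
                )
                for j in range(n)
            ]
            for i in range(n)
        ]

    def apply(self, values):
        """Filter per-point label distributions with the stored kernel."""
        n_labels = len(values[0])
        return [
            [sum(w * v[k] for w, v in zip(row, values)) for k in range(n_labels)]
            for row in self.kernel
        ]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary, features, iterations=5, weight=1.0):
    """Potts-model mean-field inference; unary[i][k] is the cost of label k at point i."""
    filt = PrebuiltFilter(features)  # construction: once, outside the loop
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(iterations):
        message = filt.apply(q)  # application: once per iteration
        q = [
            softmax([-u + weight * m for u, m in zip(u_row, m_row)])
            for u_row, m_row in zip(unary, message)
        ]
    return q


# Three 1D points with two labels; the values are made up for illustration.
features = [(0.0,), (0.1,), (5.0,)]
unary = [[0.0, 1.0], [0.6, 0.4], [1.0, 0.0]]
q = mean_field(unary, features)
```

Because the features do not change between iterations, rebuilding the filter inside the loop would repeat identical work on every pass.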
I expect the runtime of SimpleCRF in 2D to be similar to crf-as-rnn.
The crf-as-rnn implementation uses the code from Philipp Krähenbühl for the PHL: https://github.com/sadeepj/crfasrnn_pytorch/blob/master/crfasrnn/permutohedral.h but does the outer loop in Python.
In 2D, SimpleCRF wraps the entire CRF code from Philipp Krähenbühl which includes the same PHL code: https://github.com/HiLab-git/SimpleCRF/tree/master/dependency/densecrf
In 3D, SimpleCRF wraps Kostas Kamnitsas's extension of Philipp Krähenbühl's CRF code: https://github.com/HiLab-git/SimpleCRF/tree/master/dependency/densecrf3d
closing because of inactivity
Describe the bug
I am running MONAI's CRF implementation on CPU on a 3D volume of size (120, 150, 100). It takes 177.5296 sec with MONAI's implementation. The same result can be obtained in 5.8184 sec using SimpleCRF's implementation from: https://github.com/HiLab-git/SimpleCRF
I have setup a test script to replicate this here: https://gist.github.com/masadcv/84f1bc9f505056ea8f4290d14a002d2a
It also seems that MONAI's implementation uses significantly more memory on CPU than SimpleCRF. I'm not sure whether that is expected, but it may be worth investigating if possible.
To Reproduce
Steps to reproduce the behavior:
BUILD_MONAI=1 pip -q install git+https://github.com/Project-MONAI/MONAI#egg=monai
pip install simplecrf nibabel wget
Expected behavior
I expect the two implementations (MONAI CRF vs SimpleCRF) to be in the same ballpark in terms of execution time. At the moment, MONAI's implementation is about 30 times slower (177.5 sec vs 5.8 sec).
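For reference, a minimal timing sketch of the kind that could produce such a comparison. The helper names `best_time` and `slowdown` are hypothetical and are not taken from the linked gist. The two CRF calls themselves are omitted, so only the reported figures are used.

```python
import time


def best_time(fn, *args, repeats=3, **kwargs):
    """Return the shortest wall-clock time in seconds over several calls to fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def slowdown(candidate_seconds, baseline_seconds):
    """How many times slower the candidate is than the baseline."""
    return candidate_seconds / baseline_seconds


# Using the timings reported in this issue (MONAI vs SimpleCRF, CPU, 3D volume).
print(f"{slowdown(177.5296, 5.8184):.1f}x")  # prints 30.5x
```

Taking the best of several repeats reduces noise from other processes, at the cost of longer total run time for slow calls.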
cc: @charliebudd @tvercaut