torrvision / crfasrnn

This repository contains the source code for the semantic image segmentation method described in the ICCV 2015 paper: Conditional Random Fields as Recurrent Neural Networks. http://crfasrnn.torr.vision/

Why is blobs_[0] a blob with batch number = 1, channel = 1, and a channels_ x channels_ matrix? What is channels_ for an image? #131

Closed machanic closed 7 years ago

machanic commented 7 years ago

I noticed that the layer sets up its weight blobs as this->blobs_[0].reset(new Blob<Dtype>(1, 1, channels_, channels_)); this->blobs_[1].reset(new Blob<Dtype>(1, 1, channels_, channels_)); where, according to the code comment, blobs_[0] holds the spatial kernel weights, blobs_[1] the bilateral kernel weights, and blobs_[2] the compatibility matrix. So why is blobs_[0] shaped as batch number = 1, channel = 1, with a channels_ x channels_ matrix?

sadeepj commented 7 years ago

We did this to make the implementation with the BLAS API easier. Essentially, blobs_[0] and blobs_[1] each contain a 21 x 21 matrix (blob dimensions 1 x 1 x 21 x 21), so channels_ here is 21, one per class label.
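To illustrate why a square channels_ x channels_ matrix is convenient, here is a minimal sketch, assuming the filter output is stored as a channels x num_pixels row-major matrix. The function name apply_kernel_weights and the hand-written loop are hypothetical stand-ins for a single BLAS GEMM call; this is not the repository's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one BLAS GEMM call: out = W * Q.
// W is channels x channels (the kernel weight matrix),
// Q is channels x num_pixels (filter output, one row per class label).
// Both matrices are stored row-major in flat vectors.
std::vector<float> apply_kernel_weights(const std::vector<float>& W,
                                        const std::vector<float>& Q,
                                        std::size_t channels,
                                        std::size_t num_pixels) {
  assert(W.size() == channels * channels);
  assert(Q.size() == channels * num_pixels);
  std::vector<float> out(channels * num_pixels, 0.0f);
  for (std::size_t i = 0; i < channels; ++i) {
    for (std::size_t k = 0; k < channels; ++k) {
      const float w = W[i * channels + k];
      for (std::size_t p = 0; p < num_pixels; ++p) {
        out[i * num_pixels + p] += w * Q[k * num_pixels + p];
      }
    }
  }
  return out;
}
```

With the weights held as a full matrix, weighting every class at every pixel is one matrix product, which is the kind of operation a GEMM routine handles directly.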

Only the diagonal entries of blobs_[0] and blobs_[1] are strictly needed, since the off-diagonal entries can also be learned in the compatibility transform stage (blobs_[2]). The diagonal entries of blobs_[0] and blobs_[1] (21 in each) represent the class-specific kernel weights described in Section 4.3 of http://www.robots.ox.ac.uk/~szheng/papers/CRFasRNN.pdf.
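As a rough illustration of what "class-specific kernel weights" means, the sketch below reads only the diagonal of a channels x channels weight matrix and scales each class row of the filter output by its own weight. The name scale_by_class_weights and the row-major channels x num_pixels layout are assumptions made for this example, not taken from the repository.

```cpp
#include <cstddef>
#include <vector>

// Uses only the diagonal of W (channels x channels, row-major):
// row c of Q (one row per class label, num_pixels columns) is
// multiplied by the class-specific weight W[c][c].
// Off-diagonal entries of W are ignored here.
std::vector<float> scale_by_class_weights(const std::vector<float>& W,
                                          const std::vector<float>& Q,
                                          std::size_t channels,
                                          std::size_t num_pixels) {
  std::vector<float> out(Q.size());
  for (std::size_t c = 0; c < channels; ++c) {
    const float w_c = W[c * channels + c];  // diagonal entry for class c
    for (std::size_t p = 0; p < num_pixels; ++p) {
      out[c * num_pixels + p] = w_c * Q[c * num_pixels + p];
    }
  }
  return out;
}
```

When W is purely diagonal, this gives the same result as a full matrix product W * Q, which is why the 21 diagonal entries alone are enough to express one weight per class.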