netw0rkf10w / CRF

Conditional Random Fields
Apache License 2.0

Forward pass issue #4

Closed WeiChihChern closed 2 years ago

WeiChihChern commented 2 years ago

Thanks for sharing the great work.

I have a question regarding the self.crf(x, logits) usage from your tutorial code.

Is x a normalized image tensor, or should it be in the range 0-255?

Also, is the pre-trained CRF available by any chance? Thanks a lot.

netw0rkf10w commented 2 years ago

Hi @WeiChihChern. Thanks for your interest.

I have a question regarding the self.crf(x, logits) usage from your tutorial code.

Is x a normalized image tensor, or should it be in the range 0-255?

Yes, x should be a normalized image tensor. Typically the normalization is done using the standard ImageNet mean and std:

T.Normalize(mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225])
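Concretely, per-channel normalization computes (x - mean) / std on values already scaled to [0, 1] (e.g. by T.ToTensor()). A minimal pure-Python sketch of what T.Normalize does for a single pixel, independent of torchvision:

```python
# ImageNet channel statistics, as used by T.Normalize above.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Normalize one RGB pixel whose values are already in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]

# A pixel equal to the channel means maps to (0, 0, 0).
print(normalize_pixel([0.485, 0.456, 0.406]))  # [0.0, 0.0, 0.0]
```

Note that images stored as 0-255 integers must first be scaled to [0, 1] before this normalization is applied.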

Also, is the pre-trained CRF available by any chance? Thanks a lot.

I am going to release the full segmentation code with all pre-trained weights next week (around March 10th). Would that be too late for you? Not sure if you're working on ECCV, but if you need it urgently I can make an effort and release the pre-trained CRF weights tomorrow together with instructions (if you could wait until next week, though, that would be great). Please let me know.

WeiChihChern commented 2 years ago

@netw0rkf10w Thank you so much for the reply and the information.

I am actually working on a weakly supervised learning project in which I only need the pre-trained weights for the CRF itself, as there is no ground truth for the CRF to learn from. I am not chasing any deadline, but it would be great if you could provide an unofficial link to the CRF weights, together with the CRF parameter setup corresponding to those weights.

In addition, as I mentioned, there is no ground truth for the CRF in my case. I wonder: does the CRF achieve a certain level of performance even when it is untrained? That is, can it still improve prediction masks based on the input colors and spatial information, the way a traditional CRF does?

Thank you for your kind reply.

netw0rkf10w commented 2 years ago

@WeiChihChern If there's no GT then I would suggest first trying the default values for the weights:

crf = CRF.DenseGaussianCRF(classes=21,
                           alpha=160,
                           beta=0.05,
                           gamma=3.0,
                           spatial_weight=1.0,
                           bilateral_weight=1.0,
                           compatibility=1.0,
                           init='potts',
                           solver='fw',
                           iterations=5,
                           params=params)

These values work reasonably well for PASCAL VOC. The pre-trained weights that I will release were obtained by training jointly (end-to-end) with DeepLabv3 or DeepLabv3+, so they might not perform well for your CNN and your task (but who knows?). If you work on another dataset then you would probably need to tune the parameters.
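For intuition about what these defaults control: if alpha, beta, and gamma play the roles of the kernel bandwidths in the standard dense Gaussian CRF pairwise potentials (an assumption based on the parameter names, not confirmed against this repo's source), the affinity between two pixels would look roughly like the following pure-Python sketch:

```python
import math

def dense_crf_kernel(pos_dist, color_dist,
                     alpha=160.0, beta=0.05, gamma=3.0,
                     spatial_weight=1.0, bilateral_weight=1.0):
    """Hypothetical pairwise affinity between two pixels in a dense Gaussian CRF.

    pos_dist   -- Euclidean distance between pixel coordinates
    color_dist -- Euclidean distance between (normalized) pixel colors
    """
    # Appearance (bilateral) kernel: nearby pixels with similar colors.
    appearance = bilateral_weight * math.exp(
        -pos_dist**2 / (2 * alpha**2) - color_dist**2 / (2 * beta**2))
    # Smoothness (spatial) kernel: nearby pixels regardless of color.
    smoothness = spatial_weight * math.exp(-pos_dist**2 / (2 * gamma**2))
    return appearance + smoothness

# Identical, co-located pixels get the maximum affinity (sum of the weights).
print(dense_crf_kernel(0.0, 0.0))  # 2.0
```

Under this reading, a larger alpha or gamma widens the spatial reach of each kernel, while a small beta (0.05, on normalized colors) makes the appearance kernel very sensitive to color differences.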

WeiChihChern commented 2 years ago

Thanks for the suggestion, I will try the default values first. The experiments will be conducted on VOC first, then applied to other datasets. Thanks for the great work; I am now able to use your code for training in my project.

netw0rkf10w commented 2 years ago

Great to hear that! Please do not hesitate to let me know if you encounter any issues with the code. And stay tuned for the pre-trained weights!

WeiChihChern commented 2 years ago

@netw0rkf10w A quick question regarding the output of the CRF layer self.crf(x, logits): does the output of the CRF contain probabilities/likelihoods for each pixel?

And to double check: the logits for the CRF are expected to take values in [0, 1], right?

netw0rkf10w commented 2 years ago

@WeiChihChern In the literature, "logits" typically refers to the raw output of a network before taking softmax. Here I adopted the same terminology.

A quick question regarding the output of the CRF layer self.crf(x, logits): does the output of the CRF contain probabilities/likelihoods for each pixel?

By default, the outputs of self.crf(x, logits) are also logits, taking values in (-inf, +inf). You can view them as log-likelihoods for each pixel. If you want to obtain probabilities, just feed the output to a softmax layer, or alternatively set output_logits=False when defining the CRF layer:

crf = CRF.DenseGaussianCRF(classes=21, output_logits=False, params=params)

In the future I will move the output_logits argument to the forward function to make it more flexible.

And to double check: the logits for the CRF are expected to take values in [0, 1], right?

As mentioned above, the input to the CRF should be logits, not probabilities. (Later I will add an argument to allow probability input.)
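For reference, converting the output logits into per-pixel probabilities by hand (equivalent to appending a softmax layer) can be sketched in pure Python for a single pixel's class scores:

```python
import math

def softmax(logits):
    """Convert a vector of class logits into a probability distribution."""
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Per-pixel class scores in (-inf, +inf) become probabilities summing to 1,
# with the ordering of the classes preserved.
probs = softmax([2.0, 1.0, -1.0])
print(sum(probs))  # 1.0 (up to floating point)
```

In PyTorch this is simply torch.softmax(output, dim=1) over the class dimension of the CRF output.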

netw0rkf10w commented 2 years ago

@WeiChihChern The CRF weights for our best model (DeepLabv3+ Euclidean-FW CRF, 88.0% mIoU on PASCAL VOC) can be found here: best.crf.zip. Using the weights is simple:

params = CRF.FrankWolfeParams(scheme='fixed', stepsize=1.0, regularizer='l2', lambda_=1.0, x0_weight=0.6)
crf = CRF.DenseGaussianCRF(classes=21, solver='fw', iterations=5, params=params)
state_dict = torch.load('best.crf', map_location=lambda storage, loc: storage)
crf.load_state_dict(state_dict)

Please let me know if you encounter any issues or if you have questions. Cheers!

WeiChihChern commented 2 years ago

@netw0rkf10w Big thanks for the response and the weights file. The weights load successfully on my end.

From the paper, I noticed that the Cityscapes dataset was also used in the experiments. It would be great if you could also release the Cityscapes weights (afterward), since I use VOC in my project: using pre-trained VOC weights for a weakly supervised project on VOC itself might not be a good idea in my case.

I have no further questions for now, so I'm closing this issue.

Thanks a lot!