opencv / opencv_contrib

Repository for OpenCV's extra modules
Apache License 2.0

cv2.ximgproc.segmentation.createGraphSegmentation seems to handle float32 images (NumPy) incorrectly #3544

Open vadimkantorov opened 1 year ago

vadimkantorov commented 1 year ago

From the code it seems that the images get cast to float32 ([0.0, 1.0] range, right?) anyway: https://github.com/opencv/opencv_contrib/blob/daaf645151b7afbafabfacf71ae2880cf6fc904e/modules/ximgproc/src/graphsegmentation.cpp#L165, so why does it not work when I pass a float32 image directly?

Does it support only uint8 inputs?

What is the effect of passing float32 and int32 images?

Thanks!

import numpy as np
import cv2

gs = cv2.ximgproc.segmentation.createGraphSegmentation(0.8, 150, 100)

np.random.seed(0)
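# Note: np.random.rand returns float64 values in [0.0, 1.0), despite the variable name.
img_float32 = np.random.rand(200, 300, 3)
# Same data scaled to [0, 255).
img_float32_255 = img_float32 * 255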
img_uint8 = (img_float32 * 255).astype('uint8')
img_int32 = (img_float32 * 255).astype('int32')

print(gs.processImage(img_uint8).max()) # 97
print(gs.processImage(img_float32).max()) # 0
print(gs.processImage(img_float32_255).max()) # 104
print(gs.processImage(img_int32).max()) # 97

Kumataro commented 1 year ago

Hi, this is not an OpenCV bug. Could you please check how the data format is handled?

From the code it seems that the images get cast to float32 ([0.0, 1.0] range, right?) anyway: https://github.com/opencv/opencv_contrib/blob/daaf645151b7afbafabfacf71ae2880cf6fc904e/modules/ximgproc/src/graphsegmentation.cpp#L165, so why does it not work when I pass a float32 image directly?

convertTo() changes the Mat's element type, but it does not rescale the values unless alpha and beta are given.

https://docs.opencv.org/4.8.0/d3/d63/classcv_1_1Mat.html#adf88c60c5b4980e05bb556080916978b

So I think OpenCV runs the segmentation on img_float32 as is, which is an almost entirely black image.

Does it support only uint8 inputs?

No. The OpenCV documentation does not state any such limitation.
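As a rough NumPy analogue of the point above, here is a small sketch. It assumes that `astype` is a fair stand-in for `convertTo()` called without alpha and beta, and the sample pixel values are made up for illustration. A type-only conversion leaves the values where they were, so a [0.0, 1.0] image stays near zero on a 0-255 scale:

```python
import numpy as np

# Hypothetical sample pixels, not taken from the issue.
pixels_uint8 = np.array([0, 128, 255], dtype=np.uint8)
pixels_unit = np.array([0.0, 0.5, 1.0], dtype=np.float64)

# Type-only conversion: analogous to convertTo(dst, CV_32F) with the
# default alpha=1, beta=0. Values are kept, only the dtype changes.
as_float = pixels_uint8.astype(np.float32)
unit_as_float = pixels_unit.astype(np.float32)

# Explicit scaling: analogous to convertTo(dst, CV_32F, 255.0).
unit_scaled = (pixels_unit * 255.0).astype(np.float32)

print(as_float.tolist())       # [0.0, 128.0, 255.0]
print(unit_as_float.tolist())  # [0.0, 0.5, 1.0]
print(unit_scaled.tolist())    # [0.0, 127.5, 255.0]
```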

https://docs.opencv.org/4.8.0/dd/d19/classcv_1_1ximgproc_1_1segmentation_1_1GraphSegmentation.html#a13a3603cb371d740c3c4b01d63553d90

What is the effect of passing float32 and int32 images?

All input images (including 64-bit float images and 32-bit signed/unsigned integer images) are converted to 32-bit float images. Therefore, the number of significant digits is reduced for some images. This effect may be small.
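To make the remark about significant digits concrete, here is a small NumPy sketch. The values are illustrative only, and `astype` is assumed to behave like the internal conversion to 32-bit float. float32 has a 24-bit significand, so integers up to 2**24 are represented exactly while larger ones may be rounded. Pixel values in [0, 255] are not affected:

```python
import numpy as np

# Illustrative int32 values: a typical pixel value, 2**24, and 2**24 + 1.
values_int32 = np.array([255, 2**24, 2**24 + 1], dtype=np.int32)

# Convert to 32-bit float, then back to a wide integer type to compare.
values_float32 = values_int32.astype(np.float32)
round_trip = values_float32.astype(np.int64)

# True where the float32 conversion changed the value.
changed = round_trip != values_int32

print(round_trip.tolist())  # [255, 16777216, 16777216]
print(changed.tolist())     # [False, False, True]
```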

convertTo() sample code

#include <opencv2/core.hpp>
#include <iostream>

int main(void)
{
  cv::Mat p = cv::Mat(1, 1, CV_8UC1);
  p.at<uchar>(0) = 128;
  cv::Mat pF;
  // Only the element type changes here; the value stays 128 (no scaling).
  p.convertTo(pF, CV_32F);
  pF.at<float>(0) += 0.5f;
  cv::Mat p2;
  // Convert the float Mat back to 8-bit.
  pF.convertTo(p2, CV_8U);

  std::cout << "p  = " << p << std::endl;
  std::cout << "pF = " << pF << std::endl;
  std::cout << "p2 = " << p2 << std::endl;

  return 0;
}

$ ./a.out
p  = [128]
pF = [128.5]
p2 = [128]

vadimkantorov commented 1 year ago

Thank you for your detailed response!

I forgot to explain that I needed GraphSegmentation while studying the SelectiveSearch implementation in OpenCV.

The docs for https://docs.opencv.org/4.x/d6/d6d/classcv_1_1ximgproc_1_1segmentation_1_1SelectiveSearchSegmentation.html#details also do not mention the input format, but in https://github.com/opencv/opencv_contrib/blob/master/modules/ximgproc/src/selectivesearchsegmentation.cpp we can see that the histograms are expected to cover the range [0, 255].

As cvtColor is applied in https://github.com/opencv/opencv_contrib/blob/master/modules/ximgproc/src/selectivesearchsegmentation.cpp, I consulted the docs https://docs.opencv.org/4.x/d8/d01/group__imgproc__color__conversions.html#ga397ae87e1288a81d2363b61574eb8cab before posting. They suggest that, by convention, CV_32F images are considered to be in [0.0, 1.0]. And the docs for GraphSegmentation do not mention what format it expects RGB images in: https://docs.opencv.org/4.x/dd/d19/classcv_1_1ximgproc_1_1segmentation_1_1GraphSegmentation.html#a13a3603cb371d740c3c4b01d63553d90

So IMO it would be a great addition if the SelectiveSearchSegmentation (and maybe GraphSegmentation) docs had a note on the expected input range, especially for CV_32F inputs.

Kumataro commented 1 year ago

Hi, thank you for your reply.

First, this feature does not seem to have been actively maintained since 2016, and there is no test code, so I think it is hard to fix aggressively. (I only commented because I was a little worried about the interpretation of convertTo().)

  1. The lack of any restriction on the input image format is probably a bug in the specification.
  2. The missing scaling step (x256) for CV_32F and CV_64F input images is probably a bug in the implementation.
  3. Fixing it would break backward compatibility very slightly. However, the effect may be small.

Also, for now, multiplying the CV_32F image by 255 in the user application is a workaround (as you said, thank you!).

import numpy as np
import cv2

gs = cv2.ximgproc.segmentation.createGraphSegmentation(0.8, 150, 100)

np.random.seed(0)
img_float32 = np.random.rand(200, 300, 3)
img_float32_255 = img_float32 * 255 
print(gs.processImage(img_float32_255).max()) # 104
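The workaround could be wrapped in a small helper along these lines. This is only a sketch: `to_0_255_range` is a hypothetical name, and the rule "a floating-point image whose maximum is at most 1.0 is a [0.0, 1.0] image" is an assumption of this example, not something OpenCV does. Negative values are not handled.

```python
import numpy as np

def to_0_255_range(img):
    """Return a float32 copy of img on a 0-255 scale (heuristic sketch)."""
    img = np.asarray(img)
    is_float = np.issubdtype(img.dtype, np.floating)
    if is_float and img.size > 0 and img.max() <= 1.0:
        # Assumed to be a [0.0, 1.0] image: scale it up explicitly.
        return (img * 255.0).astype(np.float32)
    # Integer images and float images already above 1.0 are only converted.
    return img.astype(np.float32)

np.random.seed(0)
img_unit = np.random.rand(4, 5, 3)            # float64 in [0.0, 1.0)
img_uint8 = (img_unit * 255).astype('uint8')  # already on a 0-255 scale

scaled = to_0_255_range(img_unit)
unchanged = to_0_255_range(img_uint8)

print(scaled.dtype, float(scaled.max()) > 1.0)
print(unchanged.dtype, bool((unchanged == img_uint8).all()))
```

The result can then be passed to `gs.processImage(...)` as in the snippet above. Note that the heuristic would misclassify a very dark float image that is already on a 0-255 scale, so an explicit flag may be safer in real code.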
Kumataro commented 1 year ago

Hello, I made a pull request, and I believe it will fix this issue.

Also, could you please tell me whether supporting CV_8S/CV_16S/... is necessary? Those are not included in this PR. Such images (and CV_32F/64F ones) sometimes contain negative pixel values, which are hard to map into [0, 255].

vadimkantorov commented 1 year ago

I don't mind if the fix is just a clear mention in the docs of how exactly CV_32F inputs are processed currently :)

About other dtypes - I don't know. Again, IMO maybe just a mention in the docs of how exactly they are processed currently is enough...

Kumataro commented 1 year ago

The pull request has been changed to WIP status because it doesn't meet your needs.

Kumataro commented 11 months ago

Hello, I have improved the patch with additional changes and removed the WIP status, in time for OpenCV 4.9.0. I think this will probably accomplish the purpose.