ScanTailor-Advanced / scantailor-advanced

ScanTailor Advanced merges the features of the ScanTailor Featured and ScanTailor Enhanced versions and adds new features and fixes.
GNU General Public License v3.0

[Feature request] binarizeEdgePlus #47

Closed: zvezdochiot closed this issue 1 year ago

zvezdochiot commented 1 year ago

Hi @vigri.

/**
 * \brief Image binarization using EdgePlus local/global thresholding method.
 *
 * EdgePlus, zvezdochiot 2023. "Adaptive/global document image binarization".
 */
BinaryImage binarizeEdgePlus(const QImage& src, QSize windowSize, double k = 0.34);
BinaryImage binarizeEdgePlus(const QImage& src, const QSize windowSize, const double k) {
  if (windowSize.isEmpty()) {
    throw std::invalid_argument("binarizeEdgePlus: invalid windowSize");
  }

  if (src.isNull()) {
    return BinaryImage();
  }

  QImage gray(toGrayscale(src));
  const int w = gray.width();
  const int h = gray.height();

  IntegralImage<uint32_t> integralImage(w, h);

  uint8_t* grayLine = gray.bits();
  const int grayBpl = gray.bytesPerLine();

  for (int y = 0; y < h; ++y, grayLine += grayBpl) {
    integralImage.beginRow();
    for (int x = 0; x < w; ++x) {
      const uint32_t pixel = grayLine[x];
      integralImage.push(pixel);
    }
  }

  const int windowLowerHalf = windowSize.height() >> 1;
  const int windowUpperHalf = windowSize.height() - windowLowerHalf;
  const int windowLeftHalf = windowSize.width() >> 1;
  const int windowRightHalf = windowSize.width() - windowLeftHalf;

  grayLine = gray.bits();
  for (int y = 0; y < h; ++y) {
    const int top = std::max(0, y - windowLowerHalf);
    const int bottom = std::min(h, y + windowUpperHalf);  // exclusive
    for (int x = 0; x < w; ++x) {
      const int left = std::max(0, x - windowLeftHalf);
      const int right = std::min(w, x + windowRightHalf);  // exclusive
      const int area = (bottom - top) * (right - left);
      assert(area > 0);  // because windowSize > 0 and w > 0 and h > 0
      const QRect rect(left, top, right - left, bottom - top);
      const double windowSum = integralImage.sum(rect);

      const double rArea = 1.0 / area;
      const double mean = windowSum * rArea;
      const double origin = grayLine[x];
      // edge = I / blur, shifted by -0.5; the ratio runs from near 0.0 to above 1.0,
      // so edge is 0.5 wherever the pixel equals its local mean
      const double edge = (origin + 1) / (mean + 1) - 0.5;
      // edgeplus = I * edge, mean value = 0.5 * mean(I)
      const double edgeplus = origin * edge;
      // return k * edgeplus + (1 - k) * I
      double retval = k * edgeplus + (1.0 - k) * origin;
      // clamp the value to [0, 255]
      retval = (retval < 0.0) ? 0.0 : (retval < 255.0) ? retval : 255.0;
      grayLine[x] = static_cast<uint8_t>(retval);
    }
    grayLine += grayBpl;
  }
  return BinaryImage(src, BinaryThreshold::otsuThreshold(gray));
}  // binarizeEdgePlus
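
The listing above obtains each window sum in constant time from an integral image. As a rough illustration of that idea only, here is a minimal standalone sketch without Qt. The names `SummedAreaTable` and `windowMean` are hypothetical and are not ScanTailor's `IntegralImage` API. The window clamping copies the `top`/`bottom`/`left`/`right` logic from the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an integral image (not ScanTailor's IntegralImage).
// Entry (x, y) holds the sum of all pixels in [0, x) x [0, y).
class SummedAreaTable {
 public:
  SummedAreaTable(const std::vector<std::uint8_t>& pixels, int w, int h)
      : m_stride(w + 1), m_sums(static_cast<std::size_t>(w + 1) * (h + 1), 0) {
    assert(static_cast<std::size_t>(w) * h == pixels.size());
    for (int y = 0; y < h; ++y) {
      std::uint32_t rowSum = 0;  // running sum of the current row
      for (int x = 0; x < w; ++x) {
        rowSum += pixels[static_cast<std::size_t>(y) * w + x];
        at(x + 1, y + 1) = at(x + 1, y) + rowSum;
      }
    }
  }

  // Sum over [left, right) x [top, bottom); upper bounds are exclusive.
  std::uint32_t sum(int left, int top, int right, int bottom) const {
    return at(right, bottom) - at(left, bottom) - at(right, top) + at(left, top);
  }

 private:
  std::uint32_t& at(int x, int y) {
    return m_sums[static_cast<std::size_t>(y) * m_stride + x];
  }
  std::uint32_t at(int x, int y) const {
    return m_sums[static_cast<std::size_t>(y) * m_stride + x];
  }

  int m_stride;
  std::vector<std::uint32_t> m_sums;
};

// Mean of the window around (x, y), clipped to the image as in the listing.
inline double windowMean(const SummedAreaTable& table, int w, int h,
                         int x, int y, int winW, int winH) {
  const int lowerHalf = winH >> 1;
  const int upperHalf = winH - lowerHalf;
  const int leftHalf = winW >> 1;
  const int rightHalf = winW - leftHalf;
  const int top = std::max(0, y - lowerHalf);
  const int bottom = std::min(h, y + upperHalf);  // exclusive
  const int left = std::max(0, x - leftHalf);
  const int right = std::min(w, x + rightHalf);   // exclusive
  const int area = (bottom - top) * (right - left);
  assert(area > 0);
  return static_cast<double>(table.sum(left, top, right, bottom)) / area;
}
```

Near the image border the window is clipped, so the mean is taken over fewer pixels; the division by `area` rather than by the nominal window size accounts for that.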

See also:

ghost commented 1 year ago

@zvezdochiot Thanks for the issue and the pull request. Could you please explain what binarizeEdgePlus is?

Oh, and sorry for taking a long time to reply.

zvezdochiot commented 1 year ago

Hi @lightsilverberryfox.

I deliberately included illustrations in the PR description.

This PR contains two thresholds from a special class: a global threshold applied to a modified image. Because there are no rules for how the original image may be modified, thresholds of this class can be given arbitrary properties that differ from those of classical thresholds. In this PR, one threshold emulates blending contours into the original image; the second curves the color scale and enhances the local gradient. Depending on the parameters, the results may differ slightly from Otsu or differ greatly, and in a certain sense they can "borrow" the properties of local thresholds.

These two thresholds are difficult to describe, but one thing is certain: if you want to make a black-and-white book regardless of the source material, these two thresholds will give you the least bad result.
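
Reading the EdgePlus listing in this issue as the contour-blending threshold, its per-pixel arithmetic can be sketched in isolation. This is only an illustration under that assumption: the helper name `edgePlusPixel` is hypothetical and not part of ScanTailor, and the local mean is passed in directly instead of being computed from a window.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of ScanTailor): the EdgePlus mix for one
// gray value `origin` and the mean `mean` of its surrounding window.
inline std::uint8_t edgePlusPixel(double origin, double mean, double k) {
  assert(origin >= 0.0 && mean >= 0.0);
  // Ratio of the pixel to its local mean, shifted by -0.5.
  // On a flat area (origin == mean) this is exactly 0.5.
  const double edge = (origin + 1.0) / (mean + 1.0) - 0.5;
  // Scale the pixel by the edge term: pixels darker than their
  // neighbourhood are pushed further toward black.
  const double edgeplus = origin * edge;
  // Blend the edge-weighted value with the original; k = 0 keeps the original.
  const double mixed = k * edgeplus + (1.0 - k) * origin;
  // Clamp to [0, 255] before narrowing to 8 bits.
  return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, mixed)));
}
```

With `k = 0.5`, a pixel of 100 on a flat area with mean 100 becomes 75, while a pixel of 50 against the same mean becomes 25. The darker pixel loses proportionally more, which widens the gap that a global threshold such as Otsu then has to separate.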

Additional illustration: comparison of Otsu and EdgePlus (ws:5 c:0.99) [image]

mara004 commented 1 year ago

Just for cross-reference, the linked PR is #48. Sorry for interrupting.