DanBloomberg / leptonica

Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The official github repository for Leptonica is: danbloomberg/leptonica. See leptonica.org for more documentation.

When trying to load TIFF images with pixRead it always fails #657

Closed Sicos1977 closed 1 year ago

Sicos1977 commented 1 year ago

I built Leptonica 1.83 with TIFF support.


But when I try to read a TIFF with pixRead, it always returns a null pointer (I'm using Leptonica from a C# .NET project). With version 1.82 it works without any problems.


zdenop commented 1 year ago

Can you also share the TIFF file?

Sicos1977 commented 1 year ago

You can use any of these tiff files, they all have the same problem --> https://github.com/Sicos1977/TesseractOCR/tree/master/TesseractOCR.Tests/Data/Conversion

zdenop commented 1 year ago

C code works for me without a problem:

#include <leptonica/allheaders.h>

int main(int argc, char **argv) {
  char *filein;
  PIX *pixs;
  l_ok ret;
  l_int32  LeptMsgSeverity = 1;

  if (argc != 2)
     return fprintf(stderr," Syntax:  %s filein", argv[0]);

  filein = argv[1];

  if ((pixs = pixRead(filein)) == NULL) {
    ERROR_INT("pix not made", __func__, 1);
  } else {
    fprintf(stderr, "Leptonica loaded: %s\n", filein);
  }

  ret = pixWrite("pixs.png", pixs, IFF_PNG);
  if (ret > 0)
    ERROR_INT("ret of pixWrite: ", __func__, ret);
  pixDestroy(&pixs);
  return ret;
}

compiled with (MSVC 2019, Windows 10 64bit; libtiff 4.4.0):

cl test_tiff.c -I..\include ..\lib\leptonica-1.83.0.lib

For example, test_tiff.exe photo_palette_4bpp.tif creates the correct PNG output. This looks like a problem with wrapping Leptonica in C# .NET.

Sicos1977 commented 1 year ago

Weird, because 1.82 works without any problems. Is there a way to see whether the TIFF library is compiled into the output file?

zdenop commented 1 year ago

Well, you should see an error message (in the console/terminal) if you try to open an unsupported image format. Also, check the output of the Leptonica function getImagelibVersions.

You can also try to open an image in a format that does not need an external library (pbm, ppm, bmp) and save it as TIFF.

vsolominov commented 1 year ago

Images are saved in tiff format correctly. Versions of built-in libraries in leptonica-1.83.0:

I think this happens because the file (e.g. photo_rgb_32bpp.tif) is not actually a TIFF but a BMP.


According to commit d4ab740f6c0d40fe12ddc15c9ceba1f614711665, Leptonica no longer supports reading 32 bpp BMP images.

Sicos1977 commented 1 year ago

I checked one of the files with a hex editor and it looks like a TIFF to me. As far as I know, the first few bytes of a little-endian TIFF image are always II.


49 49 2A 00 (little-endian) | II*␀ | 0 | tif, tiff | Tagged Image File Format (TIFF)
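A file's extension can disagree with its contents, so checking the leading bytes is more reliable than trusting the name. Below is a minimal sketch of such a check using only the C standard library. The names sig_kind, classify_signature and classify_file are hypothetical and not part of Leptonica. It recognizes only the classic TIFF headers (II*␀ and its big-endian counterpart MM␀*) and the BMP header (BM).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names for illustration; not part of Leptonica. */
typedef enum { SIG_UNKNOWN = 0, SIG_TIFF_LE, SIG_TIFF_BE, SIG_BMP } sig_kind;

/* Classify a buffer by its leading magic bytes. */
sig_kind classify_signature(const unsigned char *buf, size_t len)
{
    static const unsigned char tiff_le[4] = {0x49, 0x49, 0x2A, 0x00}; /* "II*\0" */
    static const unsigned char tiff_be[4] = {0x4D, 0x4D, 0x00, 0x2A}; /* "MM\0*" */

    assert(buf != NULL);

    if (len >= 4 && memcmp(buf, tiff_le, 4) == 0)
        return SIG_TIFF_LE;
    if (len >= 4 && memcmp(buf, tiff_be, 4) == 0)
        return SIG_TIFF_BE;
    if (len >= 2 && buf[0] == 'B' && buf[1] == 'M')
        return SIG_BMP;
    return SIG_UNKNOWN;
}

/* Read the first four bytes of a file and classify them.
 * Returns SIG_UNKNOWN if the file cannot be opened. */
sig_kind classify_file(const char *path)
{
    unsigned char head[4] = {0};
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return SIG_UNKNOWN;
    n = fread(head, 1, sizeof head, fp);
    fclose(fp);
    return classify_signature(head, n);
}
```

Running classify_file over every file in the test directory would flag any .tif file whose header is really BM.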

vsolominov commented 1 year ago

Yes, this is true for all files except the one I mentioned above (photo_rgb_32bpp.tif).

zdenop commented 1 year ago

When I run test_tiff.exe photo_rgb_32bpp.tif, I see these messages:

Error in pixReadMemBmp: 32 bpp rgba input data is not supported
Error in pixReadStream: bmp: no pix returned
Error in pixRead: pix not read
Error in main: pix not made
Error in pixWrite: pix not defined

Sicos1977 commented 1 year ago

The only thing I find weird is that all files seem to work when using version 1.82. I have to dive deeper into this to see if I did something wrong but that will be next week because I'm busy with another project at the moment.

zdenop commented 1 year ago

  1. As vsolominov already mentioned, photo_rgb_32bpp.tif is a BMP file with the wrong extension. IrfanView reports the same thing.
  2. Leptonica 1.82 reads it, but when you run my test code, the PNG output does not have the correct colors.
  3. Leptonica 1.83 refuses to load it. The reason is explained in this commit: https://github.com/DanBloomberg/leptonica/commit/d4ab740f6c0d40fe12ddc15c9ceba1f614711665

Sicos1977 commented 1 year ago

Okay, thanks, I missed that one.