DanBloomberg / leptonica

Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The official github repository for Leptonica is: danbloomberg/leptonica. See leptonica.org for more documentation.

pixWriteMemJpeg #653

Open bkimman opened 1 year ago

bkimman commented 1 year ago

The function pixWriteMemJpeg uses open_memstream on systems where this is available. On other systems, the function writes to a temporary file and then reads that back in.
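For readers unfamiliar with the open_memstream approach, here is a minimal sketch of the shim pattern described above. It is not Leptonica's code: write_stream is a hypothetical stand-in for a stream writer such as pixWriteStreamJpeg, and write_mem is a hypothetical memory entry point that reuses it through a memory-backed FILE *.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* exposes open_memstream() on POSIX systems */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stream writer: it only knows about FILE *. */
static int write_stream(FILE *fp, const unsigned char *src, size_t n)
{
    return fwrite(src, 1, n, fp) == n ? 0 : 1;
}

/*
 * Hypothetical memory shim: open a memory-backed stream, reuse the
 * stream writer, and return the malloc'd buffer to the caller, who
 * must free() it. Returns 0 on success, 1 on error.
 */
int write_mem(unsigned char **pdata, size_t *psize,
              const unsigned char *src, size_t n)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *fp;
    int ret;

    if (!pdata || !psize) return 1;
    *pdata = NULL;
    *psize = 0;
    if ((fp = open_memstream(&buf, &len)) == NULL) return 1;
    ret = write_stream(fp, src, n);
    /* buf and len are only guaranteed valid after fflush() or fclose() */
    if (fclose(fp) != 0) ret = 1;
    if (ret) {
        free(buf);
        return 1;
    }
    *pdata = (unsigned char *)buf;
    *psize = len;
    return 0;
}
```

The point of the pattern is that all encoding logic stays in the stream writer; the memory entry point only sets up and tears down the stream.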

libjpeg natively provides a 'memory' destination and 'memory' source. The libjpeg.txt file in the libjpeg source code provides details of this.

For compression, it provides jpeg_mem_dest which, besides the compression object, takes two arguments: unsigned char **outbuffer and unsigned long *outsize

The 2nd argument's data type (unsigned long) is different from the one used for the size in pixWriteMemJpeg, which is size_t.
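The type difference matters because unsigned long and size_t do not have the same width everywhere (on 64-bit Windows, unsigned long is 32 bits while size_t is 64 bits), so a size_t * cannot simply be cast to unsigned long *. A minimal sketch of one way to bridge the two, using a local variable of the encoder's type. Here encode_to_mem is a hypothetical stand-in for a libjpeg-style memory destination, and the four payload bytes are placeholders, not a valid image.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an encoder that, like a libjpeg memory
 * destination, reports its output as a malloc'd buffer plus an
 * unsigned long size.
 */
static void encode_to_mem(unsigned char **outbuffer, unsigned long *outsize)
{
    static const unsigned char payload[] = { 0xff, 0xd8, 0xff, 0xd9 };

    *outsize = 0;
    *outbuffer = malloc(sizeof payload);
    if (*outbuffer) {
        memcpy(*outbuffer, payload, sizeof payload);
        *outsize = (unsigned long)sizeof payload;
    }
}

/* Hypothetical entry point that exposes the size as size_t. */
int write_mem_sized(unsigned char **pdata, size_t *psize)
{
    unsigned char *buf = NULL;
    unsigned long len = 0;  /* local of the type the encoder expects */

    if (!pdata || !psize) return 1;
    encode_to_mem(&buf, &len);
    if (!buf) return 1;
    *pdata = buf;
    *psize = (size_t)len;   /* convert the value, never the pointer */
    return 0;
}
```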

I tried out the compression routine and it works fine. I copied the existing pixWriteStreamJpeg function, changed its arguments to take pdata and psize, and replaced jpeg_stdio_dest with jpeg_mem_dest.

If you could let me know how you would like this to be implemented, I could do that and submit it.

Thanks K

DanBloomberg commented 1 year ago

Thank you for offering.

At present, all recent versions of Linux, Android, macOS and iOS use open_memstream. Only Windows does not. I am aware that the tiff library has memory operations for compression and decompression. I checked and these seem to be available from 8.0 onward. Here are the issues as I see them:

(1) We still have to support older versions of libtiff that don't have these functions.

(2) We don't want to have two large functions for reading jpeg to raster (one from stream, one from memory), and ditto for writing. Basically, we don't want to replicate a lot of code.

With respect to (2), as currently set up, the memory functions are shims that call the stream functions, so all the details are in the stream functions. I am having trouble visualizing how we can satisfy (1) and (2).

bkimman commented 1 year ago

Hi Dan

Appreciate the challenges you mention.

Could we define two structures like this:

    struct jpeg_read_io_options {
        int kind;  // File stream or Memory
        FILE *fp;
        unsigned char *buffer;
        size_t buffer_size;
    };

    struct jpeg_write_io_options {
        int kind;
        FILE *fp;
        unsigned char *buffer;
        size_t buffer_size;
    };

I chose not to use a union, but I guess that is an option as well if you prefer that.

We move the main function with all the code to static functions which accept this option structure, and based on the 'kind' element, we call the appropriate jpeg_*_dest or jpeg_*_src functions.

The current 'stream' entry points could prepare this structure and call the new internal function.

The current 'mem' entry points:

Where open_memstream is available, they create one and set up the option structure accordingly.

Where open_memstream is not available, they check the version of the jpeg library. If it supports a memory source/destination, they prepare the structure accordingly. Otherwise, they fall back on the current implementation of opening a temporary file and set up the structure accordingly.
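The last fallback in that chain (temporary file) can be sketched as follows. This is an illustration under assumptions, not the library's implementation: write_stream is a hypothetical stand-in for the stream writer, and write_mem_via_tmpfile is a hypothetical name. It uses only standard C, so it works where neither open_memstream nor a memory destination is available.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stream writer: it only knows about FILE *. */
static int write_stream(FILE *fp, const unsigned char *src, size_t n)
{
    return fwrite(src, 1, n, fp) == n ? 0 : 1;
}

/*
 * Write through a temporary file, then read the bytes back into a
 * malloc'd buffer owned by the caller. Returns 0 on success, 1 on error.
 */
int write_mem_via_tmpfile(unsigned char **pdata, size_t *psize,
                          const unsigned char *src, size_t n)
{
    FILE *fp;
    long len;
    unsigned char *buf = NULL;
    int ret = 1;

    if (!pdata || !psize) return 1;
    *pdata = NULL;
    *psize = 0;
    if ((fp = tmpfile()) == NULL) return 1;  /* removed automatically on close */

    if (write_stream(fp, src, n) != 0) goto done;
    if ((len = ftell(fp)) < 0) goto done;    /* bytes written so far */
    rewind(fp);                              /* switch from writing to reading */
    if ((buf = malloc(len > 0 ? (size_t)len : 1)) == NULL) goto done;
    if (fread(buf, 1, (size_t)len, fp) != (size_t)len) goto done;

    *pdata = buf;
    *psize = (size_t)len;
    buf = NULL;                              /* ownership passed to the caller */
    ret = 0;
done:
    free(buf);
    fclose(fp);
    return ret;
}
```

The extra disk round trip is the cost that a native memory destination would avoid.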

K

DanBloomberg commented 1 year ago

Yes, that works because the file handle in the current stream interfaces is only used to call the appropriate jpeg_*_src and jpeg_*_dest functions.

A data structure is not needed, because the two new static functions would take all 3 args, covering both the stream and the memory cases. We don't need the selector kind either, because we can infer it from the null pointers passed for the args that are not involved.
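A rough sketch of that idea, with the destination inferred from which arguments are non-NULL. All names here (write_general, write_stream, write_mem) are hypothetical, and the byte copy stands in for the real encoding work; in the actual code the branch would instead choose between the stdio and memory destination managers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Single internal writer taking all 3 destination args.
 *   fp != NULL               -> stream destination
 *   pdata, psize != NULL     -> memory destination
 * Exactly one destination must be given. Returns 0 on success, 1 on error.
 */
static int write_general(FILE *fp, unsigned char **pdata, size_t *psize,
                         const unsigned char *src, size_t n)
{
    int to_stream = (fp != NULL);
    int to_mem = (pdata != NULL && psize != NULL);
    unsigned char *buf;

    if (to_stream == to_mem) return 1;  /* none or both: ambiguous */

    /* ... the shared setup and encoding code would live here ... */

    if (to_stream)
        return fwrite(src, 1, n, fp) == n ? 0 : 1;

    if ((buf = malloc(n ? n : 1)) == NULL) return 1;
    if (n) memcpy(buf, src, n);
    *pdata = buf;                        /* caller must free() */
    *psize = n;
    return 0;
}

/* Public entry points become thin shims over the single writer. */
int write_stream(FILE *fp, const unsigned char *src, size_t n)
{
    return write_general(fp, NULL, NULL, src, n);
}

int write_mem(unsigned char **pdata, size_t *psize,
              const unsigned char *src, size_t n)
{
    return write_general(NULL, pdata, psize, src, n);
}
```

Compared with an options structure carrying a kind field, this keeps the argument lists flat at the cost of an extra validity check at the top of the internal function.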

We currently have 4 ways this issue is handled in the library:

This is the fifth:

The changes are a bit involved. I'll give this a try in the next week or so.

Dan