tvn-cosine / tesseract.net

a .net wrapper for Tesseract
GNU General Public License v3.0

How do I convert Bitmap to Pix? #10

Closed by whatohyou 6 years ago

whatohyou commented 6 years ago

Long story short, I capture the screen as a Bitmap, read some pixels, and cut and paste some pixels. Then I want to run OCR on the result with tvn-cosine's tesseract.net. However, since it only accepts a Leptonica.Pix, I have to save the bitmap to a TIFF file on the hard disk before calling api.setimage(temp.tif); ... That isn't very efficient.

MemoryStream byteStream = new MemoryStream();
Bmp.Save(byteStream, System.Drawing.Imaging.ImageFormat.Tiff);
using (TessBaseAPI api = new TessBaseAPI(dataPath, language, OcrEngineMode.DEFAULT, PageSegmentationMode.AUTO))
{
    api.SetImage(byteStream); // it will not work. It says you can't convert memory stream to pix.
}

People were having a similar problem in https://github.com/charlesw/tesseract/issues/61 where charlesw made a pixconverter function that converts a Bitmap to a Pix. When I tried it, the compiler said that tesseract.pix is not leptonica.pix ...

Is there a solution? How do I convert a Bitmap to a Leptonica.Pix in memory?

whatohyou commented 6 years ago

http://tpgit.github.io/Leptonica/bmpio_8c_source.html#l00057
It seems there is source code that reads a BMP from memory into a Leptonica.Pix, but I don't know how to use it from C#.

tvn-cosine commented 6 years ago

We have not added in-memory examples to the wiki yet. Please use this static class:

https://github.com/tvn-cosine/leptonica.net/blob/master/leptonica.net/Implementations/BmpIO.cs

whatohyou commented 6 years ago

Pix pix = BmpIO.pixReadMemBmp(Bmp, Bmp.Size);

Doesn't seem to work. How should I do it?

The source file says "Read/write to memory [only on linux]". Does it only work on Linux? Is there a Windows version?

tvn-cosine commented 6 years ago

We are adding an item for bitmap support extension methods.

tr4nquility commented 6 years ago

Hi @tvn-cosine, any update on this enhancement? I also need TessBaseAPI to read an image from either an object or a memory stream. Perhaps you can show us how to use the BmpIO.pixReadMemBmp() method in the meantime? Thank you.

CSharpReaper commented 6 years ago

Hi, you can use it in this way for a stream.

var image = Image.FromStream(stream);
var ms = new MemoryStream();
image.Save(ms, ImageFormat.Bmp);
var bytes = ms.ToArray();
Pix pix;
fixed (byte* ptr = bytes)
{
    // Pass the pinned BMP buffer and its length to the native reader.
    pix = BmpIO.pixReadMemBmp((IntPtr)ptr, (IntPtr)bytes.Length);
}

Note: Make sure your project allows unsafe code!

tvn-cosine commented 6 years ago

@CSharpReaper, thanks for the code. This has been implemented in the latest releases of the leptonica NuGet packages.

var pix = Pix1.pixReadFromMemoryStream(stream); // stream is a System.IO.Stream

https://www.nuget.org/packages/leptonica.net/
https://www.nuget.org/packages/tesseract.net/
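
For anyone landing here from the original question, here is a minimal sketch of how the pieces in this thread might fit together for a Bitmap that is already in memory. It only uses Pix1.pixReadFromMemoryStream as quoted above. The Leptonica namespace, the Pix return type, and the BitmapToPix helper name are assumptions for illustration and are not confirmed by the thread, so adjust them to whatever your package version exposes.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using Leptonica; // assumed namespace of the leptonica.net package

static class BitmapToPix
{
    // Hypothetical helper, not part of leptonica.net.
    // Encodes the Bitmap as BMP into a MemoryStream and hands the stream
    // to Pix1.pixReadFromMemoryStream, so nothing is written to disk.
    public static Pix Convert(Bitmap bmp)
    {
        using (var ms = new MemoryStream())
        {
            bmp.Save(ms, ImageFormat.Bmp);

            // Save leaves the stream positioned at its end; rewind so a
            // reader that starts at the current position sees the data.
            ms.Position = 0;

            // Return type assumed to be Pix, as with BmpIO.pixReadMemBmp.
            return Pix1.pixReadFromMemoryStream(ms);
        }
    }
}
```

The returned Pix could then be passed to api.SetImage(pix) inside the using (TessBaseAPI api = ...) block from the first post. Whether and when the Pix needs to be disposed is not covered in this thread, so check the package documentation for that.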