Closed GoogleCodeExporter closed 9 years ago
a) what platform is this, Linux?
b) is this streaming to stdout, e.g. tesseract input.tif - pdf > output.pdf
c) if yes, do you also get it with other formats, e.g. tesseract input.tif - hocr > output.hocr
It's quite possible we can "fix" this by closing the stdout stream when
we finish writing. This will have the benefit of making it impossible
for someone to accidentally stream multiple output formats to stdout
and cause silent data corruption.
Not sure where the code is, let me spend a couple minutes checking.
PS. Just curious, how did you even notice this?
Original comment by breidenb...@gmail.com
on 28 Jul 2015 at 7:01
Yeah, it's right here.
https://github.com/tesseract-ocr/tesseract/blob/master/api/renderer.cpp#L33
The original idea of not closing stdout after we finish with it was
introduced by Zdenko back on Dec 23, 2012. I don't know why. Zdenko,
do you remember what you were thinking about?
https://github.com/tesseract-ocr/tesseract/commit/4812fac33e25f0b384d473b597e93508725ce058
Original comment by breidenb...@gmail.com
on 28 Jul 2015 at 7:17
IMO the reporter does not use stdout ("On a different console, monitor the
result")...
Regarding closing stdout: AFAIK, if we call fclose(stdout) (especially
outside of main), the program will no longer be able to write to stdout
(e.g. warnings, some info) and it will crash. So fclose(stdout) is not
considered a wise action.
Original comment by zde...@gmail.com
on 29 Jul 2015 at 6:35
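One way to reconcile the two positions above is to flush stdout instead of closing it. The following is only a minimal sketch of that idea, not Tesseract's code: `SketchOpenOutput` and `SketchFinishOutput` are hypothetical names, and treating "-" as "write to stdout" is an assumption based on the command line quoted earlier in this thread.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper, not part of Tesseract: map an output name to a stream.
// Assumption: "-" selects stdout, as in `tesseract input.tif - pdf`.
std::FILE* SketchOpenOutput(const char* name) {
  if (std::strcmp(name, "-") == 0) return stdout;
  return std::fopen(name, "wb");
}

// Hypothetical helper: finish writing to a stream.
// Regular files are closed. stdout is only flushed, so later messages
// can still be written to it. Returns true on success.
bool SketchFinishOutput(std::FILE* out) {
  if (out == nullptr) return false;
  if (out == stdout) return std::fflush(out) == 0;
  return std::fclose(out) == 0;
}
```

Flushing makes the data visible to whatever reads the pipe without invalidating the stream. It does not, by itself, prevent two output formats from being written to stdout in the same run.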
a) I tried this on Linux. Originally I found this on Windows with the tess4j
Java wrapper. But in order to confirm that the problem is not in the wrapper, I
tried it on Linux.
b) No, I am writing to a file. Here is my command:
tesseract tesseract-3.04.00/testing/eurotext.png result --tessdata-dir tesseract-ocr -c tessedit_create_pdf=true
It produces file result.pdf
c) I am only interested in pdf, so I have not tried other formats.
I have an application where the user can work on several images. We want to
provide the ability to create a searchable PDF from an image. The problem
becomes obvious when the user creates a PDF and then tries to open it: opening
fails. The produced PDF can be opened correctly only after the user closes the
application.
I now see that the problem is that the renderer's destructor is called when the
main function is about to return.
Original comment by gpapadop73
on 29 Jul 2015 at 7:09
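The behaviour described above (a file that only becomes readable once the renderer is destroyed) can be illustrated with a small sketch. This is not Tesseract's renderer: `SketchRenderer` and `SketchHasTrailer` are hypothetical, and the "%PDF-sketch" header is a placeholder rather than real PDF output. The only point is that the end marker is written in the destructor.

```cpp
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical stand-in for a PDF renderer; not Tesseract's actual class.
class SketchRenderer {
 public:
  explicit SketchRenderer(const std::string& path)
      : out_(std::fopen(path.c_str(), "wb")) {
    if (out_ != nullptr) std::fputs("%PDF-sketch\n", out_);
  }
  SketchRenderer(const SketchRenderer&) = delete;
  SketchRenderer& operator=(const SketchRenderer&) = delete;

  void AddPage(const std::string& text) {
    if (out_ != nullptr) std::fprintf(out_, "%s\n", text.c_str());
  }

  // The end marker is written, and the file closed, only on destruction.
  ~SketchRenderer() {
    if (out_ != nullptr) {
      std::fputs("%%EOF\n", out_);
      std::fclose(out_);
    }
  }

 private:
  std::FILE* out_;
};

// Returns true if the file at `path` currently contains the end marker.
bool SketchHasTrailer(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  std::string data((std::istreambuf_iterator<char>(in)),
                   std::istreambuf_iterator<char>());
  return data.find("%%EOF") != std::string::npos;
}
```

If such an object is created with `new` and never deleted, the file on disk stays incomplete for as long as the process runs, which matches the symptom of a PDF that opens only after the application exits.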
The problem happened because the Java code never called
TessDeleteResultRenderer. The tricky part was that adding this call did not
solve the problem. The reason was the
delete[] renderer;
instead of
delete renderer;
which you fixed in file api/capi.cpp
So after getting your source from label 3.04.01dev and fixing the java wrapper,
it works fine.
Thank you
Original comment by gpapadop73
on 29 Jul 2015 at 2:00
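For readers unfamiliar with the distinction: an object allocated with scalar `new` must be released with scalar `delete`, and using `delete[]` on it is undefined behavior in C++. The sketch below shows a matching create/delete pair. The names are hypothetical and only loosely modeled on the C API pattern discussed here; it is not the code from api/capi.cpp.

```cpp
// Counts destructor calls so the effect of the delete can be observed.
static int g_sketch_destroyed = 0;

// Hypothetical stand-in for a renderer type; not Tesseract's class.
struct SketchResultRenderer {
  virtual ~SketchResultRenderer() { ++g_sketch_destroyed; }
};

// Hypothetical C-style wrapper: allocates a single object with scalar new.
SketchResultRenderer* SketchCreateRenderer() {
  return new SketchResultRenderer();
}

// Hypothetical C-style wrapper: releases the object with scalar delete.
// Writing `delete[] renderer;` here would be undefined behavior, because
// the pointer did not come from `new[]`.
void SketchDeleteRenderer(SketchResultRenderer* renderer) {
  delete renderer;  // runs the destructor exactly once; null is a no-op
}
```

A wrapper in another language (such as a Java binding) would call the delete function once per renderer it created, so that the destructor runs and the output is finalized before the file is opened.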
Original comment by zde...@gmail.com
on 29 Jul 2015 at 4:01
Regarding comment #4, I sure hope warnings or info go to stderr, not stdout.
But since gpapadop73 is happy, then I am too.
Original comment by breidenb...@gmail.com
on 29 Jul 2015 at 9:56
Original issue reported on code.google.com by
gpapadop73
on 27 Jul 2015 at 10:15