Closed 0xbad1d3a5 closed 7 years ago
The crash is repeatable. Interestingly, it only surfaces when you capture that specific area in ePSXe, not when you capture a screenshot of the same image.
Edit - The crash happens on the following line of getChoicesAndConfidence(int) in Ocr/ResultIterator.java: String[] nativeChoices = nativeGetChoices(mNativeResultIterator, level); At the time of the crash, mNativeResultIterator = 2222941984 and level = 4.
I cannot find any documentation on that function, nor do I see where it is defined locally. As such my debugging for the most part has stopped here.
Note - The sound glitches for a second when the app crashes, could be related, could just be the phone getting screwy during a crash.
The crash seems to be caused by the input screenshot containing the blue border of the captured area. If a character is too near that edge, the OCR treats the edge, and therefore the whole image, as part of the kanji. The fix is simply to ensure the blue border is not part of the captured screenshot. (In testing, an offset of the border width is sufficient.)
Thanks for the investigation. I agree with cropping out the blue border and intended to do that later, but it looks like the actual cause of the bug is something else.
I cut out the exact box from your screenshot and ran a unit test on it:
    @SmallTest
    public void testImage() {
        // Attempt to initialize the API.
        final TessBaseAPI baseApi = new TessBaseAPI();
        boolean success = baseApi.init(TESSBASE_PATH, "jpn");
        assertTrue(success);
        Bitmap bitmap = BitmapFactory.decodeFile(TESSBASE_PATH + "ocrfail.png");
        baseApi.setImage(bitmap);
        String hocr = baseApi.getHOCRText(0);
        String text = baseApi.getUTF8Text();
        ResultIterator resultIterator = baseApi.getResultIterator();
        resultIterator.begin();
        do {
            List<Pair<String, Double>> choicesAndConfidence = resultIterator.getChoicesAndConfidence(PageIteratorLevel.RIL_SYMBOL);
        } while (resultIterator.next(PageIteratorLevel.RIL_SYMBOL));
    }
It looks like the problem stems from the fact that tesseract didn't actually detect any words in that capture, so my call to resultIterator.getChoicesAndConfidence(PageIteratorLevel.RIL_SYMBOL) was invalid and failed the assertion in ltrresultiterator.cpp at line 341: ASSERT_HOST(result_it.it_->word() != NULL).
I assumed the .next() function was to determine if the next word existed, but it looks like it actually tests for end of file. So the proper fix would simply be to change the do-while loop into a while loop in OcrRunnable, and we shouldn't hit this bug again if the OCR ever comes back with no results.
Actually wait. It looks like .next() goes to the next word and tests for EOF. Well, that kinda makes things a little more difficult...
Try fixing the border, I pulled your baseline and changed it myself and the crashes stopped.
I changed OCRRunnable.java - Line: 180 to: Bitmap croppedBitmap = Bitmap.createBitmap(bitmapOriginal, box.x+10, box.y+10, box.width-20, box.height-20);
It also seemed to greatly improve the accuracy of certain kanji.
You are right that the ACTUAL crash is due to it not finding any kanji. This can be repeated by uploading a perfectly black or white image. But the reason in this case it was not finding any was because the border is being treated as kanji as well.
Right, I'm aware that fixing the border would probably fix this instance of the OCR failure, but the root cause was that the OCR engine (tesseract) was not able to detect any characters in the box (for whatever reason). Even with the border cropped out, we can still hit this bug whenever tesseract again fails to find any words.
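For illustration, one way to guard against the empty-result case described above is to skip the symbol loop entirely when nothing was recognized. This is only a sketch and not necessarily what the actual fix does: `SymbolIterator` is a hypothetical stand-in for the parts of ResultIterator used in the test, and treating empty recognized text as "no symbols" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class EmptyOcrGuard {
    /** Hypothetical stand-in for the ResultIterator calls used in the loop. */
    interface SymbolIterator {
        void begin();
        boolean next();            // advances to the next symbol, false at the end
        List<String> choices();    // stands in for getChoicesAndConfidence(RIL_SYMBOL)
    }

    /**
     * Collects the choices for every symbol, but never touches the iterator
     * when the recognized text is empty (assumed here to mean "no symbols").
     */
    static List<List<String>> collectChoices(String recognizedText, SymbolIterator it) {
        List<List<String>> result = new ArrayList<>();
        if (recognizedText == null || recognizedText.trim().isEmpty()) {
            return result; // nothing was recognized, so asking for choices would be invalid
        }
        it.begin();
        do {
            result.add(it.choices());
        } while (it.next());
        return result;
    }
}
```

The do-while shape is kept because .next() both advances and reports the end, so the first symbol has to be read before the first call to it.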
I think I just fixed it, can you try commit 221a05b3349530d213760a8b15952f10aba95ef2 and see if the issue still exists?
> You are right that the ACTUAL crash is due to it not finding any kanji. This can be repeated by uploading a perfectly black or white image. But the reason in this case it was not finding any was because the border is being treated as kanji as well.
Ah, I see. Yes, correct, that should be fixed as well. I was treating them as two separate issues: one being tesseract crashing when no words are detected, and the other being cropping the image.
Would you like to submit a pull request for cropping the box (Issue #5)? One thing though: the border width is defined by drawable/border-transparent.xml and is set to 1dp. You shouldn't be using absolute pixel values (+10, +20, etc) in the createBitmap() function. There's a helper class called KakuTools that will convert dp to px for you, so please use that.
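As a rough sketch of the dp-based crop suggested above: the class and method names here are hypothetical and are not the KakuTools API. The conversion uses Android's standard rule that px = dp × density, where density is the screen density scale factor (dpi / 160).

```java
final class CropInset {
    /** Converts dp to px using Android's rule px = dp * density, rounded to the nearest pixel. */
    static int dpToPx(float dp, float density) {
        return Math.round(dp * density);
    }

    /**
     * Shrinks a capture box by borderPx on every side.
     * Returns {x, y, width, height}, or null if the box is too small to crop.
     */
    static int[] insetBox(int x, int y, int width, int height, int borderPx) {
        int w = width - 2 * borderPx;
        int h = height - 2 * borderPx;
        if (w <= 0 || h <= 0) {
            return null; // nothing would be left after removing the border
        }
        return new int[] { x + borderPx, y + borderPx, w, h };
    }
}
```

The four values returned by `insetBox` would then take the place of the hardcoded offsets in the Bitmap.createBitmap(bitmapOriginal, ...) call, with `borderPx` computed from the 1dp border width.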
If you don't want to submit a pull request, that's fine too. I'll fix it later :)
I have no problems offering assistance here and there, but I generally don't like pull requests unless the owner gives the ok beforehand. Some people would prefer that it stays their own code. As such, I did not do anything this time via Git... so I can't actually do the request anyway.
As for the hardcoding... I just wanted some values in there to prove that it worked; doing it correctly falls under what I said above. (Why do it right if you aren't going to use it?)
I can verify that the change you made prevents the crash. However, if no kanji is detected... perhaps a toast saying that would be nice. (It is a bit unnerving not being sure if it is calculating or has already failed.)
When you address #5, it will allow kanji to be detected in the image I provided, but it will pick up background noise as various things. If you wish, I have some temp code that will do some pre-processing of the image to clean it up quite a bit... but ideally I think you would want to use something other than the horrible algorithms I pieced together.
So... if you want me to do formal pull requests I can do that... your call.
I'd love more help on the project, so if you're available and can help fix bugs, then by all means. Just make sure there's an issue made for each separate problem so discussions can happen before any code changes.
Do note that the project license is BSD-3 and all contributed code would fall under this license.
This specific issue (Kaku crashes when there are no OCR'd characters) is fixed in 221a05b3349530d213760a8b15952f10aba95ef2. The other issues discussed here are being tracked in #5.
LOG:
Screenshot:![screenshot_20170113-152156](https://cloud.githubusercontent.com/assets/878158/21957933/ede0cec6-da56-11e6-937a-a507cedd03fd.png)