Audiveris / audiveris

Latest generation of Audiveris OMR engine
https://audiveris.github.io/audiveris
GNU Affero General Public License v3.0

java.lang.ArrayIndexOutOfBoundsException: Index 18479318 out of bounds for length 18479076 #711

Open csa8280 opened 9 months ago

csa8280 commented 9 months ago
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: Index 18479318 out of bounds for length 18479076
        at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:205)
        at org.audiveris.omr.sheet.SheetStub.doOneStep(SheetStub.java:570)
        at org.audiveris.omr.sheet.SheetStub.reachStep(SheetStub.java:1367)
        at org.audiveris.omr.sheet.Book.reachBookStep(Book.java:2023)
        at org.audiveris.omr.sheet.Book.transcribe(Book.java:2577)
        at org.audiveris.omr.sheet.ui.BookActions$TranscribeBookTask.doInBackground(BookActions.java:2737)
        at org.audiveris.omr.sheet.ui.BookActions$TranscribeBookTask.doInBackground(BookActions.java:2708)
        at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:304)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:343)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: Index 18479318 out of bounds for length 18479076
        at org.audiveris.omr.step.AbstractSystemStep.doitPerSystem(AbstractSystemStep.java:186)
        at org.audiveris.omr.step.AbstractSystemStep.doit(AbstractSystemStep.java:125)
        at org.audiveris.omr.step.OmrStep.doit(OmrStep.java:138)
        at org.audiveris.omr.sheet.SheetStub.lambda$doOneStep$2(SheetStub.java:555)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        ... 3 common frames omitted
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 18479318 out of bounds for length 18479076
        at ij.process.ByteProcessor.get(ByteProcessor.java:251)
        at org.audiveris.omr.sheet.key.KeyExtractor.getProjection(KeyExtractor.java:235)
        at org.audiveris.omr.sheet.key.KeyBuilder.<init>(KeyBuilder.java:230)
        at org.audiveris.omr.sheet.key.KeyColumn.retrieveKeys(KeyColumn.java:421)
        at org.audiveris.omr.sheet.header.HeaderBuilder.processHeader(HeaderBuilder.java:266)
        at org.audiveris.omr.sheet.header.HeadersStep.doSystem(HeadersStep.java:63)
        at org.audiveris.omr.sheet.header.HeadersStep.doSystem(HeadersStep.java:37)
        at org.audiveris.omr.step.AbstractSystemStep.lambda$doitPerSystem$0(AbstractSystemStep.java:159)
        at org.audiveris.omr.step.AbstractSystemStep.doitPerSystem(AbstractSystemStep.java:179)
        ... 7 common frames omitted

score Img: 123

hbitteur commented 9 months ago

@csa8280 There is no margin below the last staff.

In the HEADERS step, the engine searches for key elements in a buffer that goes from 2 spaces above the staff to 1 space below. In your case, that means looking beyond the confines of the image. Hence the exception.
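The numbers in the exception message are consistent with this. The sketch below assumes that the pixel buffer is addressed row by row (index = y * width + x, which is how ImageJ's ByteProcessor lays out its pixel array) and that the image is 462 pixels high, as reported further down in this thread. Under those assumptions the buffer length gives the image width, and the faulty index decodes to a pixel on the first row below the image. The class and method names are illustrative only.

```java
public class IndexDecoder {

    /** Returns {x, y} for a linear index in a row-major buffer of the given width. */
    public static int[] decode(int index, int width) {
        return new int[] {index % width, index / width};
    }

    public static void main(String[] args) {
        int length = 18_479_076;      // buffer length, from the exception message
        int height = 462;             // assumed image height in pixels
        int width = length / height;  // exact division: 39998

        int[] xy = decode(18_479_318, width); // faulty index, from the exception message

        // Valid rows are 0..461, so y = 462 is one row past the bottom edge.
        System.out.println("width=" + width + " x=" + xy[0] + " y=" + xy[1]);
        // prints "width=39998 x=242 y=462"
    }
}
```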

I will try to ruggedize the engine code to deal with such situations. (The same problem will occur in the HEADS step, when searching for note heads just below the last staff line...)
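One generic way to guard against this kind of overrun is to clip the search area to the image bounds before reading any pixel. The sketch below only illustrates that idea with java.awt.Rectangle; it is not the engine code, and the class name, method name and sample coordinates are hypothetical.

```java
import java.awt.Rectangle;

public class RoiClipper {

    /**
     * Clips a search area to the image bounds.
     * The result is empty (isEmpty() returns true) when the area lies fully outside.
     */
    public static Rectangle clip(Rectangle area, int imageWidth, int imageHeight) {
        return area.intersection(new Rectangle(0, 0, imageWidth, imageHeight));
    }

    public static void main(String[] args) {
        // Hypothetical search area that extends 18 rows below a 462-pixel-high image
        Rectangle area = new Rectangle(200, 400, 100, 80);
        Rectangle safe = clip(area, 39998, 462);

        // Only rows 400..461 remain, so every pixel read stays inside the buffer.
        System.out.println(safe);
    }
}
```

A caller would then iterate over the clipped rectangle only, and skip the lookup altogether when the result is empty.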

For now, the best "solution" would be to put some margin around your score image.
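Any image editor can add such a margin. For batch use, a minimal Java sketch could look like the following; the class name, the margin width and the file arguments are assumptions chosen for illustration, not part of Audiveris.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class MarginPadder {

    /** Returns a copy of src surrounded by a white border of 'margin' pixels on every side. */
    public static BufferedImage addMargin(BufferedImage src, int margin) {
        int w = src.getWidth() + 2 * margin;
        int h = src.getHeight() + 2 * margin;
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, w, h);                 // white background
            g.drawImage(src, margin, margin, null); // original image, shifted by the margin
        } finally {
            g.dispose();
        }
        return out;
    }

    /** Usage: java MarginPadder input.png output.png (both paths are hypothetical). */
    public static void main(String[] args) throws IOException {
        BufferedImage src = ImageIO.read(new File(args[0]));
        // 50 pixels is an arbitrary choice; the margin should exceed one staff interline.
        ImageIO.write(addMargin(src, 50), "png", new File(args[1]));
    }
}
```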

hbitteur commented 8 months ago

Bug fixed today by commit 94370470d43badc5d0af8196f406130fafd03759 on "development" branch.

Note however that Tesseract chokes on this input image (error code: -1), hence no OCR result is available here. I could not find a way to make it work, even by adding substantial white margins around the buffer submitted to Tesseract.

This may be because the input image is extremely wide: about 40,000 pixels wide by 462 pixels high. As a workaround, I scaled the input image down by half, which both reduced the transcription time significantly and made Tesseract work.
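The tool used for this scaling is not stated here. As one possible way to reproduce the workaround, the sketch below halves both dimensions with standard Java 2D calls; the class and method names are hypothetical, and bilinear interpolation is an arbitrary choice.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class HalfScaler {

    /** Returns a copy of src scaled to half its width and half its height (at least 1 x 1). */
    public static BufferedImage halve(BufferedImage src) {
        int w = Math.max(1, src.getWidth() / 2);
        int h = Math.max(1, src.getHeight() / 2);
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, w, h, null); // draw the source into the smaller canvas
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

Whether the halved resolution is still sufficient for recognition depends on the resolution of the original scan.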

wlodr commented 6 months ago

The error is painful and it is still there; adding some margins helped. (This is with the Flatpak version of Audiveris, downloaded yesterday.)

hbitteur commented 6 months ago

@wlodr The Flatpak version was published at the end of 2023, IIRC. The Flatpak version is still in beta and is meant to stay aligned with standard releases. In other words, the bugfix (pushed to the development branch on January 23) is more recent than the Flatpak content.