Audiveris / audiveris

Latest generation of Audiveris OMR engine
https://audiveris.github.io/audiveris
GNU Affero General Public License v3.0

High quality pdf file cannot be processed properly #162

Open FatihEbibelot opened 6 years ago

FatihEbibelot commented 6 years ago

Hi all, I'm trying out the Audiveris desktop version, built from the latest development branch. I have a high-quality PDF file, behr_f_french_childs_song_piano_beg.pdf, and when I process it with Audiveris, it produces neither a proper grayscale image nor a proper .mxl file. I'm looking for a solution that works for high-quality PDF files. Could you explain what the problem might be and why such PDF files are not converted properly? How can we fix this? Any suggestions?

This is what I see:

[image: screen shot 2018-05-15 at 12 32 11 pm] [image: screen shot 2018-05-15 at 12 32 24 pm]

And this is the log in the terminal:

[image: screen shot 2018-05-15 at 12 32 35 pm]

And these are my system properties:

[image: screen shot 2018-05-15 at 12 37 32 pm]

Latest commit in the build: a36ace13d2ffd5e3f5223b50be6125eafb7ba260
OS: macOS 10.13.4

FatihEbibelot commented 6 years ago

Also, Audiveris can process the following PDF file very well and very quickly. I don't know whether there are different versions of PDF files, but the difference in behavior makes no sense to me. Here is the file:

maconaria.pdf

JalonSolov commented 6 years ago

I can confirm the odd handling of the first PDF, using latest development source, on Windows 10.

The 2 PDFs in this thread are definitely different versions.

Looking at the document properties, maconaria.pdf is PDF v1.5, built by "doPDF".

behr_f_french_childs_song_piano_beg.pdf, on the other hand, is PDF 1.2, built by "Acrobat Distiller 5.0 (Windows)".

What difference does that make? I don't know. Just data, I suppose.
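For anyone who wants to check the declared version without opening a viewer, here is a minimal sketch. It relies only on the fact that a PDF file starts with a header of the form `%PDF-1.N`; the function names are my own, and the version reported by a viewer's document properties may differ, because from PDF 1.4 onward the document catalog can carry a `/Version` entry that overrides the header.

```python
import re

def pdf_header_version(data: bytes):
    """Return the version from a PDF header such as b'%PDF-1.5', or None.

    Hypothetical helper: it only inspects the header, not the catalog's
    optional /Version entry.
    """
    # The header is expected at or near the start of the file,
    # so only the first 1024 bytes are searched.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None

def pdf_file_version(path):
    """Read the first bytes of the file at `path` and parse the header version."""
    with open(path, "rb") as f:
        return pdf_header_version(f.read(1024))
```

Calling `pdf_file_version(...)` on each of the two files from this thread should show whether their headers really declare different versions.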

I can tell you, however, that a program I have that extracts images from PDF files cannot extract anything from either of these PDFs.


D:\GoLang\Projects>bin\pdfcpu extract -mode image \\Storage\DL\behr_f_french_childs_song_piano_beg.pdf foo
extracting images from \\Storage\DL\behr_f_french_childs_song_piano_beg.pdf into foo ...
validateNumberEntry: dict=extGStateDict entry=SM unsupported in version 1.2

D:\GoLang\Projects>bin\pdfcpu extract -mode image \\Storage\DL\maconaria.pdf foo
extracting images from \\Storage\DL\maconaria.pdf into foo ...
INFO: 2018/05/15 19:26:22 api.go:461: pagesForPageSelection: pageSelection is nil
INFO: 2018/05/15 19:26:22 images.go:129: No image info available.

D:\GoLang\Projects>

They are, however, easily read by PDF readers (including the one built into Edge and Firefox browsers).

Whatever the issue with Audiveris may turn out to be, it does appear it may be related to how these PDFs were created.

hbitteur commented 6 years ago

I have just tried the 2 examples:

Regarding the initial topic, PDF reading is performed in class ImageLoading which, based on the ".pdf" extension, delegates to the JPod Java library. The core work is done in JPodLoader (in class ImageLoading, starting at line 382). I have no clue what is happening; let's ask @maximumspatium, since he wrote these lines a long time ago.

maximumspatium commented 6 years ago

Please give me some time to investigate why behr_f_french_childs_song_piano_beg.pdf cannot be properly imported in audiveris.

hbitteur commented 6 years ago

Regarding the "too perfect" score, this is one more problem brought by synthetic scores: their staff line is so thin (1 pixel here) that thresholds based on line thickness are set too low. On real scores, a staff line is never 1 pixel thick!

Here the fix was to modify the peak detection in the histogram of vertical runs: the minimum bucket value, which was based on a staff line fraction (2.0), is now based on a staff interline fraction (0.25). Done in commit 4d23ff9cf6e2787221bffbfeef43859f71872b60
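To make the numbers concrete, here is a rough Python sketch of the idea; it is not the Audiveris code (which is Java and lives in the commit above). It builds histograms of vertical run lengths on a synthetic binary image, takes the most frequent black run as the staff line thickness and the most frequent white run plus that thickness as the interline, then compares a floor of 2.0 × line thickness with a floor of 0.25 × interline. All function names, the synthetic image, and the exact way the floor is used are assumptions for illustration only.

```python
from collections import Counter

def vertical_run_histograms(image):
    """image: list of rows of 0 (white) / 1 (black) pixels.

    Returns (black_hist, white_hist), each a Counter mapping
    run length -> number of vertical runs of that length.
    """
    black, white = Counter(), Counter()
    height, width = len(image), len(image[0])
    for x in range(width):
        run_value, run_length = image[0][x], 0
        for y in range(height):
            pixel = image[y][x]
            if pixel == run_value:
                run_length += 1
            else:
                # The run just ended: record it, then start a new one.
                (black if run_value else white)[run_length] += 1
                run_value, run_length = pixel, 1
        # Record the last run of the column.
        (black if run_value else white)[run_length] += 1
    return black, white

def synthetic_staff(width=10, line=1, gap=19, lines=5):
    """Five perfectly horizontal staff lines, `line` pixels thick."""
    rows = []
    for _ in range(lines):
        rows += [[0] * width for _ in range(gap)]
        rows += [[1] * width for _ in range(line)]
    rows += [[0] * width for _ in range(gap)]
    return rows

black, white = vertical_run_histograms(synthetic_staff())
line_thickness = black.most_common(1)[0][0]               # 1 pixel
interline = white.most_common(1)[0][0] + line_thickness   # 20 pixels

old_floor = 2.0 * line_thickness   # collapses when the line is 1 pixel thick
new_floor = 0.25 * interline       # independent of line thickness
print(line_thickness, interline, old_floor, new_floor)
```

With a 1-pixel line the thickness-based floor is only 2.0, whereas the interline-based floor stays at 5.0 however thin the line is, which is the property the fix relies on.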

maximumspatium commented 6 years ago

> Regarding the "too perfect" score, this is one more problem brought by synthetic scores: their staff line is so thin (1 pixel here) that thresholds based on line thickness get set to a too low value.

Yes.

Another problem with thin staff lines (and stems) is that they disappear when zooming out of the score view. The binary view then looks rather ugly:

[image: too thin staff lines and stems]

hbitteur commented 6 years ago

Is there a way to keep thin lines from disappearing in the display? The binary view is a BufferedImage drawn on the screen. Since, by definition, the binary view is strictly black & white, I'm afraid we can't play with antialiasing.

hbitteur commented 6 years ago

As opposed to the binary view, the data view is not an image but the result of painting the various items (sections, lines, stems, etc.) programmatically. This is why thin lines don't disappear on the data tab.
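To illustrate why a 1-pixel line can vanish from a scaled-down raster, here is a small Python sketch. It does not reproduce what Java2D does when drawing the BufferedImage; it only contrasts plain subsampling with a downscale that marks an output pixel black if any source pixel in its block is black. Both function names are hypothetical.

```python
def subsample(image, factor):
    """Keep every `factor`-th pixel in both directions (nearest-neighbour style)."""
    return [row[::factor] for row in image[::factor]]

def downscale_keep_black(image, factor):
    """Downscale a 0/1 image; an output pixel is black (1) if ANY pixel
    of the corresponding factor x factor source block is black."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height, factor):
        block_rows = image[y:y + factor]
        out.append([
            1 if any(p for r in block_rows for p in r[x:x + factor]) else 0
            for x in range(0, width, factor)
        ])
    return out

# An 8x8 white image with a single 1-pixel black line on row 3.
image = [[1] * 8 if y == 3 else [0] * 8 for y in range(8)]

print(subsample(image, 4))             # the line falls between sampled rows
print(downscale_keep_black(image, 4))  # the line survives in the first output row
```

The trade-off is that the second approach makes every black feature look thicker at small zoom ratios, so whether it would be acceptable for the binary view is an open question.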

FatihEbibelot commented 6 years ago

Are there any updates on this issue?

GaryH4 commented 4 years ago

Facing the same problem. Any updates?

maximumspatium commented 4 years ago

> Facing the same problem. Any updates?

Can you upload the problematic file here? GitHub accepts ZIP archives, so that's perhaps the most convenient way to share files...

GaryH4 commented 4 years ago

For example, this file (downloaded from musescore.com): Numb_-_Linkin_Park.pdf
When I open it in a PDF reader such as Chrome, zooming in works without problems. But when it is imported into Audiveris, the binary image looks very pixelated (Windows 10, latest Audiveris stable build). [image] Then, after OCR, some notes are not recognized. [image] [image]

And this: [image] [image] It seems that triple f (fff) is not supported?

GaryH4 commented 4 years ago

Also, the tempo indication (speed note) cannot be recognized: [image] [image]

maximumspatium commented 4 years ago

@GaryH4 This issue is about PDF import problems. Your PDF is okay, since Audiveris can import it without problems.

Most of the recognition errors you spotted stem from the fact that your score doesn't conform to the music engraving standard, which prescribes that chords belonging to distinct staves shouldn't be stuck together. While a trained human musician can fix this (i.e. separate the stuck notes) on the fly, an AI algorithm will usually fail. The current Audiveris GUI doesn't offer an easy way to fix that manually...

The symbol dynamicFFF indeed isn't supported yet, but that isn't a big issue IMHO.

Recognition of tempo indications that mix musical symbols and plain text is indeed problematic, because Tesseract OCR, which is responsible for text recognition, could not so far be trained to recognize musical symbols. It's therefore a long-standing issue that we have been waiting for years to see resolved. Fortunately, it can be quickly fixed manually in MuseScore, for example...

Please keep in mind that error-free optical music recognition is impossible. Trying to achieve it would be an exercise in futility. There will always be errors requiring human-assisted verification and correction. Our goal is to reduce the number of errors and, consequently, the amount of manual correction...

GaryH4 commented 4 years ago

Thanks for the quick and detailed response.

I'm not in the music field but a CS student, though I'm interested in music and know a little about sheet music.

The project I'm working on is a guitar robot that can play (mostly) whatever music the user provides (mostly as PDF). Usually, a music robot needs to be manually programmed with one or a few pieces of music, IMHO.

Since some manual fixing is needed after OCR, I cannot just provide a web interface that lets users upload their PDF file and have it played instantly...

maximumspatium commented 4 years ago

> Since we need to do some manual fixing after OCR, I cannot just leave a web interface for letting users to upload their pdf file and instantly play...

I see. You can directly export and use raw recognition results from Audiveris without performing error corrections, but your mileage may vary, of course. For high-quality, crisp scores with common symbols, the result should be good...

GaryH4 commented 4 years ago

Got it. Thanks a lot.

maximumspatium commented 4 years ago

@GaryH4 Consider running Audiveris in batch mode:
https://github.com/Audiveris/audiveris/wiki/Launching#passing-application-arguments
https://github.com/Audiveris/audiveris/wiki/CLI-Arguments
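
For a web back end like the one you describe, a batch call could be wrapped roughly as in the Python sketch below. The options `-batch`, `-export` and `-output` are described on the CLI-Arguments page linked above; the launcher name `Audiveris`, the example paths and the helper function names are assumptions, so adapt them to your installation.

```python
import subprocess

def audiveris_batch_command(pdf_path, output_dir, launcher="Audiveris"):
    """Build a command line that runs Audiveris without its GUI and
    exports the result to MusicXML.

    `launcher` is an assumption: use whatever start script or
    `java ...` invocation your installation provides.
    """
    return [
        launcher,
        "-batch",               # no graphical user interface
        "-export",              # export MusicXML once transcription is done
        "-output", output_dir,  # folder that receives the results
        "--",                   # end of options; input files follow
        pdf_path,
    ]

def run_audiveris(pdf_path, output_dir, launcher="Audiveris"):
    """Run the command and return the process exit code."""
    command = audiveris_batch_command(pdf_path, output_dir, launcher)
    return subprocess.run(command, check=False).returncode
```

A non-zero exit code, or a missing export file in the output folder, would be the signal to fall back to manual correction instead of playing the piece directly.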