Audiveris / audiveris

Latest generation of Audiveris OMR engine
https://audiveris.github.io/audiveris
GNU Affero General Public License v3.0

converting misidentified lyrics to normal text #633

Closed phdm closed 1 year ago

phdm commented 1 year ago

Hello,

How can we convert text misidentified as lyrics to another type of text (like 'rights') without deleting and retyping it?

Bacchushlg commented 1 year ago

Select the misinterpreted text using shift-select (the selection rectangle can usually be a bit larger than the text; a smaller rectangle inside it shows the items actually selected). Press "delete", then select the proper text type (I usually use the shortcuts: p, l for lyrics; p, t for all other text).

phdm commented 1 year ago

Thank you, it looks like it would work if I had installed the Tesseract data. Can you clarify the instructions for installing the Tesseract data to work with the development branch?

hbitteur commented 1 year ago

If I remember correctly, you don't have to delete and then retype text. Simply select the sentence and modify its role in the InterBoard. When choosing "Lyrics", the program will automatically convert your SentenceInter/WordInter's to LyricLineInter/LyricItemInter's.

Mind the fact that you have to select the sentence, not the words, since role is an attribute of sentence not of word. The easiest way to select a sentence is to select one of its words, and in the InterBoard click on button "ToEnsemble".

hbitteur commented 1 year ago

> Thank you, it looks like it would work if I had installed the Tesseract data. Can you clarify the instructions for installing the Tesseract data to work with the development branch?

Please tell us a bit more about your environment (OS, installer or building from sources) and, if you build from sources, which branch you take (master vs development). For the former you'd need language data files of Tesseract 3.04, for the latter Tesseract 4.x.

FYI, in the Audiveris handbook, this section is being rewritten in order to clarify this rather touchy topic.

phdm commented 1 year ago

I work with the development branch on Windows 10 (64-bit), and I wanted to convert words recognized as lyrics to text ('rights'), not the other way around.

hbitteur commented 1 year ago

OK, I think you are right. When going from lyrics to standard text, the problem is to select a sub-sentence of lyric words, and there is currently no way to do that.

But you don't have to manually retype the content of the words:

  1. Select the lyrics line and deassign it (this will delete the member words).
  2. Select the words that should compose the new sentence (for example using a lasso).
  3. Double-click the text button in the ShapeBoard. This will call OCR on just the selected sequence of words, retrieve their content, and build the containing sentence on-the-fly.
  4. This newly built sentence has been automatically selected. Simply use the Role selector to choose your target role for this sentence.

hbitteur commented 1 year ago

> FYI, in the Audiveris handbook, this section is being rewritten in order to clarify this rather touchy topic.

Below is an excerpt of the to-be-released documentation about adding OCR languages. It applies to your case since you are using the development version.

Adding OCR languages

Audiveris delegates all OCR operations to the Tesseract library.

Whether you install Audiveris via its Windows installer or download the project and build it locally from source, you are automatically provided with 4 Tesseract language files.

You can download additional languages at any time from the dedicated Tesseract page, which contains data files for a hundred languages.

Important: There are recurrent messages on the Audiveris issues forum about Tesseract configuration. So let's try to clarify the topic, using current versions of both Audiveris and Tesseract.

Tesseract as a linked library

Audiveris calls Tesseract software as a linked binary library, not as a separate executable program.

Data version

Tesseract provides two different engines, the legacy engine and the new LSTM-based engine, each with its own model.

The language files downloadable from the Tesseract page are meant for Tesseract version 4.x, each language file containing both legacy and new models.

Since Audiveris is now linked with a 4.x Tesseract library, operated in legacy engine mode, these language files are compatible with Audiveris {{ site.audiveris_version }}.

Data location

At startup, Audiveris initializes the Tesseract library with a tessdata folder:

  1. It first checks the location defined by the TESSDATA_PREFIX environment variable.
  2. If not found there, it tries the default Tesseract tessdata location for the OS, which for Windows 64-bit means "C:\Program Files (x86)\tesseract-ocr\tessdata".

If in doubt, we recommend defining the TESSDATA_PREFIX variable accordingly.
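
The lookup order above can be sketched as a small shell function. This is only an illustration, not Audiveris's own code: the helper name `resolve_tessdata` is hypothetical, the default path is the Windows 64-bit one quoted above, and it assumes that "not found" means the variable is unset or does not name an existing folder.

```shell
# Sketch only: mimics the lookup order described above, not Audiveris's code.

# Default Windows 64-bit location quoted in the text above.
DEFAULT_TESSDATA='C:\Program Files (x86)\tesseract-ocr\tessdata'

# resolve_tessdata (hypothetical helper): print the tessdata folder that
# would be chosen. An optional first argument overrides the default folder.
# Returns 1 if neither candidate is an existing directory.
resolve_tessdata() {
    default_dir="${1:-$DEFAULT_TESSDATA}"
    if [ -n "${TESSDATA_PREFIX:-}" ] && [ -d "$TESSDATA_PREFIX" ]; then
        # 1. The location defined by TESSDATA_PREFIX wins when it exists.
        printf '%s\n' "$TESSDATA_PREFIX"
    elif [ -d "$default_dir" ]; then
        # 2. Otherwise fall back to the default location.
        printf '%s\n' "$default_dir"
    else
        return 1
    fi
}
```

For example, `resolve_tessdata || echo "no tessdata folder found"` reports the case where neither location exists, which is the situation where setting TESSDATA_PREFIX explicitly helps.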

Warning: Regarding the TESSDATA_PREFIX variable:

As an example, a typical configuration (under Windows 10 64-bit) can be as follows:

$ echo $TESSDATA_PREFIX
C:\Program Files (x86)\tesseract-ocr\tessdata
$ tree "$TESSDATA_PREFIX"
C:\Program Files (x86)\tesseract-ocr\tessdata
├── configs
│   ├── ...
├── deu.traineddata
...
├── eng.traineddata
...
├── fra.traineddata
...
├── ita.traineddata
...
└── tessconfigs
    ├── ...

2 directories, 57 files

Languages selection

At runtime, you can specify which languages should be tried by the OCR software.

The easiest way is to define the language specification interactively.
Using the Book | Set Book Parameters menu, you can make specifications at global level, book level and even individual sheet level. Depending upon the language files present in your local tessdata folder, you will be presented with the list of languages available for selection.
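
To see outside the application which languages a given tessdata folder offers, a rough shell sketch such as the following can help. Both helper names are hypothetical, and the sketch simply assumes that each `<code>.traineddata` file corresponds to one selectable language (so a non-language file such as osd.traineddata would be listed too). The second helper checks a "+"-separated specification, the form used by the default value `deu+eng+fra` quoted in this section.

```shell
# list_languages (hypothetical helper): print one language code per
# *.traineddata file found directly in the given tessdata folder.
list_languages() {
    for f in "$1"/*.traineddata; do
        [ -e "$f" ] || continue   # the glob matched nothing
        basename "$f" .traineddata
    done
}

# missing_languages (hypothetical helper): given a tessdata folder and a
# "+"-separated specification, print the codes that have no data file.
missing_languages() {
    for code in $(printf '%s' "$2" | tr '+' ' '); do
        [ -e "$1/$code.traineddata" ] || printf '%s\n' "$code"
    done
}
```

A call like `missing_languages "$TESSDATA_PREFIX" deu+eng+fra` prints nothing when every language in the specification has its data file in the folder.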

The default (global) specification is determined by the application constant org.audiveris.omr.text.Language.defaultSpecification, whose initial value is deu+eng+fra.
Hence, you can also modify this default directly by changing the constant value:

phdm commented 1 year ago

Thank you.

The new explanation of the TESSDATA_PREFIX environment variable is perfectly clear. In my development environment, I managed to set it to the location of the data that were installed by the 5.2 release installer, i.e. export TESSDATA_PREFIX='C:\Program Files (x86)\tesseract-ocr\tessdata'

Without it, those data were not found by the development branch, although TESSDATA_PREFIX was not set.

If that matters, PROGRAMFILES='C:\Program Files', ProgramData='C:\ProgramData' and ProgramW6432='C:\Program Files'

hbitteur commented 1 year ago

No further developments on this issue. I'm closing it.