Audiveris / omr-dataset-tools

Reference of OMR data
GNU Affero General Public License v3.0

Annotations for upper/lower halves of a time signature #28

Closed hbitteur closed 6 years ago

hbitteur commented 7 years ago

Except for the whole time signatures (timeSigCommon and timeSigCutCommon), all time signatures are composed of a numerator number and a denominator number.

Would it be possible to get:

Nota: I found a timeSig26over8, but I suspect a typo... :-)

lasconic commented 6 years ago

timeSig26over8 is definitely possible. A bit like with tuplets, we will need to define what to do with elements that MuseScore supports and what Audiveris or other OMR engines can handle. Do we limit the number of supported time signatures and, if yes, what is the limit? Or do we support an infinity of "shape" ids?

Regarding the nested symbols, do you mean something like this?

<Symbol interline="20" shape="timeSig9over8">
      <Symbol interline="20" shape="timeSig9" />
      <Symbol interline="20" shape="timeSig8" />
</Symbol> 

What about 12/8?

<Symbol interline="20" shape="timeSig12over8">
      <Symbol interline="20" shape="timeSig1" />
      <Symbol interline="20" shape="timeSig2" />
      <Symbol interline="20" shape="timeSig8" />
</Symbol> 

or

<Symbol interline="20" shape="timeSig12over8">
      <Symbol interline="20" shape="timeSig12" />
      <Symbol interline="20" shape="timeSig8" />
</Symbol>

nasehim7 commented 6 years ago

Any update on this? :)

nasehim7 commented 6 years ago

Just a thought: what if we derive the bounds of the nested elements from the bounds of the outer element? For example:

  <Symbol interline="20" shape="timeSig4over4">
    <Bounds x="99.868" y="226.590" w="16.094" h="40.137"/>
    <Symbol interline="20" shape="4">
    </Symbol>
    <Symbol interline="20" shape="4">
    </Symbol>
  </Symbol>

To get the bounds for each "4" shape, we apply the following arithmetic.

For the 1st Symbol 4: x and y are the same as for the outer element, i.e. 99.868 and 226.590 respectively. The width is also the same, but the height is 40.137 / 2 = 20.068.

For the 2nd Symbol 4: x, width and height are the same as before, i.e. 99.868, 16.094 and 20.068 respectively. y is 226.590 - 20.068 (half the height) = 206.522.
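The arithmetic proposed in this comment can be sketched as follows. This is only an illustration in Python: the `Bounds` class and the `split_in_halves` function are hypothetical names, not part of omr-dataset-tools or MuseScore, and the subtraction used for the second y is copied from the comment as written.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Bounds:
    """Hypothetical box type mirroring the <Bounds> attributes x, y, w, h."""
    x: float
    y: float
    w: float
    h: float


def split_in_halves(outer: Bounds) -> Tuple[Bounds, Bounds]:
    """Split the outer box into two equal-height boxes, as proposed in the comment.

    Both halves keep the outer x and width. The second y is obtained by
    subtracting half the height, exactly as the comment describes.
    """
    half = outer.h / 2
    first = Bounds(outer.x, outer.y, outer.w, half)
    second = Bounds(outer.x, outer.y - half, outer.w, half)
    return first, second


# Values taken from the timeSig4over4 snippet above.
first, second = split_in_halves(Bounds(99.868, 226.590, 16.094, 40.137))
print(first)
print(second)
```

Note that both halves inherit the full outer width, so this split cannot give a numerator and a denominator different widths.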

lasconic commented 6 years ago

Try with 12/8 to see why it doesn't work.

nasehim7 commented 6 years ago

[Image: screen shot 2018-05-07 at 7 52 01 pm]

Actually, I had been thinking about this 12/8 case earlier as well. I thought that even if the numerator is a two-digit number, the outer red rectangle can still be divided into these green and yellow boxes, which will contain the numbers. What am I missing here? :)

lasconic commented 6 years ago

The 8 box should be smaller in width.

nasehim7 commented 6 years ago

Actually, I thought that was not a crucial requirement and that only proper segmentation was, hence this logic. But since it is, I will look into it. :)

lasconic commented 6 years ago

I guess OMR will be happier with the smallest bounding box possible, @hbitteur ?

nasehim7 commented 6 years ago

Update: I have implemented a bigger box for 12 and a smaller one for 8. I have tested it on other time signature values too. For reference, here is a snippet from one generated XML:

  <Symbol interline="20" shape="timeSig12over8">
    <Bounds x="99.868" y="226.515" w="27.380" h="40.178"/>
    <Symbol interline="20" shape="12">
      <Bounds x="108.082" y="214.473" w="10.952" h="8.047"/>
    </Symbol>
    <Symbol interline="20" shape="8">
      <Bounds x="110.570" y="206.426" w="5.976" h="8.000"/>
    </Symbol>
  </Symbol>

Algorithm for this approach: we have the outer element and two inner elements, num and denom. So:

midHorizontal = (2 * outerElement.x() + outerElement.width())
midVertical = (2 * outerElement.y() - outerElement.height())

Now the num and denom bounds coordinates:

numX = midHorizontal - (num.width() / 2)
numY = midVertical + num.height()

denomX = midHorizontal - (denom.width() / 2)
denomY = midVertical

In case we want a change, I will do it accordingly. Let's see which one suits the OMR well. :)

hbitteur commented 6 years ago

This data is meant to become an OMR reference, and nobody can guess all its future uses. So it would be a bad decision to jeopardize this reference data just to save a little bit of coding today! Stated differently, let's not cheat on bounding boxes!

The bounding box of a given entity is the smallest rectangle that contains this entity. In your precise case, I don't think that the "12" glyph can exhibit the same width as the "8" glyph.
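To illustrate this definition, here is a minimal Python sketch that computes the smallest axis-aligned rectangle around a set of ink pixels. The function name, the pixel-list input and the "+ 1" pixel-width convention are assumptions made for this example; they do not describe how MuseScore or omr-dataset-tools actually compute bounds.

```python
def tight_bounds(pixels):
    """Return (x, y, w, h) of the smallest rectangle containing all pixels.

    `pixels` is a non-empty list of (x, y) integer coordinates.
    Assumption: a pixel at column c covers [c, c + 1), hence the "+ 1".
    """
    if not pixels:
        raise ValueError("a glyph needs at least one ink pixel")
    xs, ys = zip(*pixels)
    x0, y0 = min(xs), min(ys)
    return x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1


# Two made-up glyphs of equal height: a wide one and a narrow one.
wide_glyph = [(0, 0), (9, 0), (4, 7)]
narrow_glyph = [(2, 0), (6, 0), (4, 7)]
print(tight_bounds(wide_glyph))
print(tight_bounds(narrow_glyph))
```

With tight boxes, a wider glyph necessarily gets a wider box than a narrower glyph of the same height, which is the point made about "12" versus "8".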

After a careful look at your XML snippet, I have a few questions:

Concerning the algorithm, I would rewrite it as follows:

midHorizontal = outerElement.x() + outerElement.width()/2; // /2
midVertical = outerElement.y() + outerElement.height()/2; // + and /2

numX = midHorizontal - (num.width() / 2);
numY = outerElement.y(); // top of num = top of outer

denomX = midHorizontal - (denom.width() / 2);
denomY = midVertical;
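The rewritten algorithm can be tried out with a small Python sketch. The `Bounds` class and the `place_num_and_denom` function are hypothetical names used only for illustration; the numeric values are those of the 12/8 snippet posted later in this thread, and y is assumed to grow downward with (x, y) as the top-left corner.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """Hypothetical box type mirroring the <Bounds> attributes x, y, w, h."""
    x: float
    y: float
    w: float
    h: float


def place_num_and_denom(outer: Bounds, num_width: float, denom_width: float):
    """Return ((numX, numY), (denomX, denomY)) following the rewritten algorithm."""
    mid_horizontal = outer.x + outer.w / 2
    mid_vertical = outer.y + outer.h / 2

    # Numerator: centered horizontally, its top is the top of the outer box.
    num = (mid_horizontal - num_width / 2, outer.y)
    # Denominator: centered horizontally, its top is the vertical middle.
    denom = (mid_horizontal - denom_width / 2, mid_vertical)
    return num, denom


outer = Bounds(99.868, 226.515, 27.380, 40.178)
num, denom = place_num_and_denom(outer, num_width=27.380, denom_width=14.941)
print(num)
print(denom)
```

The sketch only positions the two boxes; the widths and heights of the numerator and denominator still have to come from the glyphs themselves.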

nasehim7 commented 6 years ago

Regarding the algorithm: sorry, my bad, I forgot to mention the / 2 here. In the implementation I used the midpoint technique you describe above, but unfortunately I left it out of this comment. Thanks for the review. :)

Exploring more on point 4, this means in a typical five-line staff, the distance between the 1st line and the 5th line should be nearly 16? @hbitteur @lasconic

hbitteur commented 6 years ago

The distance between the 1st and 5th lines is typically "interline value" * 4. Here, according to your XML snippet, interline = 20, so the 1st-5th distance is 80. I don't see where the "16" value comes from.

nasehim7 commented 6 years ago

I said that because the heights of the 12 and 8 glyphs combined give me 16.047, and that is basically the distance between the 1st and 5th lines. As discussed, there are no visible gaps, so the outer element's width and height should not be this large. :) I have corrected the algorithm accordingly so that the y values are in line with the convention. I am working on getting this to work correctly. :)

nasehim7 commented 6 years ago

I think I found my mistake. For reference, here is the new snippet after changing the implementation.

  <Symbol interline="20" shape="timeSig12over8">
    <Bounds x="99.868" y="226.515" w="27.380" h="40.178"/>
    <Symbol interline="20" shape="12">
      <Bounds x="99.868" y="226.515" w="27.380" h="20.119"/>
    </Symbol>
    <Symbol interline="20" shape="8">
      <Bounds x="106.087" y="246.604" w="14.941" h="20.000"/>
    </Symbol>
  </Symbol>

I hope it's fine this time, because all the calculations look accurate. Up for review @hbitteur @lasconic. :D
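One quick sanity check for a snippet like this is to verify that each nested box lies inside the outer box. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper names are hypothetical, and it assumes that (x, y) is the top-left corner with y growing downward, which the annotation format may or may not guarantee.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<Symbol interline="20" shape="timeSig12over8">
  <Bounds x="99.868" y="226.515" w="27.380" h="40.178"/>
  <Symbol interline="20" shape="12">
    <Bounds x="99.868" y="226.515" w="27.380" h="20.119"/>
  </Symbol>
  <Symbol interline="20" shape="8">
    <Bounds x="106.087" y="246.604" w="14.941" h="20.000"/>
  </Symbol>
</Symbol>
"""


def read_bounds(symbol):
    """Read the direct <Bounds> child of a <Symbol> as (x, y, w, h) floats."""
    b = symbol.find("Bounds")
    return tuple(float(b.get(k)) for k in ("x", "y", "w", "h"))


def contains(outer, inner, eps=1e-6):
    """True if the inner box lies within the outer box (small float tolerance)."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ix >= ox - eps and iy >= oy - eps
            and ix + iw <= ox + ow + eps
            and iy + ih <= oy + oh + eps)


def misplaced_children(xml_text):
    """Return the shapes of nested symbols whose box leaves the outer box."""
    root = ET.fromstring(xml_text)
    outer = read_bounds(root)
    return [child.get("shape") for child in root.findall("Symbol")
            if not contains(outer, read_bounds(child))]


print(misplaced_children(SNIPPET))
```

Containment is only a necessary condition: it says nothing about whether each box is the smallest rectangle around its glyph, which still needs a visual check against the image.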

lasconic commented 6 years ago

I believe it would be very useful if we had a tool that can ingest an image and the XML description and draw bounding boxes and x,y origin point for each box in order to test the implementation quickly. Can Audiveris do that already or should @nasehim7 quickly do something in C++ or any other language (JS could enable this in a browser... ?).

hbitteur commented 6 years ago

Drawing symbols bounding boxes on top of background image is already available in this project. We call them "control images" as described in this wiki section precisely because it allows a visual control of the boxes WRT underlying symbols.

To operate this feature, launch the Java program org.audiveris.omrdataset.Main with these arguments:

-controls -- your-annotations-file.xml

This assumes that the related image (your-image.png) is located in the same directory as your-annotations-file.xml and is properly referenced from within the .xml file.

For help on available arguments, launch the program with the -help argument.

Update: I just fixed the typo in wiki link

hbitteur commented 6 years ago

To launch the program, you can more easily use this command line (from the directory where you cloned omrdataset):

$> gradle run -PcmdLineArgs="-controls,--,my-annotation-file.xml"

Mind the fact that the argument separator is the comma character, not the space character.
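For scripted use, the comma-separated form can be built from an ordinary argument list. This is a small Python sketch; the helper name is hypothetical, and rejecting arguments that contain a comma is an assumption, since no escaping mechanism is mentioned here.

```python
def gradle_run_command(args):
    """Build the `gradle run` invocation from a list of program arguments.

    The arguments are joined with commas, as the -PcmdLineArgs property expects.
    """
    for arg in args:
        if "," in arg:
            # Assumption: no way to escape a comma is documented in this thread.
            raise ValueError("argument contains a comma: %r" % arg)
    return ["gradle", "run", "-PcmdLineArgs=" + ",".join(args)]


command = gradle_run_command(["-controls", "--", "my-annotation-file.xml"])
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run` from the directory where the repository was cloned.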

lasconic commented 6 years ago

Sorry, I missed that. @nasehim7, it looks like you have something to test your files with!

nasehim7 commented 6 years ago

Thanks, @hbitteur @lasconic I will test it and let you know the results :D

nasehim7 commented 6 years ago

When I run -

gradle run -PcmdLineArgs="-output,data/output,-controls,--,data/input-images/mops-1.xml"

I am getting this -

> Configure project :
targetOS=macosx-x86_64

> Task :run
INFO  []                       CLI 108  | CLI args: [-output, data/output, -controls, --, data/input-images/mops-1.xml]

Deprecated Gradle features were used in this build, making it incompatible with Gradle 5.0.
See https://docs.gradle.org/4.7/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 6s
3 actionable tasks: 1 executed, 2 up-to-date

Am I missing something? :)

hbitteur commented 6 years ago

My gradle version is 4.0, and I don't know which incompatibilities gradle 5.0 might bring. Sorry, I have no time to investigate this right now; I'm running behind on Audiveris 5.1 and 6.0. As a workaround, you could use gradle 4.0. Sorry for that.

nasehim7 commented 6 years ago

No problem, @hbitteur. I got it working with your solution. :) For the time signature, the output looks fine. For reference:

[Image: screen shot 2018-05-12 at 10 40 30 pm]

I need to work on the tuplets, because the boxes for the right and left brackets are a little out of place, but the box for the number is fine. :) @lasconic

lasconic commented 6 years ago

On second thought, I would export all time signatures and let Audiveris or any other consumer deal with the ones they can handle.

nasehim7 commented 6 years ago

The time signature implementation is generalized, i.e. it works for all the time signatures MuseScore has. If there is any test score you can help me with, @lasconic, we can surely test it on that. :)

hbitteur commented 6 years ago

Sure, export both the "whole" timesig (such as 12/8) and its nested numbers (12 and 8). The reading program will pick up what it is interested in.
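As a rough illustration of how a reading program might pick up only what it is interested in, here is a Python sketch using the standard `xml.etree.ElementTree`. The function name and the sets of supported shapes are hypothetical; the XML is the 12/8 snippet from earlier in this thread.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<Symbol interline="20" shape="timeSig12over8">
  <Bounds x="99.868" y="226.515" w="27.380" h="40.178"/>
  <Symbol interline="20" shape="12">
    <Bounds x="99.868" y="226.515" w="27.380" h="20.119"/>
  </Symbol>
  <Symbol interline="20" shape="8">
    <Bounds x="106.087" y="246.604" w="14.941" h="20.000"/>
  </Symbol>
</Symbol>
"""


def pick_shapes(xml_text, wanted_shapes):
    """Return, in document order, the shapes of all symbols the reader supports.

    Element.iter() visits the outer symbol and every nested symbol, so the
    reader sees both the "whole" time signature and its numbers.
    """
    root = ET.fromstring(xml_text)
    return [symbol.get("shape") for symbol in root.iter("Symbol")
            if symbol.get("shape") in wanted_shapes]


# A reader that only knows whole time signatures.
print(pick_shapes(SNIPPET, {"timeSig12over8"}))
# A reader that only knows the individual numbers.
print(pick_shapes(SNIPPET, {"12", "8"}))
```

Because both levels are exported, neither reader needs to know about shapes it does not support; it simply skips them.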

nasehim7 commented 6 years ago

Yes, the current Implementation does that @hbitteur :D

nasehim7 commented 6 years ago

@lasconic This should also be okay in the next dataset. :)