rism-digital / verovio

🎵 Music notation engraving library for MEI with MusicXML and Humdrum support and various toolkits (JavaScript, Python)
https://www.verovio.org
GNU Lesser General Public License v3.0

Processing instructions for verovio options #3280

Open craigsapp opened 1 year ago

craigsapp commented 1 year ago

It would be useful to add parsing of processing instructions to allow embedding of verovio options within an MEI file. These processing instruction(s) would look something like this:

<?verovio
spacingLinear 0.4
spacingNonLinear 0.5
adjustPageHeight
adjustPageWidth
beamFrenchStyle
?>

Or:

<?verovio
spacingLinear=0.4
spacingNonLinear=0.5
adjustPageHeight
adjustPageWidth
beamFrenchStyle
?>

Whitespace should be preserved in the PI an the format of the contents of the PI is not XML (unless you want it to look like XML), but you could alternatively allow only one option set for each PI to simplify parsing:

<?verovio-option spacingLinear=0.4 ?>
<?verovio-option spacingNonLinear=0.5 ?>
<?verovio-option adjustPageHeight ?>
<?verovio-option adjustPageWidth ?>
<?verovio-option beamFrenchStyle ?>
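As a rough illustration of how little parsing the line-based formats above would need, here is a minimal Python sketch. It is not Verovio code: the function name `parse_option_lines` is hypothetical, and it assumes that a bare option name means a boolean flag and that values stay as strings.

```python
import re

# name, then an optional separator ("=" or whitespace), then the rest of the line
OPTION_LINE = re.compile(r"([^\s=]+)(?:\s*=\s*|\s+)?(.*)")


def parse_option_lines(pi_body):
    """Parse a PI body holding one option per line into a dict.

    Accepted line forms: "name value", "name=value", or a bare "name",
    which is treated as a boolean flag (an assumption of this sketch).
    """
    options = {}
    for line in pi_body.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines around the option list
        name, value = OPTION_LINE.fullmatch(line).groups()
        options[name] = value if value else True
    return options


multi_line = """
spacingLinear 0.4
spacingNonLinear=0.5
adjustPageHeight
"""
print(parse_option_lines(multi_line))
print(parse_option_lines(" beamFrenchStyle "))
```

The same function covers the one-option-per-PI variant, since each `<?verovio-option ... ?>` body is just a single line.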

When parsing command-line options or inputs via other toolkit versions such as JavaScript and Python, the processing instructions embedded in the MEI file would be set first, and then any options coming from the interface would be set (overriding any options set with the PI). This would allow optional display options for optimal rendering to be documented within the file. Unsetting boolean options from the command line would be difficult without adding something like --adjust-page-height-false, but this is not a problem when setting boolean options in the JavaScript toolkit.

When the output is an MEI file, it would be useful for any input options given during conversion to be added as processing instructions.

Documentation about PI in pugixml:

https://pugixml.org/docs/manual.html

Processing instruction node (node_pi) represent processing instructions (PI) in XML. PI nodes have a name and an optional value, but do not have children/attributes. The example XML representation of a PI node is as follows: <?name value?> Here the name (also called PI target) is "name", and the value is "value". By default PI nodes are treated as non-essential part of XML markup and are not loaded during XML parsing. You can override this behavior with parse_pi flag.

Here is documentation about the parse_pi flag that needs to be added when parsing the XML data:

https://pugixml.org/docs/manual.html#parse_pi

parse_pi determines if processing instructions (nodes with type node_pi) are to be put in DOM tree. If this flag is off, they are not put in the tree, but are still parsed and checked for correctness. Note that <?xml …​?> (document declaration) is not considered to be a PI. This flag is off by default.
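To make the target/value structure of a PI concrete, here is a minimal sketch using Python's standard `xml.dom.minidom`, which keeps processing instructions in its tree without an extra flag. This is only an illustration and not how Verovio or pugixml works internally; the function name `collect_pis` is hypothetical.

```python
from xml.dom import minidom


def collect_pis(xml_text, target="verovio"):
    """Return the text of every top-level PI with the given target.

    Only PIs that are siblings of the root element (before or after it)
    are collected; PIs nested inside the document are ignored.
    """
    doc = minidom.parseString(xml_text)
    return [
        node.data.strip()
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == target
    ]


mei = """<?xml version="1.0"?>
<?xml-model href="mei-all.rng"?>
<?verovio spacingLinear="0.4" ?>
<mei xmlns="http://www.music-encoding.org/ns/mei">
  <?verovio ignored="true" ?>
</mei>
<?verovio adjustPageHeight="true" ?>"""

print(collect_pis(mei))
```

The `<?xml ...?>` declaration is not reported as a PI here either, which matches the pugixml note quoted above.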

ahankinson commented 1 year ago

This is super-interesting! Thanks @craigsapp

Having a look around, it seems that the name="value" style is fairly widely used for processing instructions:

This means either:

<?verovio spacingLinear="0.4" adjustPageHeight="true" beamFrenchStyle="true" ?>

or

<?verovio spacingLinear="0.4" ?>
<?verovio adjustPageHeight="true" ?>
<?verovio beamFrenchStyle="true" ?>

I don't think we need <?verovio-option?> when we enumerate the latter, since it would still target Verovio. The advantage of the latter is that you can use it for repeatable options (not sure if there are any currently?). There's also nothing saying we can't specify both methods, other than the time it takes to implement and maintain both.

Whitespace should be preserved in the PI as [sic] the format of the contents of the PI is not XML

Conversely, since whitespace handling is so problematic within XML files, whether or not they're actually processing XML content, I would suggest making whitespace insignificant since you never know what shenanigans XML parsers get up to.

There should probably also be an option to ignore embedded processing instructions.

rettinghaus commented 1 year ago

What is expected to happen with contradicting information?

E.g., we have the PI:

<?verovio spacingSystem="10" ?>

in the encoding we find:

<scoreDef spacing.system="20" />

and as option we pass:

verovio --spacing-system 15

ahankinson commented 1 year ago

How do we resolve contradictions now?

rettinghaus commented 1 year ago

I think options overrule encodings? Although I'm not sure if this is consistent.

craigsapp commented 1 year ago

I think this way would be the easiest compared to a single PI with multiple options in it:

<?verovio spacingLinear="0.4" ?>
<?verovio adjustPageHeight="true" ?>
<?verovio beamFrenchStyle="true" ?>

The content of a processing instruction is plain text, so you have to parse the string yourself. The complicated part is handling a PI that holds multiple options when one of them has an operand containing a quote character. Another problem is that some verovio options are repeatable (for example, multiple AppXPathQuery entries are allowed), so making the options too XML-attribute-like will confuse people, since real XML attributes cannot be repeated. Presumably single quotes would also be allowed to make the syntax XML-like:

<?verovio spacingLinear='0.4' ?>
<?verovio adjustPageHeight='true' ?>
<?verovio beamFrenchStyle='true' ?>
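A hand-written parser for such pseudo-attributes does not have to be large. The following Python sketch is one possible approach under stated assumptions: values are always quoted, either quote style is allowed, a value may contain the other quote character, and repeated names are kept in order. The name `parse_pseudo_attributes` and the XPath strings are made up for illustration.

```python
import re

# name = "value" or name = 'value'; the value may contain the other quote type.
PSEUDO_ATTR = re.compile(r"""([\w.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def parse_pseudo_attributes(pi_body):
    """Return (name, value) pairs in document order, keeping repeated names."""
    pairs = []
    for match in PSEUDO_ATTR.finditer(pi_body):
        name, double_quoted, single_quoted = match.groups()
        value = double_quoted if double_quoted is not None else single_quoted
        pairs.append((name, value))
    return pairs


pi_body = (
    "appXPathQuery='./rdg[@source=\"src1\"]' "
    'appXPathQuery="./rdg[1]" '
    'spacingLinear="0.4"'
)
for name, value in parse_pseudo_attributes(pi_body):
    print(name, "->", value)
```

Returning a list of pairs rather than a dict is what lets a repeatable option appear more than once.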

I think options overrule encodings? Although I'm not sure if this is consistent.

That would be the best thing to do: encoded values are overridden by PI options which are in turn overridden by command-line/toolkit options.
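The proposed precedence can be written down as a simple layered merge. This is a minimal Python sketch with a hypothetical function name, using the spacingSystem values from the contradiction example earlier in the thread; it says nothing about how Verovio stores options internally.

```python
def resolve_options(encoded, embedded, interface):
    """Merge three option layers; later layers override earlier ones."""
    resolved = dict(encoded)      # values derived from the MEI encoding
    resolved.update(embedded)     # options from <?verovio ... ?> PIs
    resolved.update(interface)    # command-line or toolkit options
    return resolved


resolved = resolve_options(
    encoded={"spacingSystem": 20},
    embedded={"spacingSystem": 10, "adjustPageHeight": True},
    interface={"spacingSystem": 15},
)
print(resolved)
```

Under this ordering the command-line value of 15 wins, while adjustPageHeight survives from the PI because nothing later overrides it.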

An interesting idea is to allow or require verovio PI at certain points in MEI. For example, if there are multiple <mdiv> then any verovio PIs inside of that mdiv only apply to that mdiv (in that case you would have to keep track of the options that are changed by PI, and reverse the changes before processing the next PI).

For a global location applying to all mdivs, it might be useful for them to be present before the root node of the XML data, in a manner similar to the xml-model PIs:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="https://music-encoding.org/schema/dev/mei-all.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="https://music-encoding.org/schema/dev/mei-all.rng" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<?verovio spacingLinear="0.4" ?>
<?verovio adjustPageHeight="true" ?>
<?verovio beamFrenchStyle="true" ?>
<mei xmlns="http://www.music-encoding.org/ns/mei" meiversion="5.0.0-dev">

Note that the xml-model PI attributes are not real XML attributes (but XML parsing code could be recycled to parse them by sticking them into a fake XML document as a regular element to extract the values). But in this case the multiple uses of a single option make it hard to do that.
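The "fake XML document" trick, and the reason it fails for repeated options, can be shown in a few lines. This Python sketch uses the standard `xml.etree.ElementTree` module; the wrapper element name `pi` and the function name are arbitrary choices for illustration.

```python
import xml.etree.ElementTree as ET


def pseudo_attributes_via_xml(pi_body):
    """Parse PI pseudo-attributes by wrapping them in a throwaway element."""
    return dict(ET.fromstring(f"<pi {pi_body} />").attrib)


xml_model = 'href="mei-all.rng" type="application/xml"'
print(pseudo_attributes_via_xml(xml_model))

# A repeated name is not well-formed XML, so the trick breaks down here.
try:
    pseudo_attributes_via_xml('appXPathQuery="a" appXPathQuery="b"')
except ET.ParseError as error:
    print("rejected:", error)
```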


For show-and-tell, I use PI instructions to embed SCORE data for creating SVGs within the SVGs themselves. For example, on this page:

https://en.wikipedia.org/wiki/Piano_Sonata_No._8_(Beethoven)

The PNG images on that page are created from SVG images, such as:

https://upload.wikimedia.org/wikipedia/commons/5/57/Introduction_sonate_path%C3%A9tique.svg

Screenshot 2023-02-20 at 4 07 29 AM

If you view the source code for the SVG image, you will see a SCORE PI at the bottom of the file:

Screenshot 2023-02-20 at 3 53 37 AM

In this case there is one PI with all of the content as opposed to a separate PI for each musical object (represented by one, or sometimes two lines). On the first line of the PI I add a version number for which version of the SCORE editor the data is targeted to.

When the data is formatted to avoid line breaks in the XML attributes, I can even load the SVG image directly into SCORE as if it were a standard macro. Otherwise it is easy for me to copy and paste the contents of the PI into the SCORE editor:

Screenshot 2023-02-20 at 4 17 48 AM

ahankinson commented 1 year ago

Another possibility is to embed a JSON object of the options in the processing instructions.

Advantages:


<?verovio {"spacingLinear": "0.4", "adjustPageHeight":"true", "someRepeatableOption": ["1", "2"]} ?>

craigsapp commented 1 year ago

JSON options sound good -- the best advantage is that you do not need to write any extra parsing code for the verovio PI.

rettinghaus commented 1 year ago

Embedding a JSON object looks bad, and even if processing is way easier, the single option per PI example seems to be the best approach. The real question is: who would make use of this? To me it seems most likely that these would be added by hand, as we do currently with the extMeta element in the test suite.

craigsapp commented 1 year ago

as we do currently with the extMeta element in the test suite.

A ha. I was thinking that embedded options were implemented in verovio, but I was looking in the code for processing instructions rather than having them stored in extMeta as a CDATA comment.

Here are examples using this system:

https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/artic/artic-013.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/beamspan/beamspan-004.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/fing/fing-001.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/gracenote/gracenote-014.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/ligature/ligature-048.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/lyric/lyric-013.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/mdiv/mdiv-001.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/rend/rend-001.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/repeats/rpt-008.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/repeats/rpt-007.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/score/score-009.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/score/score-010.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/section/section-001.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/slur/slur-010.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/slur/slur-019.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/tie/tie-007.mei
https://github.com/rism-digital/verovio.org/blob/gh-pages/_tests/tuplet/tuplet-021.mei

Example from tie-007:

<extMeta><![CDATA[ { "tieEndpointThickness": 0.25, "tieMidpointThickness": 0.25 }]]></extMeta>

Embedding a JSON object looks bad, and even if processing is way easier, the single option per PI example seems to be the best approach.

That extMeta content is JSON data :-)

I have just tested such embedded options, and they do not set the options when rendering with verovio. So you are using an external parser to extract the options which are then set with the toolkit options interface?
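An external extraction step of the kind asked about here could be quite small. The sketch below is a guess at what such a step might look like in Python, not the actual test-suite tooling: it reads the JSON text of the first `<extMeta>` and returns it as a dict, which a caller could then hand to a toolkit's options interface. The function name is hypothetical.

```python
import json
import xml.etree.ElementTree as ET

MEI_NS = "{http://www.music-encoding.org/ns/mei}"


def options_from_ext_meta(mei_text):
    """Return the JSON object stored as text in the first <extMeta>, or {}."""
    root = ET.fromstring(mei_text)
    ext_meta = root.find(f".//{MEI_NS}extMeta")
    # An <extMeta> holding only child elements has whitespace-only text.
    if ext_meta is None or not (ext_meta.text or "").strip():
        return {}
    return json.loads(ext_meta.text)


mei = """<mei xmlns="http://www.music-encoding.org/ns/mei">
  <meiHead>
    <extMeta><![CDATA[ { "tieEndpointThickness": 0.25, "tieMidpointThickness": 0.25 }]]></extMeta>
  </meiHead>
</mei>"""
print(options_from_ext_meta(mei))
```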

This system is ad hoc and not semantically stable, since there is no indication that these are verovio options, and of course this system seems to be external to verovio. A processing-instruction system would look similar, but would make it more obvious what the content is meant for:

<?verovio { "tieEndpointThickness": 0.25, "tieMidpointThickness": 0.25 } ?>

There is a potentially interesting possibility of storing verovio PIs at the start of each mdiv (discussed above). Storing the options in extMeta would limit that possibility. For options that apply to all mdivs, it would be useful to search for them before/after the XML root node. Allowing them after the root node would be useful, since you could concatenate a verovio PI to the end of an MEI data string to embed the options.

craigsapp commented 1 year ago

The real question is: who would make use of this?

My interest in such a feature is because I embed verovio options into Humdrum data:

Screenshot 2023-02-21 at 2 19 52 AM

https://verovio.humdrum.org?t=KiprZXJuCjY0R0wKNjRBCjY0Qgo2NGNKCjE2cnl5Cj0KKi0KISEhdmVyb3ZpbzogYmVhbUZyZW5jaFN0eWxlCiEhIXZlcm92aW86IHNwYWNpbmdOb25MaW5lYXIgMS4wCiEhIXZlcm92aW86IHN0YWZmTGluZVdpZHRoIDAuMwohISF2ZXJvdmlvOiBzdmdDc3MgZy5ub3RlIHsgZmlsbDogaG90cGluayB9IGcuc3RhZmYge2NvbG9yOiBvcmFuZ2V9Cg==

When this data is loaded into verovio, such options are set internally in verovio, so I am not extracting the options beforehand and then using the javascript toolkit to set the options through that interface. This has the advantage that the options are also present when loading the Humdrum data from the command line, or any other toolkit interface, such as the Python one.

When the Humdrum data is converted to MEI in VHV, these embedded options are currently lost:

Screenshot 2023-02-21 at 2 24 58 AM

In this case I converted the Humdrum data into MEI data with the toolkit, displayed it in the text editor, and then loaded the MEI data back into verovio. But now there are no embedded options being sent with the data, hence this feature request.

The options are hidden within extMeta (see line 27 in the above image, and lines 31-32 which pull out the option into an element) where I document all Humdrum reference records, and I could probably extract them via the MEI importer, but a more standard way of doing it would be useful.

I don't see a big problem with using a JSON interface to embed multiple options. If humans are editing/reading the verovio PI, then the JSON data can be prettified:

<?verovio
{
    "spacingLinear": "0.4", 
    "adjustPageHeight": "true", 
    "someRepeatableOption": ["1", "2"]
}
?>
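Reading such a prettified JSON PI needs no custom parser beyond locating the PI itself. Here is a minimal Python sketch using only the standard library; the function name `options_from_pi` is hypothetical, and the sketch assumes the options sit in a single PI placed before the root element.

```python
import json
from xml.dom import minidom


def options_from_pi(mei_text, target="verovio"):
    """Return the JSON options from the first top-level PI with this target."""
    doc = minidom.parseString(mei_text)
    for node in doc.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            # Line breaks and indentation inside the PI are harmless to JSON.
            return json.loads(node.data)
    return {}


mei = """<?xml version="1.0"?>
<?verovio
{
    "spacingLinear": "0.4",
    "adjustPageHeight": "true",
    "someRepeatableOption": ["1", "2"]
}
?>
<mei xmlns="http://www.music-encoding.org/ns/mei"/>"""

options = options_from_pi(mei)
print(options["someRepeatableOption"])
```

A repeatable option simply arrives as a list, which is the case the pseudo-attribute syntax struggles with.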

I only allow a single option setting per line in the Humdrum implementation. This is mostly for readability, as @rettinghaus comments above, but also for simplicity of parsing.

Another interesting concept I implemented is called "options sets" in the embedded verovio options for Humdrum: https://doc.verovio.humdrum.org/options. This allows for different options to be selected for different display purposes of the data. This concept might be useful to implement with embedded options as well.

lpugin commented 1 year ago

as we do currently with the extMeta element in the test suite.

A ha. I was thinking that embedded options were implemented in verovio, but I was looking in the code for processing instructions rather than having them stored in extMeta as a CDATA comment.

Currently, the options in extMeta are not parsed by Verovio. However, we could add an option to trigger that. It does sound a bit silly to have an option to specify options, but I think it would make sense.

I looked at the processing instructions a while ago, and the main drawbacks have been underlined above (additional parsing code, and complication with repeated options), so I would be in favor of using the JSON options even with the processing instruction so that the parsing does not need to be re-implemented.

craigsapp commented 1 year ago

JSON options sound good to me.

What about the idea of allowing some sort of localized options? For example, global settings for the file can be placed above the <mei> root node as processing instructions. But I can also imagine that allowing separate option settings for each <mdiv> would be useful (such as to change the spacing for each movement independently), where ideally each local mdiv option setting would only affect the children of the mdiv and not be visible in other mdivs (when printing all mdivs at once).

<?xml ?>
<?verovio [global options] ?>
<mei>
   <meiHead>
      <extMeta>
         <?verovio [global options alternate storage location] ?>
      </extMeta>
   </meiHead>
<music>
   <body>
      <mdiv>
         <?verovio [mdiv-1 options] ?>
         <score/>
      </mdiv>
      <mdiv>
         <?verovio [mdiv-2 options] ?>
         <score/>
      </mdiv>
      <mdiv>
         <?verovio [mdiv-3 options] ?>
         <score/>
      </mdiv>
   </body>
</music>
</mei>

lpugin commented 1 year ago

Having mdiv-level options would yield all sorts of problems. First of all, changing many of the options in the middle of a score could be nonsense. For example, what would happen if you change the page size? Also, you would need to check that all the options changed in the middle are also (re-)set, or decide what behaviour to expect. That can be tricky. For example, if you change an option in mdiv-2, but not in mdiv-3, would you expect the value in mdiv-3 to be the original one, or the preceding mdiv-2 one? Whichever you decide, it can yield different results once you start removing or adding mdivs with different options.
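The ambiguity in the mdiv-2/mdiv-3 case can be made concrete with a small Python sketch. Both policies below are hypothetical and neither is proposed as Verovio behaviour; the point is only that they disagree about mdiv-3.

```python
def effective_options(base, per_mdiv, carry_over):
    """Compute the options in force for each mdiv under one of two policies.

    carry_over=False: each mdiv starts again from the base options.
    carry_over=True: a change made in one mdiv persists into the next.
    """
    results = []
    current = dict(base)
    for overrides in per_mdiv:
        if not carry_over:
            current = dict(base)
        current.update(overrides)
        results.append(dict(current))
    return results


base = {"spacingLinear": 0.25}
per_mdiv = [{}, {"spacingLinear": 0.4}, {}]  # only mdiv-2 sets the option

print(effective_options(base, per_mdiv, carry_over=False))
print(effective_options(base, per_mdiv, carry_over=True))
```

Under the carry-over policy, deleting mdiv-2 silently changes how mdiv-3 is rendered, which is the kind of surprise described above.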

More importantly, @DavidBauer1984 is currently doing some refactoring that could potentially open the door to some parallel processing, which would be a valuable improvement. Since at this stage it is not very clear how this will be implemented, it would be preferable not to introduce requirements that assume single-threaded processing. I would therefore strongly suggest that for such cases, users should have distinct MEI files and process them separately with the desired options.

There is also the problem of processing instruction handling. In pugixml, you need to enable it explicitly, because by default processing instructions are not loaded into the DOM tree. I would be OK to check for them at the top, but I am not keen at all to start allowing them in the middle of the document because this would yield all sorts of problems. (Typically, we would not want to keep them as "node" in the tree, because that can cause all sorts of typing and counting issues.)

Looking again at the processing instruction, having JSON in it would look unusual but still much better and simpler than using pseudo-attribute values, which are problematic with multi-valued options and would require internal attribute-value parsing and setting to be implemented.

craigsapp commented 1 year ago

Having mdiv-level options would yield all sorts of problems.

OK. That can be considered later if anyone has a specific need for it.

There is also the problem of processing instruction handling. In pugixml, you need to enable it explicitly, because by default processing instructions are not loaded into the DOM tree. I would be OK to check for them at the top, but I am not keen at all to start allowing them in the middle of the document because this would yield all sorts of problems.

Allowing them only before the root node sounds good.


Another way of storing options in an XML manner would be to formalize verovio option content in extMeta, perhaps something like:

<extMeta>
     <verovio>
          <option name="adjustPageHeight" value="true" />
          <option name="spacingNonLinear" value="0.533" />
     </verovio>
</extMeta>

This would avoid having to activate the PI inclusion flag in the XML parsing function. Also, options with the same name could be given multiple times. This method would require minimal processing, just slightly more than JSON data. You would look for any <verovio> elements that are children of <extMeta> and then find the <option> elements that are children of <verovio>.
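The lookup described here is a two-level child search. A minimal Python sketch with a hypothetical function name follows; it parses the `<extMeta>` fragment on its own, without a namespace, whereas inside a real MEI file the elements would inherit the MEI default namespace unless they declared their own.

```python
import xml.etree.ElementTree as ET


def read_option_elements(ext_meta_text):
    """Return (name, value) pairs from <verovio>/<option> children of <extMeta>."""
    ext_meta = ET.fromstring(ext_meta_text)
    return [
        (option.get("name"), option.get("value"))
        for verovio in ext_meta.findall("verovio")
        for option in verovio.findall("option")
    ]


ext_meta = """<extMeta>
     <verovio>
          <option name="adjustPageHeight" value="true" />
          <option name="appXPathQuery" value="./rdg[1]" />
          <option name="appXPathQuery" value="./rdg[2]" />
     </verovio>
</extMeta>"""
print(read_option_elements(ext_meta))
```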

It would probably be useful to allow multiple <verovio> elements in case there are options for various target renderings. I find this feature useful in the Humdrum interface for embedded verovio options.

<extMeta>
     <verovio>
        <!-- default options that would always be applied unless embedded options are ignored -->
     </verovio>
     <verovio option-set="portrait">
          <option name="spacingNonLinear" value="0.4" />
     </verovio>
     <verovio option-set="landscape">
          <option name="landscape" value="true" />
          <option name="evenNoteSpacing" value="true" />
     </verovio>
</extMeta>

Then there could be an option added to the verovio toolkit interfaces which could be used to select which option set to apply, such as --embedded-options-set landscape. And of course an option to avoid using embedded options, such as --embedded-options-ignore.
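One way the selection logic could work is sketched below in Python. The behaviour is an assumption, not something settled in this thread: the unnamed `<verovio>` block is always applied first, the named set is layered on top, and an ignore switch returns nothing. The function and parameter names are hypothetical stand-ins for the proposed --embedded-options-set and --embedded-options-ignore.

```python
import xml.etree.ElementTree as ET


def select_options(ext_meta_text, option_set=None, ignore_embedded=False):
    """Apply the unnamed default <verovio> block, then the selected option set."""
    if ignore_embedded:
        return {}
    selected = {}
    ext_meta = ET.fromstring(ext_meta_text)
    # Two passes so that defaults are always applied before the named set,
    # regardless of the order of the <verovio> elements in the file.
    for wanted in (None, option_set):
        for verovio in ext_meta.findall("verovio"):
            if verovio.get("option-set") == wanted:
                for option in verovio.findall("option"):
                    selected[option.get("name")] = option.get("value")
        if option_set is None:
            break  # no named set requested, so one pass is enough
    return selected


ext_meta = """<extMeta>
     <verovio>
          <option name="adjustPageHeight" value="true" />
     </verovio>
     <verovio option-set="portrait">
          <option name="spacingNonLinear" value="0.4" />
     </verovio>
     <verovio option-set="landscape">
          <option name="landscape" value="true" />
          <option name="evenNoteSpacing" value="true" />
     </verovio>
</extMeta>"""
print(select_options(ext_meta, option_set="landscape"))
```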

craigsapp commented 1 year ago

Also, the option set feature would be able to handle cases where mdivs might need different options:

<extMeta>
     <verovio>
        <!-- default options that would always be applied unless embedded options are ignored -->
     </verovio>
     <verovio option-set="mdiv1">
          <option name="spacingNonLinear" value="0.4" />
     </verovio>
     <verovio option-set="mdiv2">
          <option name="landscape" value="true" />
          <option name="evenNoteSpacing" value="true" />
     </verovio>
</extMeta>

Then when printing all mdivs at once, the default embedded options would be used, and if only printing one mdiv, an option such as --embedded-options-set mdiv1 or --embedded-options-set mdiv2 could be given to the toolkit.

craigsapp commented 1 year ago

I give examples with --embedded-options-set and <verovio option-set=""/>. Either option or options should be used (with or without s in both places for consistency).

craigsapp commented 1 year ago

Another possible enhancement could be to specify the verovio version number:

<extMeta>
     <verovio version=">3.12">
          <option name="marginTop" value="50" />
     </verovio>
     <verovio version="&lt;=3.11">
          <option name="margnTop" value="100" />
     </verovio>
</extMeta>

This can be useful to handle cases when the option names or behaviors change between different versions of verovio, which will mostly be useful when running different versions of verovio on the test suite. If there is no version number, then it is assumed to be applicable to all versions of verovio. Of course, older versions of verovio would not know about this system (but the options are currently extracted externally to verovio, so you could implement parsing of this data to give such options to older versions of the toolkit in the current manner, using the current extMeta embedded options system as a CDATA comment).
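Matching a version constraint of this form is straightforward once the comparison rules are fixed. The Python sketch below makes several assumptions that the thread does not settle: a constraint is one comparison operator followed by a dotted number, "3.12" is treated as "3.12.0", suffixes such as "-dev" are ignored, and a missing constraint matches every version. All names are hypothetical. Note also that in well-formed XML a literal `<` inside an attribute value has to be written as `&lt;`.

```python
import operator
import re

COMPARATORS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
    "=": operator.eq, "==": operator.eq,
}


def version_tuple(text, width=3):
    """Turn "3.12" into (3, 12, 0); suffixes such as "-dev" are dropped."""
    numbers = [int(part) for part in re.findall(r"\d+", text.split("-")[0])]
    return tuple((numbers + [0] * width)[:width])


def version_matches(constraint, running_version):
    """Check a constraint such as ">3.12"; no constraint matches everything."""
    if not constraint:
        return True
    match = re.fullmatch(r"\s*(<=|>=|==|=|<|>)?\s*([\d.]+)\s*", constraint)
    if match is None:
        raise ValueError(f"unrecognised version constraint: {constraint!r}")
    compare = COMPARATORS[match.group(1) or "="]
    return compare(version_tuple(running_version), version_tuple(match.group(2)))


print(version_matches(">3.12", "3.15.0-dev-dcf44a1"))
print(version_matches("<=3.11", "3.15.0-dev-dcf44a1"))
```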

ahankinson commented 1 year ago

Having a <verovio> tag in the encoding would be extremely bad, IMO. The XML will fail validating unless the <verovio> tag is added to MEI itself, and that's a really bad move since it ties the encoding to a specific piece of software.

<extMeta> for the purposes of renderer options is already a hack, since the intention of the tag is to allow external, non-XML metadata (i.e., marc21) to be stored in the MEI files. It's not a free bucket where you can just chuck whatever you want into it.

So I'm a big thumbs down on that.

craigsapp commented 1 year ago

Having a <verovio> tag in the encoding would be extremely bad, IMO. The XML will fail validating unless the <verovio> tag is added to MEI itself, and that's a really bad move since it ties the encoding to a specific piece of software.

No. <extMeta> is for doing such things: MEI should ignore the contents of <extMeta> and not try to validate its content.

It's not a free bucket where you can just chuck whatever you want into it.

Is there documentation saying so?

https://music-encoding.org/guidelines/v4/elements/extmeta.html

"(extended metadata) – Provides a container element for non-MEI metadata formats."

Verovio options are metadata.

Also see this discussion: https://github.com/music-encoding/music-encoding/issues/314#issuecomment-229139893

ahankinson commented 1 year ago

By that argument having IE6-specific JavaScript in your websites was fine.

There's a reason why we shouldn't tie encoding contents to specific software applications. Processing instructions are kinda-sorta OK since they explicitly target specific applications, but adding software-specific options to documents is a bad idea, IMO.

Just because you can, doesn't mean you should.

craigsapp commented 1 year ago

By that argument having IE6-specific JavaScript in your websites was fine.

There's a reason why we shouldn't tie encoding contents to specific software applications. Processing instructions are kinda-sorta OK since they explicitly target specific applications, but adding software-specific options to documents is a bad idea, IMO.

Just because you can, doesn't mean you should.

I do not understand. Please provide MEI documentation about this since you are contradicting what @pe-ro said in the cited discussion. @pe-ro can also give his opinion on where/how verovio options should be embedded in MEI data according to any update in thinking about the use of <extMeta>.


I use <extMeta> for storing Humdrum reference records in MEI files, which is done for two purposes: (1) If there is no clear mapping to MEI metadata then the source encoding metadata can be referred to, and (2) for enabling round-trip conversions where the original metadata remains unaltered.

Example:

!!!OTL: Title of work
!!!COM: Composer of work
**kern
1c;
*-

MEI conversion:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="https://music-encoding.org/schema/dev/mei-all.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="https://music-encoding.org/schema/dev/mei-all.rng" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<mei xmlns="http://www.music-encoding.org/ns/mei" meiversion="5.0.0-dev">
 <meiHead>
  <fileDesc>
   <titleStmt>
    <title>Title of work</title>
   </titleStmt>
   <pubStmt />
  </fileDesc>
  <encodingDesc>
   <appInfo>
    <application isodate="2023-03-03T02:59:58" version="3.15.0-dev-dcf44a1">
     <name>Verovio</name>
     <p>Transcoded from Humdrum</p>
    </application>
   </appInfo>
  </encodingDesc>
  <workList>
   <work>
    <title xml:id="title-L1" analog="humdrum:OTL" type="main">Title of work</title>
    <composer analog="humdrum:COM" xml:id="person-L2">Composer of work</composer>
   </work>
  </workList>
  <extMeta>
   <frames xmlns="http://www.humdrum.org/ns/humxml">
    <metaFrame n="0" token="!!!OTL: Title of work" xml:id="L1">
     <frameInfo>
      <startTime float="0" />
      <frameType>reference</frameType>
      <referenceKey>OTL</referenceKey>
      <referenceValue>Title of work</referenceValue>
     </frameInfo>
    </metaFrame>
    <metaFrame n="1" token="!!!COM: Composer of work" xml:id="L2">
     <frameInfo>
      <startTime float="0" />
      <frameType>reference</frameType>
      <referenceKey>COM</referenceKey>
      <referenceValue>Composer of work</referenceValue>
     </frameInfo>
    </metaFrame>
   </frames>
  </extMeta>
 </meiHead>
 <music>
  <body>
   <mdiv xml:id="m1r2beg3">
    <score xml:id="sm83m1n">
     <scoreDef xml:id="sn5j893" midi.bpm="400.000000">
      <pgHead xml:id="pdmu8yc">
       <rend xml:id="r1dedykf" halign="center" valign="middle">
        <rend xml:id="r1h95cto" fontsize="large">Title of work</rend>
        <lb xml:id="leb4gq4" /> 
<lb xml:id="l363vx8" /> </rend>
       <rend xml:id="r1hjet2t" halign="right" valign="bottom" fontsize="small">Composer of work</rend>
      </pgHead>
      <staffGrp xml:id="s1say8u8">
       <staffDef xml:id="staffdef-L3F1" n="1" lines="5">
        <clef xml:id="cwjwvg4" shape="G" line="2" />
       </staffDef>
      </staffGrp>
     </scoreDef>
     <section xml:id="section-L3F1">
      <measure xml:id="measure-L1" right="invis">
       <staff xml:id="staff-L3F1" n="1">
        <layer xml:id="layer-L1F1N1" n="1">
         <note xml:id="note-L4F1" dur="1" oct="4" pname="c" accid.ges="n" />
        </layer>
       </staff>
       <fermata xml:id="fermata-L4F1" staff="1" startid="#note-L4F1" place="above" />
      </measure>
     </section>
    </score>
   </mdiv>
  </body>
 </music>
</mei>

Extracted <extMeta> content:

  <extMeta>
   <frames xmlns="http://www.humdrum.org/ns/humxml">
    <metaFrame n="0" token="!!!OTL: Title of work" xml:id="L1">
     <frameInfo>
      <startTime float="0" />
      <frameType>reference</frameType>
      <referenceKey>OTL</referenceKey>
      <referenceValue>Title of work</referenceValue>
     </frameInfo>
    </metaFrame>
    <metaFrame n="1" token="!!!COM: Composer of work" xml:id="L2">
     <frameInfo>
      <startTime float="0" />
      <frameType>reference</frameType>
      <referenceKey>COM</referenceKey>
      <referenceValue>Composer of work</referenceValue>
     </frameInfo>
    </metaFrame>
   </frames>
  </extMeta>

When there is a mapping to MEI metadata, then it is inserted into the MEI header:

   <titleStmt>
    <title>Title of work</title>
   </titleStmt>
   <work>
    <title xml:id="title-L1" analog="humdrum:OTL" type="main">Title of work</title>
    <composer analog="humdrum:COM" xml:id="person-L2">Composer of work</composer>
   </work>
      <pgHead xml:id="pdmu8yc">
       <rend xml:id="r1dedykf" halign="center" valign="middle">
        <rend xml:id="r1h95cto" fontsize="large">Title of work</rend>
        <lb xml:id="leb4gq4" /> 
<lb xml:id="l363vx8" /> </rend>
       <rend xml:id="r1hjet2t" halign="right" valign="bottom" fontsize="small">Composer of work</rend>
      </pgHead>

ahankinson commented 1 year ago

Sure; using <extMeta> for storing Humdrum metadata is fine. That's what it's supposed to be used for.

This conversation, I thought, was about passing options to Verovio for rendering. Those are two different things. Perry gave his opinion specifically about storing Humdrum metadata, but I feel you're being a bit disingenuous by then co-opting that as a precedent for including Verovio rendering options in the same place.

Verovio options are not metadata. You may consider them "metadata" in Humdrum/**kern, because they go in the header and you can do whatever you want with them because you control that format, but I don't think:

<verovio version=">3.12">
          <option name="marginTop" value="50" />
</verovio>

and

<frameInfo>
      <startTime float="0" />
      <frameType>reference</frameType>
      <referenceKey>OTL</referenceKey>
      <referenceValue>Title of work</referenceValue>
</frameInfo>

are in the same class of data. If you consider metadata to be information about the document (title, author, key, etc.) then the MEI documentation is already pretty clear about the use of <extMeta>. If, however, your definition of metadata is more expansive than mine to include information about how the document should be processed by a specific piece of software, then we're not going to solve it here.

ahankinson commented 1 year ago

Also a note that in the cited conversation Perry also says you can embed the entire humxml document in <extMeta>, but I don't think that was meant as an encouragement to do so -- it was simply to point out that there were no limitations to what you can store there. (His comments there did not have the sound of a ringing endorsement of this practice.)

You can probably agree with me that storing the entire document would be stretching the definition in the documentation:

"(extended metadata) – Provides a container element for non-MEI metadata formats."

So, again, I would take elements of that conversation and interpret them as "Just because you can, doesn't mean you should."

craigsapp commented 1 year ago

You should wait until @pe-ro gives his opinion. Yes, PIs are designed for software-specific directions, but @pe-ro was more inclined that Humdrum metadata be placed in <extMeta> (note that verovio options are embedded in Humdrum reference records, and these entries already show up in the Humdrum <extMeta> content). The use of <extMeta> is software-specific since MEI has no idea how to deal with the contents of <extMeta>. If software such as verovio reads MEI data, and non-MEI XML data is desired for use in that software, the only place it is allowed to be stored is in <extMeta>.

However, placing XML content inside of a PI is possible, and could be done if embedding the options in a JSON string is not desired. Since PI content is an unparsed string, an entire XML file can be embedded within it (provided that ?> is not used). The string could be extracted from the PI and loaded into a separate xml_document in pugixml for the purposes of extracting the verovio options (and then discarded):

    // verovioPI is assumed to be a std::string holding the text content of the PI.
    pugi::xml_document verovioOptions;
    pugi::xml_parse_result result = verovioOptions.load_string(verovioPI.c_str());
    if (!result) {
        cerr << "Verovio processing instruction parsing error: " << result.description() << endl;
        cerr << verovioPI << endl;
        return;
    }

The above code would be used to load verovio options from an XML-formatted string inside of the <?verovio ?> PI:

<?verovio
     <options>
          <option name="adjustPageHeight" value="true" />
          <option name="spacingNonLinear" value="0.533" />
     </options>
?>

Note that a root node would be required (called <options> here). I wonder if <?xml ?> would be required when using pugi::xml_document::load_string(). If so, then verovio can add it to the start of the extracted PI content so that it would not need to be encoded in the PI itself (since the PI cannot contain ?>).

lpugin commented 1 year ago

Having XML within the PI starts to be really weird....

I think having a single PI with JSON is acceptable. This means that

<?verovio
{
    pageWidth: 2300,
    scaleToPageSize: true,
}
?>

is something we could add. One question is whether we need a way to disable that (meaning it would be enabled by default), or whether people wanting to disable it would need to pass each (default) option explicitly. Maybe an additional --pi-disable option would be simpler.

Additionally, we can add an option for the command-line version to load the options from a JSON file. Something like

verovio --option-file mdiv1.json input.mei

Then I think there would be plenty of ways for users to build the workflow they need.

craigsapp commented 1 year ago

I think that by default, parsing embedded options should be enabled. And of course, any command-line options (toolkit options) should be allowed to override the embedded options. In other words, the embedded options should be set first, and then the command-line/toolkit options. This might be problematic if options are set first in a vrvToolkit interface and then a data file is loaded with embedded options.
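One way around the ordering problem would be to keep the two sources in separate layers and combine them only when the effective value is needed. The Python sketch below is a design illustration only; the class `OptionStore` and its methods are hypothetical and do not correspond to the real toolkit API.

```python
class OptionStore:
    """Keep embedded and explicitly set options in separate layers."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)
        self.embedded = {}   # filled when a file with embedded options is loaded
        self.explicit = {}   # filled by command-line or toolkit calls

    def set_options(self, options):
        self.explicit.update(options)

    def load_embedded(self, options, use_embedded=True):
        # Replace rather than update: a new file brings its own embedded options.
        self.embedded = dict(options) if use_embedded else {}

    def effective(self):
        # Explicit options win no matter when they were set.
        return {**self.defaults, **self.embedded, **self.explicit}


store = OptionStore({"pageWidth": 2100, "scaleToPageSize": False})
store.set_options({"pageWidth": 1800})          # toolkit call happens first
store.load_embedded({"pageWidth": 2300, "scaleToPageSize": True})
print(store.effective())
```

With this layering, the earlier toolkit call still overrides the embedded pageWidth even though the file was loaded afterwards.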

--pi-disable is not so great because a general user would not understand what pi is. --embedded-options-disable is clearer.

I have not used --engraving-defaults-file, but is that not already similar to --option-file? In any case I do not have an immediate need of storing separate options for different mdivs, so such an enhancement can be implemented in the future if someone has a good reason to need it.

Then I think there would be plenty of ways for users to build the workflow they need.

The --options-file has the problem that there are two separate files (ideally the options would be embedded in the MEI file so that it is hard to lose the preferred options). And I wonder how --options-file would work in the JavaScript toolkit :-). There is --engraving-defaults, so there could be --options (but probably disable that option inside of the option string to avoid the recursion that it might otherwise allow).


As an aside, here is JSON data:

<?verovio
{
    "pageWidth": 2300,
    "scaleToPageSize": true
}
?>

lpugin commented 1 year ago

And I wonder how --options-file would work in the javascript toolkit :-)

by passing the content of the file to setOptions() ;-) It is more to fill a gap in the command-line tool, but if you do not think that is useful at this stage we can wait until there is a clear / better need for it.

craigsapp commented 1 year ago

Yes, at the moment it is overkill, but it is good to keep in mind future directions now rather than later.

frakel commented 11 months ago

One short follow-up suggestion from my side: it would be great to have the Verovio version directly in the JSON structure and not somewhere else (e.g. in the encodingDesc). This could be attained with a short extension of the already advocated PI/JSON format:

<?verovio
{
    "3.17": {"spacing-linear": "0.15", "lyric-size": "4.8", "lyric-word-space": "1"}
}
 ?>