TAVERN Cadenzas - use long fake long measures in rntxt analyses

MarkGotham / When-in-Rome

meta-corpus of and code library for the functional harmonic analysis of music


TAVERN Cadenzas - use long fake long measures in rntxt analyses #40

Open jonnybluesman opened 2 years ago

jonnybluesman commented 2 years ago

Hello, I am currently parsing the analysis files in the corpus with the music21 converter. While doing so, I found a couple of annotations with potentially inconsistent timings (see below).

Etudes_and_Preludes/Bach,_Johann_Sebastian/The_Well-Tempered_Clavier_I/03/analysis.txt, which raises:
- music21.romanText.translate.RomanTextTranslateException: too many notes in this measure: m103 I64 b3 V7
Variations_and_Grounds/Mozart,_Wolfgang_Amadeus/_/K398/analysis_A.txt, which raises:
- music21.romanText.translate.RomanTextTranslateException: too many notes in this measure: m156 I64 b7 I64 b13 I64 b19 I64
- music21.romanText.translate.RomanTextTranslateException: too many notes in this measure: m161 I64 b5 I64 b9 I64 b11 I64 b15 I64

Hope this helps, and thanks a lot for this fantastic corpus!
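
For anyone wanting to reproduce a report like this, here is a minimal sketch of one way to collect such failures across many analysis files. The helper names `parse_with_music21` and `collect_parse_failures` are hypothetical (not part of music21 or this repository), and catching every exception type is a simplifying assumption.

```python
from typing import Callable, Dict, Iterable


def parse_with_music21(path: str):
    """Parse one RomanText file with music21 (assumes music21 is installed)."""
    # Imported lazily so the collection helper below can be used with
    # another parser, or tested, without music21 being available.
    from music21 import converter

    return converter.parse(path, format="romanText")


def collect_parse_failures(
    paths: Iterable[str],
    parse: Callable[[str], object] = parse_with_music21,
) -> Dict[str, str]:
    """Try to parse every path; return {path: "ExceptionName: message"} for failures."""
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            parse(path)
        except Exception as exc:  # deliberately broad: we only want a report
            failures[path] = f"{type(exc).__name__}: {exc}"
    return failures
```

Calling `collect_parse_failures` on a list of `analysis.txt` paths returns only the files that failed, together with the exception text, which is roughly the shape of the list above.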

MarkGotham commented 2 years ago

Hi @jonnybluesman! Thanks for reporting these.

The Bach is due to a recent implementation of 'fast' vs 'slow' 3/X time (see the music21 issue https://github.com/cuthbertLab/music21/issues/1185). Should be improved now.
The Mozart might be a bigger issue with cadenzas or something. I'll look into it as part of a wider attempt to make sure score'n'analysis are always there and always match up.

Let us know if you spot anything else.

MarkGotham commented 2 years ago

@napulen can you update us on your solutions to Mozart variation cadenzas?

napulen commented 2 years ago

I believe my general solution to cadenzas was to either:

- Remove the cadenza from the score, verifying that the rntxt would align in measure counts with the modified score
- Preserve the cadenza in the score, and hardcode a Time Signature: change in the rntxt file for that measure, so that the duration of the measure (with all the cadenza content) would match the time signature in rntxt.

I did this arbitrarily for what best served the purpose of training the machine learning model, so I wouldn't impose the specific decisions here.
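
To illustrate the second option, here is a rough sketch of how the "long fake measure" time signature could be derived from the cadenza measure's total duration. The function name `long_measure_time_signature` is hypothetical, and it assumes the duration is known in quarter notes (for instance, measured from the score); it is not the code used for AugmentedNet.

```python
from fractions import Fraction


def long_measure_time_signature(quarter_length, max_denominator=64):
    """Return an rntxt 'Time Signature:' line spanning `quarter_length` quarter notes.

    Starts from a denominator of 4 and doubles it until the numerator is a
    whole number, so a measure of 20.5 quarter notes becomes 41/8.
    """
    duration = Fraction(quarter_length)
    if duration <= 0:
        raise ValueError("measure duration must be positive")

    denominator = 4
    while denominator <= max_denominator:
        # Number of 1/denominator notes that fit in the measure.
        numerator = duration * denominator / 4
        if numerator.denominator == 1:
            return f"Time Signature: {numerator.numerator}/{denominator}"
        denominator *= 2
    raise ValueError(f"cannot express {duration} quarter notes up to /{max_denominator}")
```

For example, `long_measure_time_signature(20)` gives `Time Signature: 20/4`. That line would go before the cadenza measure in the rntxt file, and the original time signature would need to be restored on the following measure.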

MarkGotham commented 1 year ago

Hey @napulen,

We now have a consistent system for keeping the "original conversion" (if you'll permit the oxymoronic turn of phrase ...) alongside an alteration. See for example changes to the Beethoven piano sonata + BPS original.

Let's frame this that way, so ...

- Yes please to your choice of edits ...
- Along with an analysis_TAVERN_A.txt file for the unmodified form.

In all cases analysis.txt is the recommended version.

Thanks!

(p.s. Stay tuned for flexible run-time alignment of score and analysis)

napulen commented 1 year ago

I'm unable to provide a pull request at this time, but all matters TAVERN for AugmentedNet are clearly laid out in this file, which is the authority for everything that the neural network is using: https://github.com/napulen/AugmentedNet/blob/main/AugmentedNet/data/tavern.py

Note that an entry sometimes points to the "original conversion" and sometimes to a so-called "correction", as in this case:

    "tavern-mozart-k353-a": (
        "rawdata/corrections/WiR/Corpus/Variations_and_Grounds/Mozart,_Wolfgang_Amadeus/_/K353/analysis_A.txt",
        "rawdata/corrections/Tavern/Mozart/K353.mxl",
    ),

Those corrections are often pairs of modified RomanText and MusicXML files. The tuples in that Python module are to be trusted: each one points to a score-annotation pair that has been manually verified to produce minimal errors/misalignment.
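
As a sketch of how such a mapping might be consumed, the snippet below checks that both files of every (annotation, score) tuple exist under a local checkout. The names `EXAMPLE_PAIRS` and `missing_files` are hypothetical, and the single entry is copied from the example above; the real variable name in tavern.py may differ.

```python
from pathlib import Path

# Hypothetical stand-in for the mapping in tavern.py: name -> (rntxt path, score path).
EXAMPLE_PAIRS = {
    "tavern-mozart-k353-a": (
        "rawdata/corrections/WiR/Corpus/Variations_and_Grounds/Mozart,_Wolfgang_Amadeus/_/K353/analysis_A.txt",
        "rawdata/corrections/Tavern/Mozart/K353.mxl",
    ),
}


def missing_files(pairs, root="."):
    """Return {entry name: [paths that do not exist under root]}."""
    root = Path(root)
    missing = {}
    for name, (analysis, score) in pairs.items():
        absent = [p for p in (analysis, score) if not (root / p).is_file()]
        if absent:
            missing[name] = absent
    return missing
```

An empty result from `missing_files(EXAMPLE_PAIRS, root=...)` means every listed pair is present; it says nothing about whether the score and annotation actually align.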

napulen commented 1 year ago

All data, including the "corrected" (modified is a better term) pairs, are publicly available in the repository.