MarkGotham / When-in-Rome

meta-corpus of and code library for the functional harmonic analysis of music

Can't parse Beethoven quartets #66

Closed giamic closed 1 year ago

giamic commented 1 year ago

The score of the Beethoven quartets can't be parsed. I receive the following error:

rs/micchig/PycharmProjects/When-in-Rome/Corpus/Quartets/Beethoven,_Ludwig_van/Op059_No2/4/
Error with: /Users/micchig/PycharmProjects/When-in-Rome/Corpus/Quartets/Beethoven,_Ludwig_van/Op059_No2/4/. not a valid pitch specification: e.VI

Other scores give similar errors; the pattern is that a label of the form key.degree is rejected as an invalid pitch specification.
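To make the pattern concrete, here is a minimal sketch that recognises labels of the key.degree shape, with or without a leading dot. The regular expression and the function name `split_keyed_label` are illustrative assumptions that cover only the forms quoted in this thread (e.VI, F.I, .e.I); this is not the full DCML grammar.

```python
import re

# Assumption: a "keyed" label is an optional leading dot, a key written as a
# note letter plus optional accidentals, a dot, and then the rest of the label.
# This is a simplification for illustration, not the DCML specification.
KEYED_LABEL = re.compile(r"^\.?(?P<key>[A-Ga-g][#b]*)\.(?P<rest>.+)$")


def split_keyed_label(text):
    """Return (key, rest) for labels such as 'e.VI', or None if no match."""
    match = KEYED_LABEL.match(text)
    if match is None:
        return None
    return match.group("key"), match.group("rest")


for label in ["e.VI", ".e.I", "F.I", "Cmaj7"]:
    print(label, "->", split_keyed_label(label))
```

A check like this could be used to decide which labels to skip before handing the rest to a chord-symbol parser.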

MarkGotham commented 1 year ago

Thanks for flagging this.

These are the DCML quartet scores, which carry the analysis on the score as chord labels, and I'm guessing music21 raises an error on the first such label it encounters in each file? If so, we're likely to have the same issue with their other corpora.

In v1 they had the format .e.I (i.e., with a leading dot). With that form, music21 (and also MuseScore, for that matter) knew not to try to parse it as an actual chord label. I think the change to e.I is associated with the move from chord symbols to Roman numerals. Right, @johentsch? Several possible solutions:

  1. modify music21 to parse the analysis with only a warning instead of raising an error. That was accepted as a solution for an issue with dynamics, though getting that kind of thing into the central music21 repo is harder these days, so we'd potentially face hosting a version ourselves (dis-preferred!).
  2. take the DCML scores without analyses - @johentsch is that version public somewhere?
  3. failing 2, provide such a version here, adding some benefit to the duplication.
  4. Go back to DCML v1 (dis-preferred, assuming v2 is better!)
  5. use the ms parser. Would add another dependency, but may be worth it. @johentsch:
    • is that an appropriate and equivalent tool for this?
    • any recommendations for how to use?
    • any prospects for it getting integrated with music21?
  6. No doubt, other possibilities too.
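As a rough illustration of option 1 (warn instead of raise), the general pattern might look like the sketch below. It does not touch music21: `strict_parser` is a hypothetical stand-in for whatever routine rejects the label, and `ValueError` stands in for the library's own exception types.

```python
import warnings


def parse_labels_leniently(labels, parse_one):
    """Parse each label; on failure, warn and skip it instead of raising."""
    parsed = []
    for label in labels:
        try:
            parsed.append(parse_one(label))
        except ValueError as err:  # stand-in for the real exception type
            warnings.warn(f"Skipping unparseable label {label!r}: {err}")
    return parsed


def strict_parser(label):
    """Hypothetical strict parser: rejects any label containing a dot."""
    if "." in label:
        raise ValueError("not a valid pitch specification")
    return label


print(parse_labels_leniently(["C", "e.VI", "G7"], strict_parser))
```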

johentsch commented 1 year ago

Hi, good timing, I've just thrown out v2 of the Mozart sonatas and ABC is the next thing I'm tackling.

@giamic I'm missing the context of what you're trying to achieve, but this branch has a more up-to-date version of the ABC, which I'm transforming into the next release these days.

MarkGotham commented 1 year ago

Thanks @johentsch !

I think it's a question of MuseScore converting your files to mxl fine, but including the analyses in those mxl files in a way that music21 then cannot parse. We get the analysis from your tsv files, so we don't need it on the score.

I'm open to any of the workarounds sketched above, e.g., if your parser can provide MuseScore-to-musicxml conversion, and also (optionally) remove the labels, then that would be about as efficient as realistically possible here, right? We could even slip the DCML-to-rntxt conversion into that process and drop the tsv files altogether?

giamic commented 1 year ago

Just to confirm that the problem is exactly what Mark said.

MarkGotham commented 1 year ago

An update on testing this issue ... I've checked:

  1. the current syntax F.I and also (via a manual change) the previous one, .F.I
  2. starting from the mxl files here, and also from the DCML mscx originals.
  3. reverting to MuseScore chord labels.

Every case raises an error:

  1. music21.harmony.HarmonyException: not a valid pitch specification
  2. music21.harmony.HarmonyException: not a valid pitch specification
  3. music21.pitch.PitchException: Cannot make a step out of

giamic commented 1 year ago

I'm eager to see the new version of the data!

For the time being, I think removing the analysis from the score is the way to go; I guess we could easily update the score files in our repo to remove the lyrics line.
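A minimal sketch of that removal, using only the Python standard library, could look like the following. It assumes the labels are stored as `<harmony>` elements in uncompressed MusicXML (a guess based on the HarmonyException above; if they are actually lyrics, the tag name would need changing). The function name `strip_harmony` is made up for illustration, and unzipping and re-zipping the compressed mxl container is left out.

```python
import xml.etree.ElementTree as ET


def strip_harmony(musicxml_text, tag="harmony"):
    """Return the MusicXML text with every <tag> element removed.

    Note: ET.tostring does not re-emit the XML declaration or DOCTYPE,
    so those would need to be added back when writing a real file.
    """
    root = ET.fromstring(musicxml_text)
    # Snapshot the elements first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<score-partwise><part id="P1"><measure number="1">'
    "<harmony><root><root-step>E</root-step></root></harmony>"
    "<note><rest/><duration>4</duration></note>"
    "</measure></part></score-partwise>"
)
print(strip_harmony(sample))
```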

napulen commented 1 year ago

Interesting. I just checked and it seems I have been parsing these files with the corresponding annotations without a problem.

Important note, I am using music21==6.7.1. In which version are you getting this error? Would you be able to try 6.7.1?
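For anyone comparing environments, a small standard-library sketch for reporting the installed version is shown below; the helper name `installed_version` is just an illustrative choice.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("music21:", installed_version("music21"))
```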

Also, I have always relied on the (converted) mxl files rather than the tsv files for DCML datasets. The location of the annotation is not always encoded in the MusicXML, but this can be fixed by adding an additional staff with rests and cut-and-pasting the annotations into that staff. An example is available here for Mozart sonatas: https://github.com/napulen/mozart_piano_sonatas/blob/0df9971d37ff28bee2b5e564c0dba562c2fab443/mxl/K279-1.mxl

In my experience, this has always led to an easier alignment and easier detection of misalignments.

johentsch commented 1 year ago

"I'm eager to see the new version of the data!"

@giamic, version 2.0 is out, check it out....

MarkGotham commented 1 year ago

Issue:

Status:

Action: