smashub / choco

ChoCo: the Chord Corpus

Bug in the conversion of key_mode annotations #116

Closed rubencart closed 2 months ago

rubencart commented 6 months ago

jams.load('data/.../billboard_10.jams') gives:

Traceback (most recent call last):
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 774, in validate
    schema.VALIDATOR.validate(data_ser, ann_schema)
  File "/cw/liir_code/NoCsBack/rubenc/miniconda3/envs/harmonyenv/lib/python3.12/site-packages/jsonschema/validators.py", line 353, in validate
    raise error
jsonschema.exceptions.ValidationError: -35.884433106499955 is less than the minimum of 0.0
Failed validating 'minimum' in schema['items']['properties']['duration']:
    {'minimum': 0.0, 'type': 'number'}
On instance[25]['duration']:
    -35.884433106499955
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  ...
    audio_jams = jams.load(str(path))
                 ^^^^^^^^^^^^^^^^^^^^
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 216, in load
    jam.validate(strict=strict)
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 1813, in validate
    valid &= ann.validate(strict=strict)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 778, in validate
    raise SchemaError(str(invalid))
jams.exceptions.SchemaError: -35.884433106499955 is less than the minimum of 0.0
Failed validating 'minimum' in schema['items']['properties']['duration']:
    {'minimum': 0.0, 'type': 'number'}
On instance[25]['duration']:
    -35.884433106499955
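
For anyone wanting to find every affected file without triggering validation, here is a minimal sketch that reads the JAMS file as plain JSON and lists observations with a negative duration. It assumes each annotation's `data` field is a list of observation objects with a `duration` key, which is what the `instance[25]['duration']` path in the traceback suggests. The function names are hypothetical and not part of jams or ChoCo.

```python
import json


def find_negative_durations(jams_dict):
    """Return (annotation_index, observation_index, duration) for each negative duration.

    Assumes annotation["data"] is a list of observation dicts; anything else is skipped.
    """
    problems = []
    for a_idx, ann in enumerate(jams_dict.get("annotations", [])):
        data = ann.get("data", [])
        if not isinstance(data, list):
            continue  # unexpected layout, not handled by this sketch
        for o_idx, obs in enumerate(data):
            if not isinstance(obs, dict):
                continue
            duration = obs.get("duration")
            if isinstance(duration, (int, float)) and duration < 0:
                problems.append((a_idx, o_idx, duration))
    return problems


def scan_file(path):
    """Load a .jams file as raw JSON (no schema validation) and scan it."""
    with open(path, encoding="utf-8") as fh:
        return find_negative_durations(json.load(fh))
```

Calling `scan_file` on each file in the dataset directory would give a list of the observations that break the `minimum: 0.0` constraint.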
rubencart commented 6 months ago

Loading another sample raises a different error.

jams.load('data/.../jazz-corpus_10.jams') gives:

Traceback (most recent call last):
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 774, in validate
    schema.VALIDATOR.validate(data_ser, ann_schema)
  File "/cw/liir_code/NoCsBack/rubenc/miniconda3/envs/harmonyenv/lib/python3.12/site-packages/jsonschema/validators.py", line 353, in validate
    raise error
jsonschema.exceptions.ValidationError: 'C:majoror' does not match '^N|([A-G][b#]?)(:(major|minor|ionian|dorian|phrygian|lydian|mixolydian|aeolian|locrian))?$'
Failed validating 'pattern' in schema['items']['properties']['value']:
    {'pattern': '^N|([A-G][b#]?)(:(major|minor|ionian|dorian|phrygian|lydian|mixolydian|aeolian|locrian))?$',
     'type': 'string'}
On instance[0]['value']:
    'C:majoror'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  ...
    audio_jams = jams.load(str(path))
                 ^^^^^^^^^^^^^^^^^^^^
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 216, in load
    jam.validate(strict=strict)
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 1813, in validate
    valid &= ann.validate(strict=strict)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/cw/liir_code/NoCsBack/rubenc/jams/jams/core.py", line 778, in validate
    raise SchemaError(str(invalid))
jams.exceptions.SchemaError: 'C:majoror' does not match '^N|([A-G][b#]?)(:(major|minor|ionian|dorian|phrygian|lydian|mixolydian|aeolian|locrian))?$'
Failed validating 'pattern' in schema['items']['properties']['value']:
    {'pattern': '^N|([A-G][b#]?)(:(major|minor|ionian|dorian|phrygian|lydian|mixolydian|aeolian|locrian))?$',
     'type': 'string'}
On instance[0]['value']:
    'C:majoror'
rubencart commented 6 months ago

The same error occurs for jazz-corpus_100.jams, on 'C#:minoror'.
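
The pattern failure can be reproduced outside jams with a short sketch. The pattern string is copied from the error message above; the check uses `re.search`, which to my understanding is how jsonschema evaluates `pattern`. The repair helper is a hypothetical workaround based only on the values reported here ('majoror', 'minoror'), not the fix the maintainers applied.

```python
import re

# Pattern copied verbatim from the key_mode validation error above.
KEY_MODE_PATTERN = (
    r"^N|([A-G][b#]?)(:(major|minor|ionian|dorian|phrygian|lydian"
    r"|mixolydian|aeolian|locrian))?$"
)


def is_valid_key_mode(value):
    """True if the value satisfies the key_mode pattern (search semantics)."""
    return re.search(KEY_MODE_PATTERN, value) is not None


def strip_duplicated_suffix(value):
    """Hypothetical repair: turn '...majoror' / '...minoror' into '...major' / '...minor'."""
    if value.endswith(("majoror", "minoror")):
        return value[:-2]
    return value


for raw in ["C:majoror", "C#:minoror", "F:majoror"]:
    fixed = strip_duplicated_suffix(raw)
    print(raw, is_valid_key_mode(raw), "->", fixed, is_valid_key_mode(fixed))
```

The repaired strings pass the pattern, which suggests the stray "or" suffix is the only thing wrong with these particular values.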

rubencart commented 6 months ago

More errors:

when-in-rome_105.jams: SchemaError("'A major:I' is not of type 'object'\n\nFailed validating 'type' in schema['items']['properties']['value']:\n    {'properties': {'chord': {'pattern': '^([b#]?(i|I|ii|II|iii|III|iv|IV|v|V|vi|VI|vii|VII))[osdhx+]?[0-9]?[0-9]?(/([b#]?(i|I|ii|II|iii|III|iv|IV|v|V|vi|VI|vii|VII)))?$',\n                              'type': 'string'},\n                    'tonic': {'pattern': '^[A-G][b#]?$', 'type': 'string'}},\n     'required': ['tonic', 'chord'],\n     'type': 'object'}\n\nOn instance[0]['value']:\n    'A major:I'")
weimar_107.jams: SchemaError("'F:majoror' does not match '^N|([A-G][b#]?)(:(major|minor|ionian|dorian|phrygian|lydian|mixolydian|aeolian|locrian))?$'\n\nFailed validating 'pattern' in schema['items']['properties']['value']:\n    {'pattern': '^N|([A-G][b#]?)(:(major|minor|ionian|dorian|phrygian|lydian|mixolydian|aeolian|locrian))?$',\n     'type': 'string'}\n\nOn instance[0]['value']:\n    'F:majoror'")

There are many more. Please tell me if I'm doing something wrong, but it seems odd that so many of the files in your dataset fail to load.

Note that I am running

        jams.schema.add_namespace(str(ns_dir / "chord_ireal.json"))
        jams.schema.add_namespace(str(ns_dir / "chord_jparser_harte.json"))
        jams.schema.add_namespace(str(ns_dir / "chord_jparser_functional.json"))
        jams.schema.add_namespace(str(ns_dir / "chord_m21_leadsheet.json"))
        jams.schema.add_namespace(str(ns_dir / "chord_m21_abc.json"))
        jams.schema.add_namespace(str(ns_dir / "chord_weimar.json"))
        jams.schema.add_namespace(str(ns_dir / "timesig.json"))

before loading any of the jams files.

rubencart commented 6 months ago

rock-corpus_105.jams: SchemaError("None is not of type 'number'\n\nFailed validating 'type' in schema['properties']['file_metadata']['properties']['duration']:\n    {'minimum': 0.0, 'type': 'number'}\n\nOn instance['file_metadata']['duration']:\n    None")
real-book_1004.jams: SchemaError("'G:hdim' does not match '^((N)|(([A-G][b#]*)((:(maj|min|dim|aug|maj7|min7|7|dim7|hdim7|minmaj7|maj6|min6|9|maj9|min9|sus4)(\\\\((\\\\*?([b#]*([1-9]|1[0-3]?))(,\\\\*?([b#]*([1-9]|1[0-3]?)))*)\\\\))?)|(:\\\\((\\\\*?([b#]*([1-9]|1[0-3]?))(,\\\\*?([b#]*([1-9]|1[0-3]?)))*)\\\\)))?((/([b#]*([1-9]|1[0-3]?)))?)?))$'\n\nFailed validating 'pattern' in schema['items']['properties']['value']:\n    {'pattern': '^((N)|(([A-G][b#]*)((:(maj|min|dim|aug|maj7|min7|7|dim7|hdim7|minmaj7|maj6|min6|9|maj9|min9|sus4)(\\\\((\\\\*?([b#]*([1-9]|1[0-3]?))(,\\\\*?([b#]*([1-9]|1[0-3]?)))*)\\\\))?)|(:\\\\((\\\\*?([b#]*([1-9]|1[0-3]?))(,\\\\*?([b#]*([1-9]|1[0-3]?)))*)\\\\)))?((/([b#]*([1-9]|1[0-3]?)))?)?))$',\n     'type': 'string'}\n\nOn instance[20]['value']:\n    'G:hdim'")
real-book_1003.jams: SchemaError("['Wayne Shorter'] is not of type 'string'\n\nFailed validating 'type' in schema['properties']['file_metadata']['properties']['artist']:\n    {'type': 'string'}\n\nOn instance['file_metadata']['artist']:\n    ['Wayne Shorter']")
mozart-piano-sonatas_16.jams: SchemaError("'A:I' is not of type 'object'\n\nFailed validating 'type' in schema['items']['properties']['value']:\n    {'properties': {'chord': {'pattern': '^([b#]?(i|I|ii|II|iii|III|iv|IV|v|V|vi|VI|vii|VII))[osdhx+]?[0-9]?[0-9]?(/([b#]?(i|I|ii|II|iii|III|iv|IV|v|V|vi|VI|vii|VII)))?$',\n                              'type': 'string'},\n                    'tonic': {'pattern': '^[A-G][b#]?$', 'type': 'string'}},\n     'required': ['tonic', 'chord'],\n     'type': 'object'}\n\nOn instance[0]['value']:\n    'A:I'")
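
For the Roman numeral errors ('A:I', 'A major:I'), the schema expects an object with `tonic` and `chord` keys rather than a string. A possible conversion is sketched below. It assumes the part before the last colon names the key and that only its first token is the tonic, so a mode such as "major" is dropped. The helper name is hypothetical, and the chord part is not checked against the schema's chord pattern.

```python
import re

# Tonic pattern copied from the schema shown in the error message.
TONIC_RE = re.compile(r"^[A-G][b#]?$")


def to_roman_object(value):
    """Convert 'A:I' or 'A major:I' into {'tonic': 'A', 'chord': 'I'}.

    Raises ValueError when the string has no colon or no valid tonic.
    """
    key_part, sep, chord = value.rpartition(":")
    tokens = key_part.split()
    if not sep or not tokens:
        raise ValueError(f"cannot split key and chord in {value!r}")
    tonic = tokens[0]  # assumption: any mode word after the tonic is discarded
    if not TONIC_RE.match(tonic):
        raise ValueError(f"unexpected tonic {tonic!r} in {value!r}")
    return {"tonic": tonic, "chord": chord}


print(to_roman_object("A:I"))
print(to_roman_object("A major:I"))
```

Whether dropping the mode is acceptable depends on how the annotations are meant to be used, so treat this as a starting point only.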
andreamust commented 6 months ago

Dear @rubencart, thanks for reporting the issue. Some of the chord and key annotations you reported are clearly malformed, and we will fix them in a future release. In the meantime, you can avoid these errors when loading a JAMS file by passing strict=False and validate=False to jams.load. Please refer to the jams library documentation for more information.
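
A minimal sketch of that workaround, assuming the `validate` and `strict` keyword arguments of `jams.load` and the `Annotation.validate(strict=False)` behaviour visible in the traceback above (it returns a boolean instead of raising). The helper names are hypothetical, and the jams import is done lazily inside the function only so that the helpers can be defined where jams is not installed.

```python
def load_jams_lenient(path):
    """Load a JAMS file without schema validation."""
    import jams  # third-party; imported lazily on purpose

    return jams.load(str(path), validate=False, strict=False)


def invalid_namespaces(jam):
    """Return the namespaces of annotations that fail non-strict validation.

    With strict=False, validate() is expected to warn and return False
    rather than raise SchemaError.
    """
    return [ann.namespace for ann in jam.annotations if not ann.validate(strict=False)]


# Usage (the path is a placeholder):
# jam = load_jams_lenient("path/to/file.jams")
# print(invalid_namespaces(jam))
```

Loading this way skips the checks entirely, so malformed values such as 'C:majoror' stay in the data and still need handling downstream.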

rubencart commented 6 months ago

Thank you, that is helpful!

andreamust commented 2 months ago

The issue with the wrong key annotations has been fixed, and the files have been regenerated accordingly.

The fixes will be included in the next stable release (v1.1.0). Please refer to https://github.com/smashub/choco/issues/105 to keep track of the changes that will be included in the new release.