autotyp / autotyp-data

AUTOTYP data export
Creative Commons Attribution 4.0 International

Mismatch between data and metadata for Register OriginContinent #23

Closed xrotwang closed 2 years ago

xrotwang commented 2 years ago

The metadata for Register->OriginContinent says

    data: value-list
    values: 
      "Africa": ...
      "W and SW Eurasia": ...
    ...

but the JSON data has strings, with the following values:

    Eurasia          1133
    Africa            686
    North America     433
    South America     303
    Pacific           292
    Australia         189
    None                5

tzakharko commented 2 years ago

Yes, value-list just means that the values come from a controlled vocabulary. I understand that the description may have been slightly confusing; we will update it in a future release. Repeated data (real lists) is declared as list-of (for example, VerbIncorporation in VerbSynthesis.json).

xrotwang commented 2 years ago

Sorry for conflating two issues here. The primary issue still stands: some values in the data are not in the controlled vocabulary. For example, South America is not listed in https://github.com/autotyp/autotyp-data/blob/88133f01232702659bac179baafbb99d3adc1727/metadata/Register.yaml#L231-L267
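For reference, a check along these lines can be sketched in a few lines of Python. This is only a rough sketch, not the project's validation pipeline: the helper name `values_outside_vocabulary` is hypothetical, the records are assumed to be a list of dicts keyed by field name, and the sample vocabulary contains only the two entries quoted at the top of this issue (the rest is elided there).

```python
from collections import Counter


def values_outside_vocabulary(records, field, vocabulary):
    """Count the values of `field` that are not in the controlled vocabulary.

    Hypothetical helper for illustration; a missing field is treated as None.
    """
    allowed = set(vocabulary)
    return Counter(
        record.get(field)
        for record in records
        if record.get(field) not in allowed
    )


# Illustrative stand-ins, NOT the real AUTOTYP files.
# Only the two vocabulary entries quoted in this issue are used.
vocabulary = ["Africa", "W and SW Eurasia"]
records = [
    {"OriginContinent": "Africa"},
    {"OriginContinent": "South America"},
    {"OriginContinent": "South America"},
    {"OriginContinent": None},
]

unexpected = values_outside_vocabulary(records, "OriginContinent", vocabulary)
print(unexpected)
```

With the real files, `records` and `vocabulary` would presumably be loaded from the JSON data and the `values` keys of the YAML metadata; the exact layout of those files is an assumption here.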

tzakharko commented 2 years ago

Ah, sorry, I misunderstood. Yes, that looks like a bug in the data validation pipeline. There is still some work to be done on that front; the fix (along with proper data dependency annotations) should come by the end of spring.

tzakharko commented 2 years ago

Thank you for the report! I have created #30 to track this and related issues. Should you want to report more problems of this kind, please post them there.