grambank / pygrambank

Apache License 2.0

Add sanity check for some binarised features #114

Closed HedvigS closed 1 year ago

HedvigS commented 1 year ago

Add a quality check for Grambank sheets. For the following features, the two members of each pair cannot both be 0. For example, GB024a and GB024b cannot BOTH be 0. One or both can be missing, one or both can be ?, and both can be 1, but they CANNOT both be 0.

| ID | Name |
| --- | --- |
| GB024a | Is the order of the numeral and noun Num-N? |
| GB024b | Is the order of the numeral and noun N-Num? |
| GB025a | Is the order of the adnominal demonstrative and noun Dem-N? |
| GB025b | Is the order of the adnominal demonstrative and noun N-Dem? |
| GB065a | Is the pragmatically unmarked order of adnominal possessor noun and possessed noun PSR-PSD? |
| GB065b | Is the pragmatically unmarked order of adnominal possessor noun and possessed noun PSD-PSR? |
| GB130a | Is the pragmatically unmarked order of S and V in intransitive clauses S-V? |
| GB130b | Is the pragmatically unmarked order of S and V in intransitive clauses V-S? |

This is not true of the other binarised features (GB193a, GB193b, GB203a and GB203b). Both members of those pairs can be 0 (a language can have nothing adjective-like and nothing like a universal quantifier).
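To make the rule concrete, here is a minimal sketch of such a check. It is not pygrambank's actual implementation: the names `EXCLUSIVE_PAIRS` and `find_double_zeros` are hypothetical, and it assumes a sheet can be reduced to a plain dict mapping feature IDs to the coded values `'0'`, `'1'` or `'?'`, with uncoded features simply absent.

```python
# Hypothetical sketch; names and data layout are assumptions,
# not pygrambank's real API.

# Binarised feature pairs whose two members cannot both be '0'.
# GB193a/b and GB203a/b are deliberately left out: both can be '0'.
EXCLUSIVE_PAIRS = [
    ('GB024a', 'GB024b'),
    ('GB025a', 'GB025b'),
    ('GB065a', 'GB065b'),
    ('GB130a', 'GB130b'),
]


def find_double_zeros(values):
    """Return every exclusive pair in which both features are coded '0'.

    `values` maps feature IDs to coded values ('0', '1' or '?').
    A missing feature is absent from the mapping, so it never
    triggers the check.
    """
    return [
        (a, b)
        for a, b in EXCLUSIVE_PAIRS
        if values.get(a) == '0' and values.get(b) == '0'
    ]


# Example sheet: only the GB024 pair violates the rule.
sheet = {
    'GB024a': '0', 'GB024b': '0',   # both 0: flagged
    'GB025a': '1', 'GB025b': '1',   # both 1: allowed
    'GB065a': '0',                  # GB065b missing: allowed
    'GB130a': '?', 'GB130b': '0',   # one is ?: allowed
    'GB193a': '0', 'GB193b': '0',   # not an exclusive pair: allowed
}

for a, b in find_double_zeros(sheet):
    print(f'{a} and {b} cannot both be 0')
```

Keeping the pairs in one explicit list means the exempt features (GB193 and GB203) need no special-casing; they are just not listed.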

johenglisch commented 1 year ago

#113 already got ya covered. (^^)

https://github.com/grambank/pygrambank/blob/4a5c8ca3f355b4d0dac0aaea5f4683e4deb3feb5/src/pygrambank/sheet.py#L138-L142

The only thing was that the program still thought GB193 was a three-state feature. I fixed that.

HedvigS commented 1 year ago

Ah splendid! I went looking in sheet.py but I didn't search for the right things.

Many thanks!!