Closed SimonGreenhill closed 1 year ago
Okay, I see.
Do you want language_level_df()
to continue living here in rgrambank or move over to rcldf? The function is written so that if there's a ValueTable and a LanguageTable with the ID and Language_ID/Language_level_ID() it'll work, regardless if it's grambank or some other cldf-dataset with ValueTable.
AFAIKT, the function also relies on Language_ID
in ValueTable
being Glottocodes. This is an assumption that's not generally true for CLDF datasets, so I'd say the function cannot move to rcldf.
AFAIKT, the function also relies on
Language_ID
inValueTable
being Glottocodes. This is an assumption that's not generally true for CLDF datasets, so I'd say the function cannot move to rcldf.
Currently, it assumes that Language_ID in ValueTable matches Glottocode in LanguageTable. This would not be true for other cldf-datasets. If @SimonGreenhill would be happy to support moving language_level_df() to rcldf, I'll rejig that. i'll probably rejig that anyway, but it'd be good to know if it'd stay here or move.
It's not difficult to change this, I'll take a stab at it right now.
@HedvigS the other reason why this function shouldn't move to rcldf is that there is no notion of "Language_level_ID" in CLDF. rcldf should be concerned only with columns that are specified in https://cldf.clld.org/v1.0/terms.html (And while Language_level_ID can be considered a "languageReference" property, the semantics of this relation are not fixed in CLDF if the reference appears in a non-standard place. So a "languageReference" column in ValueTable relates the value to a language it applies to, but a "languagereference" in LanguageTable may be different things, the direct parent, or the top-level family, ...)
@HedvigS the other reason why this function shouldn't move to rcldf is that there is no notion of "Language_level_ID" in CLDF. rcldf should be concerned only with columns that are specified in https://cldf.clld.org/v1.0/terms.html (And while Language_level_ID can be considered a "languageReference" property, the semantics of this relation are not fixed in CLDF if the reference appears in a non-standard place. So a "languageReference" column in ValueTable relates the value to a language it applies to, but a "languagereference" in LanguageTable may be different things, the direct parent, or the top-level family, ...)
Yes, you're right. I was thinking a bit about this last night. I was thinking that the function could be modified so if the LanguageTable of a given CLDF-dataset doesn't contain a Language_level_ID type column, it could first match ID to Glottocode in that LanguageTable and then use another reference table to get Language_level_ID for each Glottocode, for example just glottolog-cldf. That's something I think would improve the function, whether it lives in rgrambank or rcldf. Regardless of where it lives, I know that I in future would appreciate such a function. If @SimonGreenhill decides that it continues to live in rgrambank, that's okay I guess I'll just use rgrambank functions on non-grambank cldf-data. I use functions from psychology and biology on linguistics data already, so even if it's a bit funny it'll be okay :)
done in #22
Remove the hardcoded table from glottolog. We should decouple this because glottolog evolves, and we don't want to be updating this every single time glottolog releases a new version. The package here has a function to easily load_glottolog() so we can make users use that e.g.