nataliacp closed this issue 1 year ago
It's expected by pycldf: https://github.com/cldf/pycldf/blob/afc2cb0887fdb9f31b1c5c32c2fa7ad7b28ddab4/src/pycldf/components/ValueTable-metadata.json#L40-L46
However, I realize that this constraint is not specified in the ontology. So at this point, I'd say it's an omission in the spec.
Conceptually, though, what would it mean for one datapoint to fall into multiple bins? If the idea is that different constructs of one language fall into multiple bins, I'd say multiple ValueTable rows would convey this more clearly.
Btw. "multiple Values for the same (language, parameter) pair" is the solution chosen by APiCS and it's supported by tools like cldfviz map
Thanks for the quick reply. These are variables that have a list of states, such as which parts of speech receive a particular plural suffix. We had adopted the separator solution, as per our previous discussion (https://github.com/cldf/cldf/issues/109#issuecomment-831903639). But I understand the logic of your other proposal too. I will review all such cases in our dataset and report back if anything turns out to be a conceptually different case.
Overall, I think list-valued foreign keys are often problematic, because they make it impossible to attach more data to the relation. E.g. in the APiCS case, values for a particular code also carry a Frequency property. In the list-valued codeReference case there would be no place for this - other than adding another list-valued property, and the implicit assumption that order in both lists is significant and relates Frequency and Code ...
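To illustrate the "one row per code" alternative, here is a minimal sketch using only the standard library. It assumes the default CLDF column names `ID`, `Language_ID`, `Parameter_ID` and `Code_ID`, a `;` separator, and made-up sample data; the helper `split_multivalued_rows` and the `-1`, `-2` ID suffixes are hypothetical, not part of pycldf. A `Value` column, if present, would need equivalent treatment.

```python
import csv
import io


def split_multivalued_rows(rows, column="Code_ID", separator=";"):
    """Yield one row per code for rows whose `column` holds several codes."""
    for row in rows:
        codes = [c.strip() for c in row[column].split(separator) if c.strip()]
        if len(codes) <= 1:
            # Nothing to split: pass the row through unchanged.
            yield dict(row)
            continue
        for i, code in enumerate(codes, start=1):
            new_row = dict(row)
            # Row IDs must stay unique, so derive a new ID per code (hypothetical scheme).
            new_row["ID"] = f"{row['ID']}-{i}"
            new_row[column] = code
            yield new_row


# Made-up sample standing in for a values.csv with a list-valued Code_ID column.
sample = """ID,Language_ID,Parameter_ID,Code_ID
v1,lang1,plural-suffix,plural-suffix-noun;plural-suffix-adj
v2,lang2,plural-suffix,plural-suffix-noun
"""

rows = list(split_multivalued_rows(csv.DictReader(io.StringIO(sample))))
for row in rows:
    print(row["ID"], row["Language_ID"], row["Code_ID"])
```

Once each code has its own row, extra per-relation data such as a Frequency column can simply be added to that row.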
I understand. I had a look at the APiCS maps and they look great for our purposes too! So, we are going to split the multi-values into different rows. And now I have a follow-up question: is the frequency for the displayed pies calculated automatically by clld (when one wants an equal split), or is it specified in values.csv? E.g. in https://apics-online.info/parameters/43#2/30.3/10.0 there are both equally split pies and pies that reflect real frequencies.
Pie slices are computed from frequency values. See https://clldutils.readthedocs.io/en/latest/svg.html#clldutils.svg.pie e.g.:
>>> from clldutils import svg
>>> print(svg.pie([20, 80], ['#aa0000', '#00aa00']))
<svg xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink" height="34" width="34">
<path d="M17.0,17.0 L1.0,17.0 A16.0,16.0 0 0,1 12.1 1.8 L17.0,17.0" style="fill:#AA0000;stroke:none;" transform="rotate(90 17.0 17.0)"></path><path d="M17.0,17.0 L12.1,1.8 A16.0,16.0 0 1,1 1.0 17.0 L17.0,17.0" style="fill:#00AA00;stroke:none;" transform="rotate(90 17.0 17.0)"></path>
</svg>
If you wanted to use the number of values per code as weight, you'd just use 1 as frequency for each value.
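As a rough illustration of that weighting, the sketch below sums frequencies per code for one (language, parameter) pair and turns them into slice shares. It is not how clld or clldutils compute pies internally; the `Frequency` key, the sample rows, and the helper `slice_shares` are assumptions made for the example.

```python
from collections import Counter


def slice_shares(frequencies):
    """Return each frequency as a fraction of the total."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    return [f / total for f in frequencies]


# Hypothetical values for one (language, parameter) pair, one row per code,
# each with frequency 1 so that every code gets an equal slice.
values = [
    {"Code_ID": "plural-suffix-noun", "Frequency": 1},
    {"Code_ID": "plural-suffix-adj", "Frequency": 1},
]

weights = Counter()
for value in values:
    weights[value["Code_ID"]] += value["Frequency"]

equal_shares = slice_shares(list(weights.values()))
print(equal_shares)

# Unequal frequencies, as in the 20/80 example above.
unequal_shares = slice_shares([20, 80])
print(unequal_shares)
```

The list of weights is what one would pass as the first argument to `svg.pie`, as in the example above.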
Thank you very much, Robert! This is very helpful. We will try to make it work like APiCS.
We have declared a separator (;) for the codeReference column in values.csv, but we are getting a warning when running cldf validate. The warning is: WARNING http://cldf.clld.org/v1.0/terms.rdf#ValueTable http://cldf.clld.org/v1.0/terms.rdf#codeReference must be singlevalued
Is this expected behavior? Thanks!
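For reference, a column declaration of the kind described here can be spotted in the metadata with a few lines of standard-library Python. This is only a sketch of such a check, not pycldf's validation code; the function name `list_valued_code_references` and the inline metadata are made up for illustration.

```python
CODE_REFERENCE = "http://cldf.clld.org/v1.0/terms.rdf#codeReference"


def list_valued_code_references(metadata):
    """Return (table url, column name) for codeReference columns that declare a separator."""
    found = []
    for table in metadata.get("tables", []):
        columns = table.get("tableSchema", {}).get("columns", [])
        for column in columns:
            if column.get("propertyUrl") == CODE_REFERENCE and "separator" in column:
                found.append((table.get("url"), column.get("name")))
    return found


# Made-up excerpt of a CLDF metadata file, as it would look after json.load().
metadata = {
    "tables": [
        {
            "url": "values.csv",
            "dc:conformsTo": "http://cldf.clld.org/v1.0/terms.rdf#ValueTable",
            "tableSchema": {
                "columns": [
                    {"name": "ID", "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#id"},
                    {"name": "Code_ID", "propertyUrl": CODE_REFERENCE, "separator": ";"},
                ]
            },
        }
    ]
}

flagged = list_valued_code_references(metadata)
print(flagged)
```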