cldf-datasets / gata

Creative Commons Attribution 4.0 International

ValueTable includes R-isms "NA" as a value. #18

Open SimonGreenhill opened 11 months ago

SimonGreenhill commented 11 months ago

The ValueTable (values.csv) has multiple values coded as "NA", which is due to R being used somewhere in the pipeline. It would be better to code these as "" or to declare "NA" as a null value in the StructureDataset definition (where ? and <empty string> are already defined as null).
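For illustration, tagging "NA" as null in the metadata might look roughly like the sketch below. It assumes the Value column is described by a CSVW column description whose `null` property already lists "?" and the empty string, as stated above. The dictionary contents and the helper `add_null_marker` are hypothetical and are not taken from the dataset's actual metadata file.

```python
import json

# Assumed (not copied from the repository) column description for the
# Value column of values.csv, with "?" and "" already declared as null.
value_column = {
    "name": "Value",
    "datatype": "string",
    "propertyUrl": "http://cldf.clld.org/v1.0/terms.rdf#value",
    "null": ["?", ""],
}


def add_null_marker(column, marker):
    """Return a copy of the column description with `marker` added to `null`."""
    updated = dict(column)
    nulls = updated.get("null", [])
    # CSVW allows `null` to be a single string or a list of strings.
    if isinstance(nulls, str):
        nulls = [nulls]
    nulls = list(nulls)
    if marker not in nulls:
        nulls.append(marker)
    updated["null"] = nulls
    return updated


patched = add_null_marker(value_column, "NA")
print(json.dumps(patched, indent=2))
```

With this change, a CSVW-aware reader would treat "NA" the same way as "?" and "", so it only fits if "NA" really is meant as a missing value.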

MuffinLinwist commented 11 months ago

Thanks, @SimonGreenhill, I'm fixing this right now.

FredericBlum commented 11 months ago

Hi @SimonGreenhill, thanks for checking the data. The "NA" entries are explicitly coded as such and are not the same as "". We use "NA" in cases where the question does not apply: e.g. when we ask for the order of N and Adj, the language may not have adjectives, so we code NA. We code "" if the topic is simply not discussed in the grammar. Do you recommend another way of coding this, so that it is not confused with missing values?
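The distinction described here can be kept intact when reading the file, as in the minimal sketch below. It uses Python's standard `csv` module, which returns every cell as a plain string and so does not collapse "NA" and "" into one missing value (pandas' `read_csv`, by default, would treat both as missing). The sample rows, the IDs, and the status labels are made up for illustration, and "?" is left aside.

```python
import csv
import io

# Hypothetical excerpt of values.csv; IDs and values are invented.
SAMPLE = """ID,Language_ID,Parameter_ID,Value
1,lang1,NAdjOrder,N-Adj
2,lang2,NAdjOrder,NA
3,lang3,NAdjOrder,
"""


def classify(raw):
    """Map a raw Value cell to one of three statuses."""
    if raw == "NA":
        return "not_applicable"   # the question does not apply to the language
    if raw == "":
        return "not_discussed"    # the grammar does not cover the topic
    return "coded"                # an actual coded value


def read_statuses(text):
    """Return a dict from row ID to status for CSV text with a Value column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["ID"]: classify(row["Value"]) for row in reader}


statuses = read_statuses(SAMPLE)
print(statuses)
```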

MuffinLinwist commented 8 months ago

We are still waiting on a response here so that we can fix this issue in the latest release, @FredericBlum and @SimonGreenhill.