Closed mikabr closed 8 months ago
It seems like researchers sometimes use forms in non-standard ways, including allowing `understands` for forms labelled as WS. Maybe we should respect the forms' "original intent"? Otherwise we would potentially need to modify `form_type` in the future if someone contributes a new dataset that includes `understands`. This would mean that there is some wiggle room in defining what "original intent" is, though, so perhaps it's worth discussing.
I'm not sure how to think about original intent, but from a data point of view, we need to distinguish between datasets that include comprehension and ones that don't. So datasets that are in theory WS but include comprehension need to be classified as "WG-like".
Decision from discussion: the issue only comes up for a few relatively small datasets, so we'll let it be for now and potentially fix it later if it becomes a bigger issue.
decision - we will not fix this right now.
The distinction between WG-type and WS-type instruments is whether they allow both `understands` and `produces` values for words or only `produces` values. There are currently 7 datasets that use instruments coded as WS-type but allow `understands` values for words (according to their corresponding `_values.csv` file), which violates this assumption.
Out of the 6 instruments that these 7 datasets use, 4 are used only by datasets that allow `understands` values, so these instruments should just be reclassified as WG-type.
For the remaining two instruments, `English (American) WS` and `Korean WS`, some datasets allow `understands` values and some don't. This is trickier, but it probably means that the datasets that do allow `understands` values should be split off into separate instruments from the datasets that don't, specifically `Armon-Lotem` for `English (American) WS` and `Yim` for `Korean WS`.
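
For illustration only, here is a rough Python sketch of the check and the two proposed fixes described above. It is not the project's actual code: the `Dataset` record, the `plan_fixes` helper, the action labels, and the example dataset and instrument names are all made up, and it assumes the allowed values for each dataset have already been read from its values file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    instrument: str
    form_type: str      # "WG" or "WS"
    values: frozenset   # values the dataset allows, e.g. {"produces"}


def plan_fixes(datasets):
    """Map each inconsistent WS-type instrument to a suggested action."""
    # Group only the datasets whose instrument is coded as WS-type.
    by_instrument = defaultdict(list)
    for d in datasets:
        if d.form_type == "WS":
            by_instrument[d.instrument].append(d)

    plan = {}
    for instrument, group in by_instrument.items():
        # Datasets that break the "WS allows only produces" assumption.
        offenders = sorted(d.name for d in group if "understands" in d.values)
        if not offenders:
            continue
        if len(offenders) == len(group):
            # Every dataset on this instrument allows understands.
            plan[instrument] = ("reclassify_as_WG", offenders)
        else:
            # Mixed usage: only the offending datasets need a new instrument.
            plan[instrument] = ("split_off", offenders)
    return plan


# Made-up example data, not the real datasets from this issue.
EXAMPLE = [
    Dataset("dataset_a", "Language X WS", "WS", frozenset({"produces"})),
    Dataset("dataset_b", "Language X WS", "WS", frozenset({"understands", "produces"})),
    Dataset("dataset_c", "Language Y WS", "WS", frozenset({"understands", "produces"})),
    Dataset("dataset_d", "Language Z WG", "WG", frozenset({"understands", "produces"})),
]

if __name__ == "__main__":
    for instrument, (action, names) in plan_fixes(EXAMPLE).items():
        print(instrument, action, names)
```

In this toy data, the instrument used only by an `understands` dataset falls into the reclassify case, while the instrument with mixed usage falls into the split-off case, mirroring the two situations in the issue.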