CarolinOdebrecht closed this issue 4 years ago
This is linked with https://github.com/distantreading/WG1/issues/21
Certainly the ODD and the documentation should be in step! We don't seem to have yet reached any clear consensus on what counts as "high" or "low", and the numbers are likely to be different in different contexts anyway. We do however agree that we need those two values at least.
I propose to
At present the schema also allows for "medium". Should we keep that, or change it to "unmarked" if used?
A binary decision is easier to handle. Introducing a category "unmarked" is also a good idea.
So, at present, we allow high, low, unmarked, and unspecified. But if the value is unmarked it is ipso facto unspecified. And if after doing their best an encoder can only say something is unspecified, the effect for the user is just the same as if it was unmarked. The two are effectively synonymous. Since we use "unspecified" elsewhere, I propose to remove "unmarked" from the list of possible values and make the headChecker script convert any "medium" or "unmarked" values into "unspecified".
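As a rough illustration of the conversion proposed above, here is a minimal Python sketch. It is not the actual headChecker script; the function name `normalize_reprint_count` and the two value sets are hypothetical and only encode what has been agreed in this thread (keep "high", "low" and "unspecified"; fold "medium" and "unmarked" into "unspecified").

```python
# Hypothetical sketch only: not the real headChecker script.
# Value sets are taken from the discussion in this thread.
KEPT_VALUES = {"high", "low", "unspecified"}
FOLDED_VALUES = {"medium", "unmarked"}  # to be converted to "unspecified"


def normalize_reprint_count(value: str) -> str:
    """Return the reprint-count value after applying the proposed conversion."""
    cleaned = value.strip().lower()
    if cleaned in FOLDED_VALUES:
        return "unspecified"
    if cleaned in KEPT_VALUES:
        return cleaned
    # Anything else is reported rather than silently rewritten.
    raise ValueError(f"unexpected reprint count value: {value!r}")
```

Raising an error on unknown values is a design choice of this sketch, so that typos surface instead of being turned into "unspecified" unnoticed.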
Ok.
Closing this, as we are in agreement!
The ODD and the sampling document give different rules for the reprint count:
The ODD says:
The sampling document says:
This is crucial for "medium" and "high". The sampling document version is more restrictive but clearer. It might be easier to just approximate reprint counts. We cannot assume that "occasionally" means the same for every language.