oasis-tcs / lexidma

OASIS Lexicographic Infrastructure Data Model and API (LEXIDMA) TC: A repository designed for use in development of TC chartered work products and test suites. https://github.com/oasis-tcs/lexidma

Should `collocateMarker` have a uniqueness constraint? #122

Closed by jmccrae 3 months ago

jmccrae commented 5 months ago

Currently, collocateMarker has a uniqueness constraint on the lemma field. This means the following is invalid according to the spec:

<entry id="frog">
  <headword>frog</headword>
  <example>
    <text><collocateMarker lemma="dog">dogs</collocateMarker> used to sit on <headwordMarker>frogs</headwordMarker> and now <collocateMarker lemma="dog">dogs</collocateMarker> sit on logs</text>
  </example>
</entry>

This is an error in my opinion.

Relates to the implementation of #97

michmech commented 5 months ago

True. The lemma shouldn't be part of a collocateMarker's uniqueness constraint.

I think it's logical for the collocateMarker's startIndex and endIndex to form its uniqueness constraint: two collocate markers on the same substring should be prohibited.

I have made this change in the prose.
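
To make the difference between the two constraints concrete, here is a minimal Python sketch. It is not taken from the spec or from any LEXIDMA tooling: the dict-based marker representation, the helper names `duplicate_lemmas`, `duplicate_spans` and `span_of`, and the end-exclusive reading of endIndex are all assumptions made for illustration.

```python
from collections import Counter


def duplicate_lemmas(markers):
    """Lemmas used by more than one marker (the old, lemma-based constraint)."""
    counts = Counter(m["lemma"] for m in markers)
    return sorted(lemma for lemma, n in counts.items() if n > 1)


def duplicate_spans(markers):
    """(startIndex, endIndex) pairs used by more than one marker (the proposed constraint)."""
    counts = Counter((m["startIndex"], m["endIndex"]) for m in markers)
    return sorted(span for span, n in counts.items() if n > 1)


def span_of(text, word, start=0):
    """Hypothetical helper: locate `word` in `text`; the end index is exclusive here (an assumption)."""
    i = text.index(word, start)
    return i, i + len(word)


# The example sentence from this issue, with markup removed.
text = "dogs used to sit on frogs and now dogs sit on logs"
first = span_of(text, "dogs")
second = span_of(text, "dogs", first[1])

markers = [
    {"lemma": "dog", "startIndex": first[0], "endIndex": first[1]},
    {"lemma": "dog", "startIndex": second[0], "endIndex": second[1]},
]

print(duplicate_lemmas(markers))  # ['dog']: rejected under the lemma-based constraint
print(duplicate_spans(markers))   # []: accepted under the span-based constraint

# A second marker on the same substring is what the span-based constraint prohibits.
clash = markers + [{"lemma": "hound", "startIndex": first[0], "endIndex": first[1]}]
print(duplicate_spans(clash))     # [(0, 4)]
```

With the span-based check, the frog entry above passes because the two "dogs" markers cover different substrings, and only a second marker on an already-marked substring is reported.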

jmccrae commented 4 months ago

If we make startIndex and endIndex the unique properties, we should also apply this to all other markers, right?

michmech commented 4 months ago

You're right!