globalwordnet / schemas

WordNet-LMF formats
https://globalwordnet.github.io/schemas/

Add xml:space attribute for WN-LMF format #70

Closed by goodmami 1 year ago

goodmami commented 1 year ago

If a wordnet author wishes to ensure whitespace is preserved in things like examples, definitions, etc. in the WN-LMF format, they should use the xml:space attribute with the value "preserve", but this attribute must be declared in the schema if it is to be used in a valid document. See https://github.com/goodmami/wn/issues/151#issuecomment-1135408720 for further discussion.
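
As a rough illustration of what the attribute would signal to a consumer, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The helper `element_text` and the sample fragment are hypothetical, not part of WN-LMF or of any existing tool, and the sketch assumes the application collapses whitespace by default. Note that a non-validating parser keeps the whitespace either way; `xml:space` only expresses the author's intent to the application.

```python
import xml.etree.ElementTree as ET

# The predefined "xml" prefix always maps to this namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def element_text(elem):
    """Hypothetical helper: collapse whitespace unless xml:space="preserve".

    Simplification: only the element's own attribute is checked, although
    per the XML specification the setting also applies to descendants.
    """
    text = elem.text or ""
    if elem.get(XML_SPACE) == "preserve":
        return text
    return " ".join(text.split())


# Illustrative fragment only; not a complete or valid WN-LMF document.
doc = ET.fromstring(
    "<Synset>"
    "<Definition>a  short\n  definition</Definition>"
    '<Example xml:space="preserve">roses are red,\n  violets are blue</Example>'
    "</Synset>"
)

collapsed = element_text(doc.find("Definition"))
preserved = element_text(doc.find("Example"))
```

Here `collapsed` has its whitespace runs reduced to single spaces, while `preserved` keeps the line break and indentation exactly as written.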

I'm not advocating for or against this attribute's inclusion as I don't know if there's a real need, but just raising the issue for discussion since it came up in https://github.com/goodmami/wn/issues/151.

jmccrae commented 1 year ago

I don't see why we can't add this; it seems a fairly easy modification that would not affect backwards compatibility. Should this be only on the <Example> and <Definition> tags or do we allow this at a document level?

Our assumption is that most examples/definitions do not preserve spacing, right?

goodmami commented 1 year ago

> Should this be only on the <Example> and <Definition> tags or do we allow this at a document level?

Also maybe <ILIDefinition>. I think any free-form text might be good, but I can't imagine it being useful on <Lemma>, <Form>, <Tag>, <Count>, etc.

> Our assumption is that most examples/definitions do not preserve spacing, right?

I think so. I was wondering if things like poetry that need some whitespace formatting might end up in an example or definition, but that seems unlikely.

jmccrae commented 1 year ago

Does this PR fix the issue in a way that works for you, @goodmami?

goodmami commented 1 year ago

@jmccrae thanks, partially, but I think I was being unclear:

> I think any free-form text might be good, I can't imagine it being useful on <Lemma>, <Form>, <Tag>, <Count>, etc.

I should have said "something to consider" instead of "good". Also, I was trying to say that we don't need it for the elements listed above. Furthermore, <Lemma> and <Form> have their values in the writtenForm attribute and not between the element tags, so xml:space="preserve" wouldn't even do the intended thing for them. For the PR, let's just stick with <Definition>, <ILIDefinition>, and <Example> for now.
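
To illustrate the point about writtenForm, here is a small sketch using Python's standard `xml.etree.ElementTree`. The fragments are illustrative rather than valid WN-LMF. It relies on XML attribute-value normalization, which replaces a literal line break inside an attribute value with a space regardless of any xml:space setting; only a character reference such as `&#10;` survives as a newline.

```python
import xml.etree.ElementTree as ET

# A literal newline inside an attribute value is normalized to a space by
# the parser, even though xml:space="preserve" is present on the element.
lemma = ET.fromstring(
    '<Lemma xml:space="preserve" writtenForm="ice\ncream" partOfSpeech="n"/>'
)
written_form = lemma.get("writtenForm")

# A character reference is not subject to that replacement, so the line
# break is kept here without any help from xml:space.
lemma_ref = ET.fromstring('<Lemma writtenForm="ice&#10;cream" partOfSpeech="n"/>')
written_form_ref = lemma_ref.get("writtenForm")
```

In the first case the parsed value contains a space where the line break was, which is why the attribute would have no useful effect on <Lemma> or <Form>.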

jmccrae commented 1 year ago

Okay, I have updated the PR. Thanks for the comment.