Closed 1313ou closed 4 years ago
Didn't you already report this as #180?
You're right. I had noted the extra garbage in the ids, but I hadn't then noted the implications: this goes against the very idea of a LexicalEntry (lexical entries are split not only across the different lexfiles but also within the same file), which drifts into a mere grouping of senses.
That said, I managed to automate the fix/merging through XSLT 3.0 in what I think is an elegant way. I need some time for extra checks.
Shall I generate a PR or keep it in the fork?
A PR would be great, but I didn't follow this part:
"lexical entries are split not only across the different lexfiles but also within the same file"
Can you elaborate on that? I take lexical entries as word forms, i.e. the word lemmas, since wordnets usually do not deal with inflections.
I don't think we are against the idea of a LexicalEntry; for the moment we just model things this way (which is derived from PWN). I agree that we should merge these entries and have an attribute on the <Sense> to indicate this information. I have made a pull request to the schemas project proposing this:
https://github.com/globalwordnet/schemas/pull/9
@1313ou if you would be able to make a PR for this issue, that would be super.
Also, as a matter of housekeeping, could we close this issue and work on this as #180?
Shouldn't the 2 lexical entries below be merged into one lexical entry with 2 senses? They have the same form (see the cursor 'n' that spans the two of them). The principle is that a lexical entry is not an arbitrary subgrouping of senses, and there should not be clones differing only by id within the same lexfile.
This comes from a questionable import of adjectives: it treated the position suffix as part of the lemma, so that 'aware-p-' is derived from 'aware(p)'.
As there are lots of similar cases (about 1000), this can't be corrected by hand.
Correct:
with possible adjposition="p" attribute
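To make the proposed merge concrete, here is a minimal Python sketch. It is not the XSLT 3.0 stylesheet mentioned in this thread. The sample ids, the synset names, and the way the position marker is embedded in the id (`-p-` followed by another hyphen) are assumptions made up for illustration, and the allowed values `a`, `p`, `ip` are assumed from the usual WordNet adjective markers. Only the `adjposition` attribute name comes from the discussion above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: ids and synset names are invented for illustration.
SAMPLE = """<Lexicon>
  <LexicalEntry id="ex-aware-a">
    <Lemma writtenForm="aware" partOfSpeech="a"/>
    <Sense id="ex-aware-a-1" synset="ex-s1"/>
  </LexicalEntry>
  <LexicalEntry id="ex-aware-p--a">
    <Lemma writtenForm="aware" partOfSpeech="a"/>
    <Sense id="ex-aware-a-2" synset="ex-s2"/>
  </LexicalEntry>
  <LexicalEntry id="ex-big-a">
    <Lemma writtenForm="big" partOfSpeech="a"/>
    <Sense id="ex-big-a-1" synset="ex-s3"/>
  </LexicalEntry>
</Lexicon>"""

# Assumption: the adjective position leaks into the id as "-p-" (or "-a-",
# "-ip-") immediately followed by another hyphen.
POSITION_IN_ID = re.compile(r"-(a|p|ip)-(?=-)")


def merge_entries(lexicon):
    """Merge LexicalEntry elements sharing lemma and part of speech, in place."""
    kept = {}
    for entry in list(lexicon.findall("LexicalEntry")):
        lemma = entry.find("Lemma")
        key = (lemma.get("writtenForm"), lemma.get("partOfSpeech"))
        marker = POSITION_IN_ID.search(entry.get("id"))
        if marker:
            # Record the position on each sense instead of in the entry id.
            for sense in entry.findall("Sense"):
                sense.set("adjposition", marker.group(1))
            entry.set("id", POSITION_IN_ID.sub("", entry.get("id"), count=1))
        if key in kept:
            # Duplicate entry: move its senses to the first one and drop it.
            for sense in entry.findall("Sense"):
                entry.remove(sense)
                kept[key].append(sense)
            lexicon.remove(entry)
        else:
            kept[key] = entry
    return lexicon


lexicon = merge_entries(ET.fromstring(SAMPLE))
```

With this sample, the two 'aware' entries collapse into one entry holding both senses, and only the sense that came from the marked entry carries `adjposition="p"`. A real fix would also have to update any references to the removed entry ids, which this sketch ignores.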