ufal / ParCzech

ParCzech is a project on compiling Czech parliamentary data into annotated corpora.
https://ufal.mff.cuni.cz/parczech

notes are not allowed in names - validity issue #109

Closed matyaskopp closed 3 years ago

matyaskopp commented 3 years ago

This is a correct note: https://www.psp.cz/eknih/2017ps/stenprot/079schuz/s079316.htm The word in the note was not actually said: [image]

But the text around it is a named entity (without the note): [image]

<name ana="ne:or"
      xml:id="ParlaMint-CZ_2021-01-22-ps2017-079-03-007-352.ne130"
      type="MISC">
   <w xml:id="ParlaMint-CZ_2021-01-22-ps2017-079-03-007-352.u11.p15.s1.w17"
      lemma="strategie"
      msd="UposTag=NOUN|Case=Gen|Gender=Fem|Number=Sing|Polarity=Pos">Strategie</w>
   <w xml:id="ParlaMint-CZ_2021-01-22-ps2017-079-03-007-352.u11.p15.s1.w18"
      lemma="vzdělávání"
      msd="UposTag=NOUN|Case=Gen|Gender=Neut|Number=Sing|Polarity=Pos">vzdělávání</w>
   <note type="comment">(vzdělávací)</note>
   <w xml:id="ParlaMint-CZ_2021-01-22-ps2017-079-03-007-352.u11.p15.s1.w19"
      lemma="politika"
      msd="UposTag=NOUN|Case=Gen|Gender=Fem|Number=Sing|Polarity=Pos">politiky</w>
   <name ana="ne:gc"
         xml:id="ParlaMint-CZ_2021-01-22-ps2017-079-03-007-352.ne131">
      <w xml:id="ParlaMint-CZ_2021-01-22-ps2017-079-03-007-352.u11.p15.s1.w20"
         lemma="český"
         msd="UposTag=ADJ|Case=Gen|Degree=Pos|Gender=Fem|Number=Sing|Polarity=Pos">České</w>
      <w xml:id="ParlaMint-CZ_2021-01-22-ps2017-079-03-007-352.u11.p15.s1.w21"
         lemma="republika"
         msd="UposTag=NOUN|Case=Gen|Gender=Fem|Number=Sing|Polarity=Pos">republiky</w>
   </name>
</name>
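
To find other places where a `<note>` ends up inside a `<name>`, something like the following might help. It is only a minimal sketch, not part of the ParCzech pipeline: the helper names `local_name` and `notes_inside_names` are hypothetical, the sample is a shortened version of the excerpt above with abbreviated `xml:id` values, and only notes that are direct children of a `<name>` are reported. Matching on local tag names is an assumption made so that the check works whether or not the elements carry a namespace.

```python
import xml.etree.ElementTree as ET

# Expanded form of the predefined xml:id attribute as ElementTree reports it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def notes_inside_names(root):
    """Yield (name_id, note_text) for each <note> that is a direct child of a <name>."""
    for elem in root.iter():
        if not isinstance(elem.tag, str) or local_name(elem.tag) != "name":
            continue
        for child in elem:
            if isinstance(child.tag, str) and local_name(child.tag) == "note":
                yield elem.get(XML_ID), (child.text or "").strip()


# Shortened, hypothetical sample modelled on the excerpt above.
SAMPLE = """<s xmlns="http://www.tei-c.org/ns/1.0">
  <name ana="ne:or" xml:id="ne130" type="MISC">
    <w>Strategie</w>
    <w>vzdělávání</w>
    <note type="comment">(vzdělávací)</note>
    <w>politiky</w>
    <name ana="ne:gc" xml:id="ne131"><w>České</w><w>republiky</w></name>
  </name>
</s>"""

hits = list(notes_inside_names(ET.fromstring(SAMPLE)))
print(hits)
```

On the sample, only the outer entity is reported, because the inner `<name>` contains no note. For a real file, `ET.parse(path).getroot()` could be passed in instead of the parsed sample string.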

matyaskopp commented 3 years ago

The ParlaMint schema has been changed: https://github.com/clarin-eric/ParlaMint/issues/56