clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

FR: Strange transcriber notes #807

Open matyaskopp opened 11 months ago

matyaskopp commented 11 months ago
grep -ro '<note.*>[^>]*_[^>]*</note>' .|sed 's/ xml:id=".*"//'|head -n 20
./2017/ParlaMint-FR_2017-07-27-E1024.xml:<note type="debate">TITRE_TEXTE_DISCUSSION</note>
./2017/ParlaMint-FR_2017-07-27-E1024.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-10-21-O1020.xml:<note type="debate">APPEL_PLF_1_20</note>
./2017/ParlaMint-FR_2017-10-21-O1020.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-09-26-E2004.xml:<note type="debate">TITRE_TEXTE_DISCUSSION</note>
./2017/ParlaMint-FR_2017-09-26-E2004.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-10-20-O1019.xml:<note type="debate">APPEL_PLF_1_20</note>
./2017/ParlaMint-FR_2017-10-20-O1019.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-10-26-O1029.xml:<note type="debate">TITRE_TEXTE_DISCUSSION</note>
./2017/ParlaMint-FR_2017-10-26-O1029.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-10-27-O1032.xml:<note type="debate">TITRE_TEXTE_DISCUSSION</note>
./2017/ParlaMint-FR_2017-10-27-O1032.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-11-09-O1047.xml:<note type="debate">APPEL_PLF_1_20</note>
./2017/ParlaMint-FR_2017-11-09-O1047.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-12-15-O1100.xml:<note type="debate">TITRE_TEXTE_DISCUSSION</note>
./2017/ParlaMint-FR_2017-12-15-O1100.xml:<note type="debate">MAV_PLF1_1_10</note>
./2017/ParlaMint-FR_2017-12-15-O1100.xml:<note type="debate">PLF_PARTIES_1_20</note>
./2017/ParlaMint-FR_2017-12-15-O1100.xml:<note type="debate">FIN_SEAN_1_2</note>
./2017/ParlaMint-FR_2017-12-04-O1082.xml:<note type="debate">TITRE_TEXTE_DISCUSSION</note>
./2017/ParlaMint-FR_2017-12-04-O1082.xml:<note type="debate">FIN_SEAN_1_2</note>
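
The listing above only shows the first 20 hits. To get an overview of which placeholder values occur and how often, something like the following sketch could be used. It assumes that the leaked values consist only of uppercase letters, digits and underscores (true for every sample above, but not verified against the whole corpus); the name `count_code_notes` is made up for illustration.

```python
import re
from collections import Counter

# Assumption: a leaked code is made of uppercase letters and digits joined by
# at least one underscore, e.g. FIN_SEAN_1_2 or TITRE_TEXTE_DISCUSSION.
CODE_NOTE = re.compile(r'<note\b[^>]*>([A-Z0-9]+(?:_[A-Z0-9]+)+)</note>')


def count_code_notes(lines):
    """Count how often each code-like note value occurs in an iterable of XML lines."""
    counts = Counter()
    for line in lines:
        counts.update(CODE_NOTE.findall(line))
    return counts


# Small in-memory sample; real use would iterate over the lines of the corpus files.
sample = [
    '<note type="debate">TITRE_TEXTE_DISCUSSION</note>',
    '<note type="debate" xml:id="n2">FIN_SEAN_1_2</note>',
    '<note type="debate">FIN_SEAN_1_2</note>',
    '<note type="debate">Applaudissements</note>',
]
for value, n in count_code_notes(sample).most_common():
    print(n, value)
```

Ordinary notes such as the last sample line are not counted, because their text does not match the code-like pattern.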

These notes are not present in the transcriptions; they are values of //point/@code_grammaire in the XML source, e.g. https://www.assemblee-nationale.fr/dyn/opendata/CRSANR5L15S2022O1N151.xml

<point nivpoint="2" valeur_ptsodj="1" ordinal_prise="1" 
       id_preparation="2052524" ordre_absolu_seance="8" 
       code_grammaire="PRESENTATION_1_0" 
       code_style="Sous-tit_p_cap" code_parole="" sommaire="1" id_syceron="2783770" valeur="">
<!-- ... -->
</point>
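
One possible way to confirm the diagnosis is to collect the `code_grammaire` values from a source file and check which `note` elements in the converted file carry exactly one of those values as their text. The following is only a sketch under assumptions: the function names, the `<root>` and `<div>` wrapper elements and the sample strings are hypothetical, and the tag names are matched by local name because the namespaces used in the real files are not shown here.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit('}', 1)[-1]


def grammar_codes(source_xml):
    """Collect all non-empty point/@code_grammaire values from a source XML string."""
    root = ET.fromstring(source_xml)
    return {
        el.get('code_grammaire')
        for el in root.iter()
        if local_name(el.tag) == 'point' and el.get('code_grammaire')
    }


def leaked_note_texts(corpus_xml, codes):
    """Return the text of every note whose content is exactly one of the given codes."""
    root = ET.fromstring(corpus_xml)
    return [
        el.text.strip()
        for el in root.iter()
        if local_name(el.tag) == 'note' and (el.text or '').strip() in codes
    ]


# Hypothetical miniature inputs, modelled on the snippets in this issue.
source = '<root><point nivpoint="2" code_grammaire="PRESENTATION_1_0" valeur=""/></root>'
corpus = ('<div><note type="debate">PRESENTATION_1_0</note>'
          '<note type="debate">Applaudissements</note></div>')

codes = grammar_codes(source)
print(leaked_note_texts(corpus, codes))
```

Matching against the codes actually found in the source is stricter than the underscore pattern in the grep above, so it should not flag genuine transcriber notes that merely contain an underscore.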