clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

ES-CT: invalid character #597

Closed: matyaskopp closed this issue 1 year ago

matyaskopp commented 1 year ago

@rjzevallos

https://github.com/clarin-eric/ParlaMint/actions/runs/4027956603/jobs/6924304418#step:4:135

ERROR: File ParlaMint-ES-CT_2018-01-17-0101.xml contains bad chars: U+A0 (1x)
ERROR: File ParlaMint-ES-CT_2018-01-17-0101.ana.xml contains bad chars: U+A0 (1x)

documentation: https://clarin-eric.github.io/ParlaMint/#sec-chars
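For anyone who wants to reproduce the check locally, here is a minimal Python sketch that counts such characters in a text and prints them in the same "U+A0 (1x)" style as the log above. It is not the ParlaMint validation script: the names `SUSPECT_CHARS`, `count_suspect_chars` and `format_counts` are hypothetical, and the set contains only U+00A0 (NO-BREAK SPACE) as an assumption; the actual list of disallowed characters is in the documentation linked above.

```python
from collections import Counter

# Assumption: only NO-BREAK SPACE is treated as "bad" here; the real list of
# disallowed characters is defined by the ParlaMint documentation, not by this sketch.
SUSPECT_CHARS = {"\u00a0"}


def count_suspect_chars(text, suspects=SUSPECT_CHARS):
    """Map each suspect character found in `text` to a 'U+XX' label and its count."""
    counts = Counter(ch for ch in text if ch in suspects)
    return {f"U+{ord(ch):X}": n for ch, n in counts.items()}


def format_counts(counts):
    """Render the counts as e.g. 'U+A0 (1x)', sorted by label."""
    return ", ".join(f"{label} ({n}x)" for label, n in sorted(counts.items()))


def check_file(path):
    """Print an ERROR line if the UTF-8 file at `path` contains suspect characters."""
    with open(path, encoding="utf-8") as fh:
        counts = count_suspect_chars(fh.read())
    if counts:
        print(f"ERROR: File {path} contains bad chars: {format_counts(counts)}")
    return counts
```

Calling `check_file("ParlaMint-ES-CT_2018-01-17-0101.xml")` on a local copy would list any NO-BREAK SPACE occurrences; extend `SUSPECT_CHARS` to cover other characters as needed.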

TomazErjavec commented 1 year ago

This obviously won't be fixed; in fact, ES-CT has many more bad characters than just the ones reported here. I'll close this. If we want to, we could open "future" issues for all corpora with bad characters, but we probably can't be bothered.