Closed by matyaskopp 9 months ago
@TomazErjavec, should I do it and insert the fix to my tantra-home?
Yes please, and let me know when done and what I should do.
I have used data from
/project/corpora/Parla/ParlaMint/ParlaMint/Corpora/Sources-TEI/
and placed the result here:
/home/kopp/ParlaMint-CZ-4.1/
The Czech folders in Sources-TEI can be overwritten.
Done! I will process it as soon as the queue empties.
the current value is
url="2013ps/audio/2016/10/27/2016102714281442.mp3"
https://github.com/clarin-eric/ParlaMint/blob/cb93f7eb5002b6bd608600a6c800accfdce9c72b/Samples/ParlaMint-CZ/ParlaMint-CZ_2016-10-27-ps2013-050-07-005-262.xml#L59
but it should be
url="audio/psp/2016/10/27/2016102714281442.mp3"
so that the data from this record can be used:
this script fixes it in ParCzech:
but I believe it is safe to use a regex on the XML in this case:
s/url="[0-9]*ps\/audio\//url="audio\/psp\//
@TomazErjavec, should I do it and insert the fix to my tantra-home? or will you process it yourself?
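For anyone who wants to try the substitution outside of sed or Perl, here is a minimal Python sketch of the same regex. The function name `fix_audio_urls` and the `<media>` wrapper in the sample are assumptions made for illustration; only the `url` values come from this thread. Note one difference: `re.sub` rewrites every match in the string, whereas the sed-style expression without a `g` flag rewrites only the first match per line.

```python
import re

# Mirrors the substitution quoted in the thread:
#   s/url="[0-9]*ps\/audio\//url="audio\/psp\//
URL_PREFIX = re.compile(r'url="[0-9]*ps/audio/')


def fix_audio_urls(xml_text: str) -> str:
    """Rewrite the old audio URL prefix to the new one (hypothetical helper)."""
    return URL_PREFIX.sub('url="audio/psp/', xml_text)


# The <media> element here is an assumed wrapper; the url value is the one
# reported as the current value in the thread.
sample = '<media url="2013ps/audio/2016/10/27/2016102714281442.mp3"/>'
fixed = fix_audio_urls(sample)
print(fixed)
```

Running the function on a value that already has the new prefix leaves it unchanged, so applying it twice to the same file should be harmless.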