xristy closed 7 years ago
Only 3 outlines seem to suffer from this problem. Here are the fixed versions: fixed-outlines.zip. Can you quickly check them before I (or you) upload them to exist?
I figured it was easier to fix them directly than to write tedious code handling all the various cases, such as spelling mistakes and cases where the description has a type, like
<outline:description type="location">lha sa number 1</outline:description>
Note that I did not fix puzzling things like
<outline:description>snar thang number</outline:description>
(with no number), that's garbage data, but well, at least the interesting data will get transferred correctly...
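For reference, here is a minimal sketch of the kind of check that separates a description's label from its number and flags label-only content such as `snar thang number`. The function name `split_description` and the regular expression are hypothetical; they assume that labels always end in the word "number", which may not hold for every outline.

```python
import re

# Assumption: a label is any text ending in the word "number",
# optionally followed by the actual value.
LABEL_RE = re.compile(r"^\s*(?P<label>.*?\bnumber)\s*(?P<value>\S.*)?$")


def split_description(text):
    """Return (label, value); either part is None when it is absent."""
    text = text or ""
    match = LABEL_RE.match(text)
    if match is None:
        # No recognizable label: treat the whole content as the value.
        return None, text.strip() or None
    value = match.group("value")
    return match.group("label"), value.strip() if value else None


for sample in ("lha sa number 1", "snar thang number"):
    label, value = split_description(sample)
    status = "ok" if value else "label only, no number"
    print(f"{sample!r}: label={label!r} value={value!r} ({status})")
```

A check like this only reports the problem; deciding what the missing number should be still needs a human.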
Excellent! I'll upload these three. Thanks so much.
Corrections uploaded
O5JW1143 also has malformed descriptions; it was a clone of O4JW333.
Here's the fixed version: O5JW1143.xml.zip
A few others I just spotted:
The fixed outlines, can you upload them?
I've uploaded these.
I noticed that in a number of cases there were sde dge elements but no numbers, which I assume will simply be filtered out via xml2ld.
I also noted some section names in O1PD181215 that have some sort of invalid character next to a Chinese character.
Thanks for the upload! As for what gets ignored:
<description type="sde dge number"></description>
is ignored, but
<description>sde dge number</description>
is not (hence part of the changes I made to these outlines). I didn't really look for invalid characters; do you have a node ID in which one appears?
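To find the remaining descriptions of the second form (label in the content, no `type` attribute, no number), something like the following could be run over an outline. This is only a sketch: `find_label_only_descriptions`, the `KNOWN_LABELS` set and the namespace URI in the sample are placeholders, not the real outline schema or the xml2ld logic.

```python
import xml.etree.ElementTree as ET

# Illustrative labels only; the real list would come from the outlines.
KNOWN_LABELS = {"sde dge number", "snar thang number", "lha sa number"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def find_label_only_descriptions(xml_text):
    """Return the content of untyped descriptions holding only a label."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iter():
        if local_name(element.tag) != "description":
            continue
        text = (element.text or "").strip()
        if "type" not in element.attrib and text in KNOWN_LABELS:
            hits.append(text)
    return hits


# The namespace URI below is a placeholder, not the real one.
SAMPLE = """<outline:outline xmlns:outline="http://example.org/outline#">
  <outline:description type="sde dge number"></outline:description>
  <outline:description>sde dge number</outline:description>
  <outline:description type="sde dge number">1</outline:description>
</outline:outline>"""

print(find_label_only_descriptions(SAMPLE))
```

Only the middle description in the sample is reported; the empty typed one and the properly typed one are left alone.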
O1PD1812154CZ135987, O1PD1812154CZ136047, O1PD1812154CZ136352, etc. These are the section names directly under the skabs gsum pa/ rgyu mtshan nyid theg pa'i skor/ (ka-go) section.
Indeed, it's also a U+FFFD, as in the other encoding problems...
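A quick way to list every node affected by U+FFFD, rather than spotting them by eye, might look like this. It is a sketch under assumptions: the ID attribute name `RID` and the element names in the sample are hypothetical and would need adjusting to the actual outline format.

```python
import xml.etree.ElementTree as ET

REPLACEMENT_CHAR = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def nodes_with_replacement_char(xml_text, id_attr="RID"):
    """Return IDs of nodes whose own text, attributes, or direct
    non-node children contain U+FFFD. `id_attr` is an assumed name."""
    root = ET.fromstring(xml_text)
    hits = []
    for node in root.iter():
        node_id = node.get(id_attr)
        if node_id is None:
            continue
        texts = [node.text or ""] + list(node.attrib.values())
        # Children carrying their own ID are reported separately.
        texts += [child.text or "" for child in node
                  if child.get(id_attr) is None]
        if any(REPLACEMENT_CHAR in text for text in texts):
            hits.append(node_id)
    return hits


SAMPLE = (
    '<outline RID="O1">'
    '<node RID="N1"><name>ok name</name></node>'
    '<node RID="N2"><name>bad \ufffd name</name></node>'
    '</outline>'
)

print(nodes_with_replacement_char(SAMPLE))
```

Since U+FFFD means the original character was already lost during an earlier conversion, the scan can only locate the damage; the correct text has to be restored from the source.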
In several outlines the description type is represented in the content of the element instead of in the @type, for example in O4JW333:
This also illustrates duplication of the sde dge number.