billy3321 closed this issue 8 years ago
You've got two unescaped ampersands inside a query string on line 116. & is a reserved character in XML, which means a literal ampersand needs to be written as &amp;. If you run across the same issue again in the future, try loading the offending file in a Python shell:
In [1]: import lxml.etree, urllib.request
In [2]: with urllib.request.urlopen('https://raw.githubusercontent.com/JRF-tw/nationwide_judicial_reform_meeting/master/2016-jrf/20160614.an') as file:
...: lxml.etree.fromstring(file.read())
...:
File "<string>", line unknown
XMLSyntaxError: EntityRef: expecting ';', line 116, column 158
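To illustrate the point about escaping, here is a minimal sketch that reproduces the failure and the fix on a tiny fragment. It uses the standard library's xml.etree.ElementTree rather than lxml so that it runs without extra packages; the example.org URL, the ref element and the parses helper are hypothetical and not taken from the actual file.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical URL with a query string containing a bare ampersand.
raw_url = "http://example.org/search?a=1&b=2"

# Pasting the URL in verbatim leaves "&b=2" looking like an unterminated
# entity reference, so the parser rejects the document.
broken = '<ref href="' + raw_url + '"/>'

# quoteattr() escapes the ampersand as &amp; and adds the surrounding quotes.
fixed = "<ref href=" + quoteattr(raw_url) + "/>"

print(parses(broken))
print(parses(fixed))
print(ET.fromstring(fixed).get("href"))
```

After parsing, the attribute value contains the plain ampersand again, so escaping changes only the serialized form, not the URL itself.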
@wfdd Thank you for the prompt reply! However, the error persists after changing the ampersand to a hexadecimal entity (&#x26;); see this link for the modified .an.xml
>>> import lxml.etree, urllib.request
>>>
>>> with urllib.request.urlopen('https://archive.tw/2016-06-14-%E5%85%A8%E6%B0%91%E5%8F%B8%E6%B3%95%E6%94%B9%E9%9D%A9%E9%81%8B%E5%8B%95-%E7%AC%AC%E4%BA%8C%E9%9A%8E%E6%AE%B5%E7%AC%AC%E4%B8%80%E6%AC%A1%E5%B7%A5%E4%BD%9C%E6%9C%83%E8%AD%B0%E6%9C%83%E8%AD%B0%E7%B4%80%E9%8C%84.an.xml') as file:
... lxml.etree.fromstring(file.read())
...
<Element akomaNtoso at 0x1038a21c8>
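As a quick cross-check that the hexadecimal form is an acceptable substitute, the sketch below parses a hypothetical one-element fragment written both ways and compares the results. It assumes nothing about the real file and uses the standard library parser instead of lxml.

```python
import xml.etree.ElementTree as ET

# The same hypothetical attribute, once with the named entity and once
# with the hexadecimal character reference for the ampersand.
named = ET.fromstring('<ref href="?a=1&amp;b=2"/>')
hexref = ET.fromstring('<ref href="?a=1&#x26;b=2"/>')

# Both forms decode to the same attribute value.
print(named.get("href") == hexref.get("href"))
```

Since both spellings are well-formed, a file that parses cleanly like this but still fails on import points to a problem outside the XML itself.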
The mySociety instance I tested is named another-test, using the URL above.
I tried importing the same .an.xml into a local docker instance running an older version. It initially said:
An exception of type IntegrityError occurred, arguments:
duplicate key value violates unique constraint "speeches_speaker_instance_id_425c08559ac70635_uniq"
DETAIL: Key (instance_id, slug)=(1, 高榮志) already exists.
After manually adjusting the ontology showAs
it worked locally, but not on the public instance.
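One thing worth ruling out when a slug collision like this appears is two speakers in the file sharing the same showAs value. The sketch below lists repeated showAs values among TLCPerson elements. It assumes the importer derives the speaker slug from showAs, which the error message suggests but this thread does not confirm; the sample document, its id and href values, and the duplicate_show_as helper are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_show_as(xml_text):
    """Return the showAs values used by more than one TLCPerson element."""
    root = ET.fromstring(xml_text)
    names = [
        el.get("showAs")
        for el in root.iter()
        # Compare local names only, so any Akoma Ntoso namespace is accepted.
        if isinstance(el.tag, str)
        and el.tag.rsplit("}", 1)[-1] == "TLCPerson"
        and el.get("showAs")
    ]
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Hypothetical sample; the real file has its own ids and hrefs.
sample = """<akomaNtoso><debate><meta><references>
<TLCPerson id="a" href="/ontology/person/a" showAs="高榮志"/>
<TLCPerson id="b" href="/ontology/person/b" showAs="高榮志"/>
<TLCPerson id="c" href="/ontology/person/c" showAs="Other"/>
</references></meta></debate></akomaNtoso>"""

print(duplicate_show_as(sample))
```

An empty result would suggest the conflicting speaker already exists in the target instance rather than in the file.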
Is that with a different file? In https://archive.tw/2016-06-14-全民司法改革運動-第二階段第一次工作會議會議紀錄.an.xml it's the href
that's changed.
That URL imports okay into my local default sayit-package instance, which is the version running on sayit.mysociety.org, so I'm looking into why it gives an error on import there. It should be fine...
Sorry, sayit.mysociety.org had somehow had an incompatible version of the elasticsearch library installed at some point recently, which was causing the error upon trying to index the first speaker being imported.
I've fixed this now and imported that file successfully into my own testing instance. It should be fine in your 'another-test' instance, but I'm a bit worried that the error happened at a point where your jrf instance might now give a different error upon import (similar to the IntegrityError you posted above).
Okay, I've deleted the rogue unfinished Speaker object that was imported before the error was raised, so hopefully all should be okay now. Please reopen if not. Thanks for spotting this and letting us know! :)
I tried to import my meeting record into my SayIt site, but an error seems to have occurred.
My site is here: http://jrf.sayit.mysociety.org/
and my Akoma Ntoso file is here: https://raw.githubusercontent.com/JRF-tw/nationwide_judicial_reform_meeting/master/2016-jrf/20160614.an
Is there an error in my Akoma Ntoso format, or is this a bug?