scriptotek / php-marc

Simple interface for working with MARC records using the File_MARC package
MIT License

Encoding error while reading Marc record #11

Closed numito closed 6 years ago

numito commented 6 years ago

Depending on the version of Alma, XmlImporter triggers an error while reading XML. Alma may return an XML document with the wrong declared encoding: the XML declaration says UTF-16, but the content is actually UTF-8. A workaround is to catch this error, change the declared encoding from UTF-16 to UTF-8 with preg_replace, and then parse the XML again.

    if ($error[0]->code === 81) {
        // 'Document labelled UTF-16 but has UTF-8 content'
        $data = preg_replace('/(<\?xml[^?]+?)utf-16/i', '$1utf-8', $data);
        $this->source = simplexml_load_string($data, 'SimpleXMLElement', 0, $ns, $isPrefix);
    }
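For readers who want to try the idea outside of XmlImporter, here is a minimal standalone sketch of the same workaround. The function name `loadXmlWithEncodingFix` is hypothetical and is not part of php-marc or php-alma-client. The error code 81 and the regular expression are taken from the snippet above. It assumes the parse fails outright when the mislabelled encoding is detected, and it scans all collected libxml errors instead of only the first one. Some libxml2 versions may only warn and still return a document, in which case the first parse result is returned unchanged.

```php
<?php

/**
 * Hypothetical helper (not part of php-marc): parse an XML string and,
 * if libxml reports error 81 ('Document labelled UTF-16 but has UTF-8
 * content'), relabel the XML declaration as UTF-8 and parse once more.
 *
 * @return SimpleXMLElement|false
 */
function loadXmlWithEncodingFix(string $data, string $ns = '', bool $isPrefix = false)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $xml = simplexml_load_string($data, 'SimpleXMLElement', 0, $ns, $isPrefix);
    $errors = libxml_get_errors();
    libxml_clear_errors();

    // Look for the error code quoted in the report (81).
    $mislabelled = false;
    foreach ($errors as $error) {
        if ($error->code === 81) {
            $mislabelled = true;
            break;
        }
    }

    if ($xml === false && $mislabelled) {
        // Rewrite only the encoding named in the XML declaration, then retry.
        $fixed = preg_replace('/(<\?xml[^?]+?)utf-16/i', '$1utf-8', $data, 1);
        $xml = simplexml_load_string($fixed, 'SimpleXMLElement', 0, $ns, $isPrefix);
        libxml_clear_errors();
    }

    // Restore the caller's libxml error handling mode.
    libxml_use_internal_errors($previous);

    return $xml;
}
```

Any other parse error still yields `false`, so unrelated problems are not silently ignored.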

See the attached XmlImporter file: XmlImporter.php.zip

danmichaelo commented 6 years ago

It's a bit risky to ignore errors like this, so I don't think such a workaround should be part of php-marc, but rather the Alma client library. I'm closing this and keeping https://github.com/scriptotek/php-alma-client/issues/9 :)