digitalroastery / weblounge

Web Content Management System

Creator id not parsed for XML entities #371

Open JamesUoM opened 8 years ago

JamesUoM commented 8 years ago

A creator with a name such as "foo & bar" that is subsequently used as a user id will break any code that tries to parse the XML. Attribute values need to be entity-escaped.

https://github.com/entwinemedia/weblounge/blob/develop/modules/weblounge-common/src/main/java/ch/entwine/weblounge/common/impl/content/CreationContext.java#L255

Why are you manually compiling XML anyway?
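As a rough illustration of the alternative to hand-built XML, here is a minimal sketch that uses the JDK's `XMLStreamWriter`, which escapes attribute values and text itself. The class name, the `userElement` method and the `<user id="...">` element layout are assumptions made for this example; they are not taken from `CreationContext.java`.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class CreatorXmlSketch {

  /**
   * Serializes a hypothetical <user id="...">name</user> element.
   * The element and attribute names are illustrative assumptions only.
   */
  public static String userElement(String id, String name) throws XMLStreamException {
    StringWriter out = new StringWriter();
    XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
    xml.writeStartElement("user");
    // The writer escapes '&', '<' and '"' in the attribute value.
    xml.writeAttribute("id", id);
    // Character data is escaped as well.
    xml.writeCharacters(name);
    xml.writeEndElement();
    xml.close();
    return out.toString();
  }

  public static void main(String[] args) throws XMLStreamException {
    // The raw ampersand in the input comes out as an entity reference.
    System.out.println(userElement("foo & bar", "foo & bar"));
  }
}
```

With this approach the caller never has to know that the value ends up inside an XML attribute.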

myniva commented 8 years ago

I can't exactly understand what is not working for you. Would you mind posting some more information (e.g. code snippet, exception, stack trace). Thank you, James!

JamesUoM commented 8 years ago

Hi,

It's an interaction between VL and the underlying WL code.

When a mediapackage is ingested that has a dc:creator, VL creates a user whose loginid is the value of dc:creator. The code I commented on assumes that the user loginid is valid as an XML attribute. OK, maybe VL could validate the value of dc:creator, but it shouldn't have to know that the value is going to be used as unparsed XML.

The generated XML then gets re-parsed(!?) at which point it fails if the dc:creator value contained one of "<>?"

Create an episode, put "&" without quotes in the presenter field, Video Lounge fails to harvest with the following error:

2015-07-21 16:10:00 WARN (AbstractResourceReaderImpl:480) - Fatal error while reading manchester-videolounge:/e5ee1d64-16b5-49c6-9ca9-059bf6d59cc3/: The entity name must immediately follow the '&' in the entity reference.
2015-07-21 16:10:00 WARN (MovieResourceSerializer:146) - Error parsing audio visual resource from metadata
org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
    at javax.xml.parsers.SAXParser.parse(Unknown Source)
    at ch.entwine.weblounge.common.impl.content.AbstractResourceReaderImpl.read(AbstractResourceReaderImpl.java:128)
    at ch.entwine.weblounge.contentrepository.impl.MovieResourceSerializer.toResource(MovieResourceSerializer.java:143)
    at ch.entwine.weblounge.search.impl.SearchIndexImpl.updateVersions(SearchIndexImpl.java:522)
    at ch.entwine.weblounge.search.impl.SearchIndexImpl.addToIndex(SearchIndexImpl.java:495)
    at ch.entwine.weblounge.search.impl.SearchIndexImpl.add(SearchIndexImpl.java:439)
    at ch.entwine.weblounge.contentrepository.impl.index.ContentRepositoryIndex.add(ContentRepositoryIndex.java:192)
    at ch.entwine.weblounge.contentrepository.impl.AbstractWritableContentRepository.put(AbstractWritableContentRepository.java:795)
    at ch.entwine.weblounge.contentrepository.impl.operation.PutOperationImpl.run(PutOperationImpl.java:118)
    at ch.entwine.weblounge.contentrepository.impl.operation.PutOperationImpl.run(PutOperationImpl.java:37)
    at ch.entwine.weblounge.contentrepository.impl.operation.AbstractContentRepositoryOperation.execute(AbstractContentRepositoryOperation.java:130)
    at ch.entwine.weblounge.contentrepository.impl.AbstractWritableContentRepository$OperationProcessor$1.run(AbstractWritableContentRepository.java:1354)
    at java.lang.Thread.run(Thread.java:745)
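The same class of failure can probably be reproduced outside Weblounge. The sketch below builds a small XML string by concatenation, once with the raw value and once with a minimally escaped value, and feeds both to the JDK SAX parser. The class name, the `escapeAttr` and `parses` helpers and the `<user id="..."/>` element are hypothetical and exist only for this illustration; they are not Weblounge code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class AmpersandParseSketch {

  /** Minimal escaping for a double-quoted XML attribute value (hypothetical helper). */
  public static String escapeAttr(String value) {
    // '&' must be replaced first so the entities added below are not escaped again.
    return value.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\"", "&quot;");
  }

  /** Returns true if the SAX parser accepts the XML, false if it reports a parse error. */
  public static boolean parses(String xml) throws Exception {
    try {
      SAXParserFactory.newInstance().newSAXParser().parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
          new DefaultHandler());
      return true;
    } catch (SAXException e) {
      // SAXParseException is a subclass of SAXException.
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    String creator = "foo & bar";
    // Raw concatenation: the bare '&' makes the document ill-formed.
    System.out.println(parses("<user id=\"" + creator + "\"/>"));
    // Escaped value: the document is well-formed.
    System.out.println(parses("<user id=\"" + escapeAttr(creator) + "\"/>"));
  }
}
```

Escaping at the point where the XML is assembled keeps callers such as VL from needing to know how the value is serialized.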

James S. Perrin
