bungeni-org / bungeni-exist

The eXist XML Db is used as a repository for XML archival documents by Bungeni.

Rich Text content fields need to be safely unescaped #1

Status: Closed (GoogleCodeExporter closed this issue 9 years ago)

GoogleCodeExporter commented 9 years ago
Some fields (e.g. body_text) contain escaped HTML.

For these fields to be transported safely into eXist, they need to be
unescaped and then "tidy"-ed. Unescaping alone does not guarantee that the
HTML is clean, because it is user-entered content.

To do:

Add a post-processing step to the transformer to unescape and clean up the
specified fields.
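
A minimal sketch of what such a post-processing step could look like, written in Python with the standard library only. This is an illustration, not the transformer's actual code: the names `unescape_and_tidy` and `_XhtmlRebuilder`, the wrapper element, and the list of void tags are all assumptions. It unescapes the field once, then rebuilds the markup so that every element is closed and all text is XML-escaped. A real HTML tidy library would handle far more cases.

```python
import html
import re
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never take a closing tag in HTML (assumed subset).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}
# Conservative check so only safe XML names are emitted.
NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_.-]*$")


class _XhtmlRebuilder(HTMLParser):
    """Hypothetical helper: re-serialise tolerant parser events as well-formed XML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if not NAME_RE.match(tag):
            return  # drop elements whose name is not a safe XML name
        # dict() removes duplicate attributes; valueless attributes get their own name.
        unique = {k: (v if v is not None else k) for k, v in attrs if NAME_RE.match(k)}
        rendered = "".join(f" {k}={quoteattr(v)}" for k, v in unique.items())
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}{rendered}/>")
        else:
            self.out.append(f"<{tag}{rendered}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close any elements left open inside this one, then the element itself.
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append(f"</{current}>")
            if current == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))  # re-escape &, < and > in text

    def close(self):
        super().close()
        while self.open_tags:  # close whatever the user left open
            self.out.append(f"</{self.open_tags.pop()}>")


def unescape_and_tidy(escaped, wrapper="div"):
    """Unescape an escaped-HTML field and return a well-formed XML fragment."""
    rebuilder = _XhtmlRebuilder()
    rebuilder.feed(html.unescape(escaped))
    rebuilder.close()
    return f"<{wrapper}>{''.join(rebuilder.out)}</{wrapper}>"
```

For example, `unescape_and_tidy("&lt;p&gt;Hello &lt;b&gt;world&lt;br&gt;&lt;/p&gt;")` yields a fragment in which the unclosed `b` and the bare `br` are both closed, so an XML parser accepts it. The single wrapper element is there so that a field with several top-level elements still becomes one XML node.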

Original issue reported on code.google.com by ashok.ha...@gmail.com on 28 Oct 2011 at 1:39

GoogleCodeExporter commented 9 years ago
This issue has been fixed.

Added a <processgroup id=x> element to the pipeline configuration. A
processgroup defines the actions that can take place before or after an XSLT
step.

Added a process action that unescapes the HTML and cleans it up to ensure
valid XML.
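
To illustrate the idea of actions grouped around an XSLT step, here is a rough Python sketch. It does not reproduce the actual pipeline configuration or its Java implementation: `ProcessGroup`, `Step`, and `run_step` are hypothetical names, and the transform is a plain function standing in for the XSLT.

```python
import html
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# An action takes the document as a string and returns the modified document.
Action = Callable[[str], str]


@dataclass
class ProcessGroup:
    """Hypothetical counterpart of a <processgroup>: an ordered list of actions."""
    group_id: str
    actions: List[Action] = field(default_factory=list)

    def run(self, document: str) -> str:
        for action in self.actions:
            document = action(document)
        return document


@dataclass
class Step:
    """One pipeline step; `transform` stands in for the XSLT transformation."""
    transform: Action
    before: Optional[ProcessGroup] = None
    after: Optional[ProcessGroup] = None


def run_step(step: Step, document: str) -> str:
    """Run the optional pre-actions, the transform, then the optional post-actions."""
    if step.before is not None:
        document = step.before.run(document)
    document = step.transform(document)
    if step.after is not None:
        document = step.after.run(document)
    return document


# Usage: unescape HTML after a stand-in transform that only trims whitespace.
step = Step(transform=str.strip, after=ProcessGroup("unescape", [html.unescape]))
cleaned = run_step(step, "  &lt;p&gt;hi&lt;/p&gt;  ")
```

Keeping the actions in named groups means the same unescape and clean-up group can be attached to any step that needs it, either before or after the transformation.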

Original comment by ashok.ha...@gmail.com on 1 Nov 2011 at 2:32