vivo-project / VIVO

VIVO is an extensible semantic web application for research discovery and showcasing scholarly work
http://vivoweb.org
BSD 3-Clause "New" or "Revised" License

Some freemarker templates don't have escaped values #3869

Closed (litvinovg closed 1 year ago)

litvinovg commented 1 year ago

Describe the bug

If a data property contains special characters, such as " or HTML tags, then the template sometimes breaks.

To Reproduce

Steps to reproduce the behavior:

  1. Create a person, edit the label, and use </div> in the label.
  2. View the individual's profile and see broken HTML.
  3. Add a primary email that contains the characters "</div>
  4. Open the page to edit this data property and see broken HTML.
  5. Open the page to delete this data property and see broken HTML.

Expected behavior

Data property text should be displayed as a string value and should not interfere with the surrounding HTML.
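
For illustration only, here is a minimal sketch of what "displayed as a string value" means in practice: the five HTML-significant characters are replaced with entities before the value is written into the page. The class and method names are hypothetical and are not taken from the VIVO codebase.

```java
// Hypothetical helper, not part of VIVO: escapes a value for safe use
// in HTML text and in quoted attribute values.
final class HtmlEscapeSketch {

    static String escapeHtml(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this, the email value from step 3 would be emitted as `&quot;&lt;/div&gt;` and could no longer close the surrounding element. Inside a FreeMarker template the same effect is available through the `?html` built-in or through auto-escaping output formats, so a Java helper like this is only one possible place to do it.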

brianjlowe commented 1 year ago

While this is indeed a problem, escaping all HTML in data property values has not traditionally been the expected behavior. The expectation has been that certain tags such as p, ul, ol, li and the heading levels can be used in properties like overview statement, teaching statement, etc. Labels are used for publication titles, where <i> is often used for scientific names and sub/sup are used for subscript and superscript.

In the GUI editor, AntiSamy is used to strip unwanted tags and scripts, and the client-side validation in TinyMCE is used as well. It sounds like something needs to be changed here for the label form. And of course none of this guards against ingested values.

Before making a sweeping change, it probably is worth discussing whether the current anti-XSS library or something similar can also be used at display time without an objectionable performance penalty. In addition, it would be good to have specific annotations for properties where HTML (or different HTML subsets) is or isn't allowed.
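
To make the per-property idea concrete, here is a rough sketch under stated assumptions. It escapes everything first and then restores only attribute-free tags that are on the allowlist for the given property. The property keys ("label", "overview", "primaryEmail"), the class name, and the method names are all hypothetical; this is not how VIVO or AntiSamy works, and it does not check that restored tags are balanced.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical display-time policy, not part of VIVO.
final class DisplayPolicySketch {

    // Illustrative keys only; real annotations would hang off property URIs.
    static final Map<String, Set<String>> ALLOWED_TAGS = Map.of(
        "label", Set.of("i", "sub", "sup"),
        "overview", Set.of("p", "ul", "ol", "li", "h1", "h2", "h3", "h4", "h5", "h6"),
        "primaryEmail", Set.<String>of());

    // Matches an escaped tag with no attributes, e.g. &lt;i&gt; or &lt;/sub&gt;.
    private static final Pattern ESCAPED_TAG =
        Pattern.compile("&lt;(/?)([A-Za-z][A-Za-z0-9]*)&gt;");

    static String escapeHtml(String s) {
        // '&' goes first so the entities added afterwards are not re-escaped.
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&#39;");
    }

    static String render(String property, String value) {
        // Unknown properties get an empty allowlist, i.e. full escaping.
        Set<String> allowed = ALLOWED_TAGS.getOrDefault(property, Set.of());
        Matcher m = ESCAPED_TAG.matcher(escapeHtml(value));
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            // Restore the tag only if this property allows it.
            String tag = allowed.contains(name)
                ? "<" + m.group(1) + name + ">"
                : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(tag));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Under this sketch, `render("label", "<i>E. coli</i> </div>")` keeps the italics but leaves the stray `</div>` escaped, and a tag carrying attributes such as `<i onclick="...">` stays escaped because only bare tags are restored. Whether a pass like this is cheap enough at display time is exactly the open question above.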