simile-widgets / exhibit

Publishing Framework for Large-Scale Data-Rich Interactive Web Pages
MIT License
175 stars 94 forks

Allow for URL escaped {{ }} in lenses #165

Open Pike opened 8 years ago

Pike commented 8 years ago

I'm using django to create URLs for exhibit lenses, and that does an iri_to_uri, which url-escapes the {{ }} that I passed in.
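A minimal sketch of the symptom, using JavaScript's built-in encodeURI as a stand-in for the server-side escaping (an assumption for illustration; it is not Django's iri_to_uri, but it also percent-encodes "{" and "}"). The template string and the `.label` expression are hypothetical:

```javascript
// Hypothetical lens URL template containing an Exhibit subcontent expression.
var template = "/items/{{.label}}";

// Stand-in for the server-side escaping step: "{" and "}" become %7B and %7D.
var escaped = encodeURI(template);

// The subcontent parser searches for a literal "{{", which the escaped
// value no longer contains, so no expression is ever substituted.
var foundInTemplate = template.indexOf("{{") >= 0;
var foundInEscaped = escaped.indexOf("{{") >= 0;
```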

I'm currently working around that by calling decodeURI() on the value in the subcontent parser:

diff --git a/elmo/static/simile/exhibit/scripts/ui/lens.js b/elmo/static/simile/exhibit/scripts/ui/lens.js
index 54589df..8f34e7f 100644
--- a/elmo/static/simile/exhibit/scripts/ui/lens.js
+++ b/elmo/static/simile/exhibit/scripts/ui/lens.js
@@ -495,6 +495,8 @@ Exhibit.Lens._parseSubcontentAttribute = function(value) {
     var fragments, current, open, close;
     fragments = [];
     current = 0;
+    /* XXX Hack: django encodes IRIs, decode this here */
+    value = decodeURI(value);
     while (current < value.length && (open = value.indexOf("{{", current)) >= 0) {
         close = value.indexOf("}}", open);
         if (close < 0) {

@karger, do you have an opinion on whether this is good for exhibit to do or not?
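One thing to weigh with the patch above: decodeURI() decodes every escape outside the reserved set (so %25 becomes "%", for example) and throws a URIError on a malformed sequence such as a lone "%". A narrower variant could decode only the brace escapes. This is just a sketch; `decodeEscapedBraces` is a hypothetical helper, not an existing Exhibit function:

```javascript
// Hypothetical helper: turn only %7B / %7D (any case) back into "{" / "}",
// leaving every other percent-escape in the value untouched.
function decodeEscapedBraces(value) {
    return value.replace(/%7B/gi, "{").replace(/%7D/gi, "}");
}

// Example input: braces are restored, the escaped percent sign is not.
var restored = decodeEscapedBraces("/items/%7B%7B.id%7D%7D?q=100%25");
```

This still rewrites a URL that intentionally contains an escaped brace, so it narrows the side effects without removing the ambiguity.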

karger commented 7 years ago

Sorry Pike; missed this notification. I'm afraid I don't have enough knowledge of "correct" html syntax to know whether this is the right thing or not. Is it technically bad to have {} in a url in an html document? What if we are trying to use exhibit to generate a url that contains {} after exhibit fills in subcontent? To do this right, do we need a way to tell exhibit to ignore certain {} in a url?

Pike commented 7 years ago

Here's my path to it: I use django to generate the URL, which ends up in iri_to_uri, https://github.com/django/django/blob/92053acbb9160862c3e743a99ed8ccff8d4f8fd6/django/utils/encoding.py#L169, which points to https://tools.ietf.org/html/rfc3987#section-3.1, which then has the paragraph

  Systems accepting IRIs MAY also deal with the printable characters in
  US-ASCII that are not allowed in URIs, namely "<", ">", '"', space,
  "{", "}", "|", "\", "^", and "`", in step 2 above.  If these
  characters are found but are not converted, then the conversion
  SHOULD fail.  Please note that the number sign ("#"), the percent
  sign ("%"), and the square bracket characters ("[", "]") are not part
  of the above list and MUST NOT be converted.  Protocols and formats
  that have used earlier definitions of IRIs including these characters
  MAY require percent-encoding of these characters as a preprocessing
  step to extract the actual IRI from a given field.  This
  preprocessing MAY also be used by applications allowing the user to
  enter an IRI.

I really feel that we're battling one edge case vs the other. Is that something one could make configurable?
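As a rough idea of what "configurable" could look like, here is a sketch of an opt-in switch. The `decodeEscapedBraces` setting and the `prepareSubcontentValue` function are both hypothetical names invented for this example; nothing in this thread suggests Exhibit has such an option today:

```javascript
// Hypothetical opt-in setting; off by default so existing pages keep their behavior.
var defaultLensSettings = { decodeEscapedBraces: false };

// Hypothetical pre-processing step that would run before the parser
// looks for "{{" and "}}" in a subcontent attribute value.
function prepareSubcontentValue(value, settings) {
    if (settings && settings.decodeEscapedBraces) {
        // Only brace escapes are decoded; other percent-escapes stay as-is.
        return value.replace(/%7B/gi, "{").replace(/%7D/gi, "}");
    }
    return value;
}

var untouched = prepareSubcontentValue("/x/%7B%7B.id%7D%7D", defaultLensSettings);
var decoded = prepareSubcontentValue("/x/%7B%7B.id%7D%7D", { decodeEscapedBraces: true });
```

Keeping the default off would mean pages that really want a literal %7B in the generated URL are unaffected.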

Otherwise I might be stuck with either patching django or patching exhibit :-/