nteract / vdom

🎄 Virtual DOM for Python
https://github.com/nteract/vdom/blob/master/docs/mimetype-spec.md
BSD 3-Clause "New" or "Revised" License
221 stars, 34 forks

Use markupsafe for escaping? #107

Open aarondewindt opened 3 years ago

aarondewindt commented 3 years ago

Because of issue #106 I forked this project and am changing it to use markupsafe for escaping. This not only allows marking text as safe, but also handles any object with an __html__ method, which some projects out there use instead of _repr_html_. markupsafe can also deal with the Python 2/3 string type shenanigans, so it'll help with issue #63, although I think IPython will still be needed for the event stuff.
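To illustrate the __html__ convention mentioned above, here is a minimal stdlib-only sketch. It is not markupsafe itself: SafeText, escape and Badge are hypothetical stand-ins written for this example, with html.escape doing the actual escaping.

```python
import html


class SafeText(str):
    """A string already marked as safe HTML (stand-in for a 'safe' marker type)."""

    def __html__(self):
        return self


def escape(obj):
    """Escape obj unless it declares its own HTML via an __html__ method."""
    if hasattr(obj, "__html__"):
        # The object vouches for its own markup, so it is used as-is.
        return SafeText(obj.__html__())
    return SafeText(html.escape(str(obj)))


class Badge:
    """Hypothetical object that knows how to render itself as HTML."""

    def __init__(self, label):
        self.label = label

    def __html__(self):
        return "<span class='badge'>" + html.escape(self.label) + "</span>"


plain = escape("<b>hi</b>")             # ordinary text gets escaped
marked = escape(SafeText("<b>hi</b>"))  # text marked safe passes through
badge = escape(Badge("a & b"))          # __html__ output is trusted
```

Because the result of escape() is itself marked safe, escaping it a second time leaves it unchanged, which is the property that makes mixing safe and unsafe text workable while everything stays a Python object.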

The issue is that I don't think I can make these changes completely backwards compatible. At this point I think I only have two options, neither of which is backwards compatible.

1. Escape text when rendering (in _repr_html_)

The issue here is that the DOM will contain both safe and unsafe text, which is fine as long as they stay Python objects. The problem arises when they are converted to JSON: the safe marking is lost, since both string types are written as-is into JSON. One way to solve this is to escape all text in to_dict() and then assume that all text handled in from_dict() is already escaped. However, the current implementation assumes the text passed to from_dict() is unescaped.
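The loss of the safe marking can be seen with nothing but the json module. This is only a sketch: SafeText is a hypothetical str subclass standing in for whatever marker type the fork would use.

```python
import json


class SafeText(str):
    """Hypothetical marker type for text that is already escaped."""


# One unsafe child and one child marked as safe.
children = ["a < b", SafeText("<em>ok</em>")]

# A JSON round trip keeps the characters but not the Python type.
restored = json.loads(json.dumps(children))

types_before = [type(c).__name__ for c in children]
types_after = [type(c).__name__ for c in restored]
```

After the round trip both children are plain str, so the consumer can no longer tell which one still needs escaping.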

2. Escape text and objects while initializing a VDOM instance

The advantage is that I can then always assume that all text in the DOM is already escaped. It also makes it possible to safely handle objects implementing __html__ (and possibly _repr_html_), since they will be evaluated at that moment instead of later, when the object could have changed. The issue is again with the dictionary and JSON serialization: the current implementation of from_dict() assumes the text is unescaped.
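A rough sketch of this second option, under stated assumptions: Node and Counter are hypothetical classes written for this example (not the vdom API), and html.escape stands in for markupsafe. It shows both the snapshot behaviour and the double escaping that results when a consumer assumes unescaped text.

```python
import html


class Counter:
    """Hypothetical mutable object exposing __html__."""

    def __init__(self):
        self.count = 0

    def __html__(self):
        return f"<b>{self.count}</b>"


class Node:
    """Hypothetical element that escapes its children once, at construction."""

    def __init__(self, tag, *children):
        self.tag = tag
        self.children = [self._to_html(c) for c in children]

    @staticmethod
    def _to_html(child):
        if hasattr(child, "__html__"):
            return child.__html__()  # evaluated now, not at render time
        return html.escape(str(child))

    def to_html(self):
        return f"<{self.tag}>{''.join(self.children)}</{self.tag}>"

    def to_dict(self):
        # Key names are illustrative; the text stored here is already escaped.
        return {"tagName": self.tag, "children": list(self.children)}


counter = Counter()
node = Node("p", "1 < 2 ", counter)
counter.count = 99  # a later mutation does not affect the node
rendered = node.to_html()

# A consumer that assumes unescaped text will escape a second time.
double_escaped = html.escape(node.to_dict()["children"][0])
```

Here rendered still shows the counter value captured at construction, while double_escaped contains "&amp;lt;", which is the incompatibility with a from_dict() that expects unescaped text.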

Plan

I would like to take the second approach, so that it can reliably handle any object implementing __html__ or _repr_html_.

Any opinions?

aarondewindt commented 3 years ago

Ok, this is more complicated than I thought. From what I can see, Jupyter renders VDOM objects from the dictionary rather than from the result of .to_html(). It does this using the @nteract/transform-vdom npm package, whose repository link is broken. My guess is, this project (vdom) is dead by now?

So in order to apply my changes I would have to modify @nteract/transform-vdom so it assumes strings are already escaped, or expand the vdom spec to include a marker for safe text.

For now I'll treat my fork as a separate project and remove 'application/vdom.v1+json' from _repr_mimebundle_, which I presume breaks all event-related features.
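As a rough illustration of what dropping that MIME type means, here is a sketch of an object whose _repr_mimebundle_ offers only HTML. HtmlOnlyNode is a hypothetical class for this example and is not taken from the fork.

```python
class HtmlOnlyNode:
    """Hypothetical element that advertises only an HTML representation."""

    def __init__(self, html_text):
        self._html = html_text

    def to_html(self):
        return self._html

    def _repr_mimebundle_(self, include=None, exclude=None):
        # Only text/html is offered; 'application/vdom.v1+json' is left out,
        # so a frontend has nothing to hand to a VDOM transform.
        return {"text/html": self.to_html()}


bundle = HtmlOnlyNode("<p>hello</p>")._repr_mimebundle_()
```

With only text/html in the bundle, the frontend falls back to static HTML output, which is consistent with the expectation that event handling would stop working.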

rgbkrk commented 3 years ago

@nteract/transform-vdom is now inside the outputs repository (not part of the nteract/nteract monorepo): https://github.com/nteract/outputs/tree/master/packages/transform-vdom

I've stepped away from these repos for a bit so you'll have to bear with me as I'm diving back in again.