feat(markdown): md support, graph enabled script.

shotaronowhere commented 2 years ago

An initial limited markdown support with a new 'format' json field. When set to 'markdown', the question is styled, otherwise the default is plain-text.

For comparison, here is a question use case that relies on ipfs hosted documents, and therefore includes long content ids in the question. The markdown support is much better for users, more compact, with the long content id embedded as a link.

The markdown is converted to html, then sanitized for any improper tags which could potentially be malicious. The short whitelist of acceptable styling is listed here.

Just creating this PR to see if this is an approach you think is viable, but I haven't modified the rpc.js script and I can also change 1 or 2 aspects of the styling of embedded links to better distinguish links from plain text.

edmundedgar commented 2 years ago

This looks really good, I like the approach and the result looks really clean.

We should put some stuff to support format in reality-eth-lib: It should enumerate formats it recognizes (the current one which I guess we call text, and the new markdown, and throw a parse error if you feed it something we haven't built like html or MARKDOWN. I think it should probably also set the default format: text in the json it returns when a format isn't specified, so that code from now on can just read what the format is and doesn't need to worry about supplying the default if it's missing,

One thing we'll need to check (I haven't looked into this at all yet, this isn't a criticism of this implementation) is that the markdown implementation we're using is as standard, mature canonical as it can be. We can never quite do this perfectly but as far as possible we want to avoid cases where stuff gets into disputes because version X.2 of the markdown library thinks it's valid and version X.3 of the library thinks it doesn't parse, or whatever.

Relatedly, it would be best if we could also do this parsing in the Graph processing, which unfortunately only has its own typescript-like language as it can't import JavaScript. I'm not sure if that's possible (we don't currently have a perfect version of the sprintf.js library we're using, so there's already a bit of divergence, but I'm hoping to fix that one day) but it would be good to do it if we can, and the limitations on that may affect the choice of markdown parser / supported tags in JavaScript.

edmundedgar commented 2 years ago

This is looking good. Is there a reason we need to support multiple types of markdown? If possible I think it's better to restrict extraneous choices as they all create edge cases risk causing escalations over trivia.

shotaronowhere commented 2 years ago

One thing we'll need to check (I haven't looked into this at all yet, this isn't a criticism of this implementation) is that the markdown implementation we're using is as standard, mature canonical as it can be. ]

AFAIK, markdown was specified informally and many implementations diverged until a standardization effort, creating commonmark.

This PR supports the following formats:

'text/markdown' : Commonmark
'text/markdown-gfm' : Github flavored markdown (a superset of commonmark)

supports tables

'text/plain' : plain-text (default when format is not specified, so previous question display will not change unless format is specified)

Relatedly, it would be best if we could also do this parsing in the Graph processing, which unfortunately only has its own typescript-like language as it can't import JavaScript.

Hmm, if this is a must? The way it's written, currently reality questions with html code are forbidden. First any html, style, etc is removed. If there was any, then the markdown is rendered into html, stored as question_json['title-markdown-html']. on the dapp front-end, optionally images can be replaced with hyperlinks. So that potentially nsfw images

are simply rendered as links

and to sufficiently distinguish hyperlinks from bold words, the style from links is set to the color black instead of grey.

Alternatively, a 'simple' approach could be taken to only support hyperlinks (all I need specifically), which could be parsed in the graph, but identifying html and removing it completely from any question would be pretty cumbersome to parse in the graph. This is simply to ensure that and user generated question doesn't contain any malicious html code or styling which could impact the interpretation of other questions or the integrity of the front-end completely, so pre-determining dangerous questions with html would be difficult in the graph.

But if it's a must, you could do something specific for hyperlinks like parse the question title for the location of the "[text](url)" strings, and remove or replace them with some delim and create an object with each hyperlink in an html format, then any extraneous unsafe html would be rendered in plaintext, and only the hyperlinks would be rendered as html tags.

What is the need to do this kind of render parsing in the graph, optimization for mobile users? Maximal consistency?

shotaronowhere commented 2 years ago

This is looking good. Is there a reason we need to support multiple types of markdown? If possible I think it's better to restrict extraneous choices as they all create edge cases risk causing escalations over trivia.

yeah I can remove that, just added that since it is supported by the markdown package, there are very few extra features in the github-flavored-markdown, the only useful one is tables, here you can find all the extensions to commonmark labeled

shotaronowhere commented 2 years ago

removed the markdown-gfm

edmundedgar commented 2 years ago

Parsing this stuff in Graph isn't essential, since we can't do it currently because we don't have the sprintf.js support. It would be nice if we could do it though; The downside of not being able to is that you can't get a perfect sorted list of questions, because the graph implementation doesn't know which ones should be throwing parsing errors, so I was hoping to get sprintf support done one day and this adds another hurdle. So I think the best we can do here is to make the markdown implementation as simple and standard as it can reasonably be, which I guess your implementation is already aiming to do.

I guess a workaround would be to be less extreme when we handle parsing fails and render stuff that caused parse errors as plain text, but it's probably better to make failures obvious.

edmundedgar commented 2 years ago

I'd rather not support images unless failing to support them makes what we're doing very non-standard; There are two problems, firstly they make consistent rendering across devices complicated, and secondly people can play availability games with the source image. They can already do this with hyperlinked data, but I think having the actual content be an image makes that worse.

edmundedgar commented 2 years ago

On the github style, I don't have a strong opinion about which flavour of markdown we should support (as long as it's a standard) but I think we should pick one. If the only difference is tables, I'd suggest leaving the github style out as tables also have potential rendering complexities.

shotaronowhere commented 2 years ago

So the image display is suppressed in the frontend, but the img display is included in the html in the reality-eth-lib, the img can also be suppressed in the reality-eth-lib itself.

I can update the arbitrator evidence display as well. I looked into this today.

edmundedgar commented 2 years ago

OK great, let's suppress the img in reality-eth-lib.

Once that's done does it look ready to merge to you?

shotaronowhere commented 2 years ago

Okay added the img suppression in reality-eth-lib, LGTM.

just a comment, about the quesiton creation wizard, if/when the markdown format is supported in the wizard, the current reality arbitrators display reality questions as text in the kleros court, so I need to redeploy the arbitrator contract ($20 at 10 gwei gas). I can deploy the contracts this week, but I'll need to check on the bots, since I think a bot is needed to maintain the realitio arbitrator.

edmundedgar commented 2 years ago

just a comment, about the quesiton creation wizard, if/when the markdown format is supported in the wizard, the current reality arbitrators display reality questions as text in the kleros court, so I need to redeploy the arbitrator contract ($20 at 10 gwei gas). I can deploy the contracts this week, but I'll need to check on the bots, since I think a bot is needed to maintain the realitio arbitrator.

I think I'll leave it for custom templates for now and not add it to the built-in ones, so the front-end won't let you create questions with markdown. I'll think about adding that later, let's see how people use it with custom templates.

RealityETH / reality-eth-monorepo

feat(markdown): md support, graph enabled script. #92