typelevel / Laika

Site and E-book Generator and Customizable Text Markup Transformer for sbt, Scala and Scala.js
https://typelevel.org/Laika/
Apache License 2.0

Verbatim HTML: does not copy apostrophe and quotes verbatim in <script> #93

Closed hmf closed 5 years ago

hmf commented 5 years ago

I have set the .withRawContent flag to parse Markdown with inlined HTML. I am also using the GitHub flavor. I then add the following to one of my Markdown files:

<p align="center">
  <div id="myDiv"><!-- Plotly chart will be drawn inside this DIV --></div>
  <script>

    <!-- JAVASCRIPT CODE GOES HERE -->
        var trace1 = {
          x: [1, 2, 3, 4],
          y: [10, 15, 13, 17],
          mode: 'markers',
          type: 'scatter'
        };

        var trace2 = {
          x: [2, 3, 4, 5],
          y: [16, 5, 11, 9],
          mode: 'lines',
          type: 'scatter'
        };

        var trace3 = {
          x: [1, 2, 3, 4],
          y: [12, 9, 15, 12],
          mode: 'lines+markers',
          type: 'scatter'
        };

        var data = [trace1, trace2, trace3];

        Plotly.newPlot('myDiv', data);

  </script>
  <p style="text-align:center; font-weight: bold;">
    Figure 1: Embedded scatter plot example via iFrame 
  </p>
</p>

But after processing, the apostrophe ' is replaced by an HTML entity, and I get this in the HTML file:

<p align="center">
            <div id="myDiv"><!-- Plotly chart will be drawn inside this DIV --></div>
            <script>

                   <!-- JAVASCRIPT CODE GOES HERE -->
                       var trace1 = {
                         x: [1, 2, 3, 4],
                         y: [10, 15, 13, 17],
                         mode: &#39;markers&#39;,
                         type: &#39;scatter&#39;
                       };

                       var trace2 = {
                         x: [2, 3, 4, 5],
                         y: [16, 5, 11, 9],
                         mode: &#39;lines&#39;,
                         type: &#39;scatter&#39;
                       };

                       var trace3 = {
                         x: [1, 2, 3, 4],
                         y: [12, 9, 15, 12],
                         mode: &#39;lines+markers&#39;,
                         type: &#39;scatter&#39;
                       };

                       var data = [trace1, trace2, trace3];

                       Plotly.newPlot(&#39;myDiv&#39;, data);

                 </script>
                 <p style="text-align:center; font-weight: bold;">
                   Figure 1: Embedded scatter plot example via iFrame 
                 </p>

            </p>

I also tried double quotes, and those were changed as well. Is this an error, or do I have to do some type of escaping?
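
To make the symptom concrete, here is a minimal sketch of the kind of text escaping an HTML renderer typically applies. It is not Laika's actual code: the object `EscapeDemo` and the function `escapeHtmlText` are hypothetical, and the `&quot;` mapping for double quotes is an assumption, since the report only shows the `&#39;` output.

```scala
object EscapeDemo {
  // Hypothetical helper (not part of Laika): escapes the characters
  // that are special inside ordinary HTML text nodes.
  def escapeHtmlText(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '&'  => sb.append("&amp;")
      case '<'  => sb.append("&lt;")
      case '>'  => sb.append("&gt;")
      case '"'  => sb.append("&quot;") // assumed entity for double quotes
      case '\'' => sb.append("&#39;")  // matches the output shown in the report
      case c    => sb.append(c)
    }
    sb.toString
  }
}
```

Applied to ordinary text this is harmless, but browsers do not decode entities inside a script element, so `mode: &#39;markers&#39;` reaches the JavaScript engine literally and fails to parse.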

jenshalm commented 5 years ago

Yes, script tags are not really supported in verbatim HTML. Interestingly, you are the first to notice that, even though verbatim HTML has been supported since 2013... :-)

Adding support will require enhancements to both the parser and the renderer, and it's not very high on my list of priorities, so it's not going to happen that soon, but I'll keep the ticket open and will tackle it at some point.

I guess one reason I never thought of it is that Markdown has this weird bastardised form of embedded HTML where text nodes can also contain Markdown (and that Markdown can contain HTML again, and so on).

And another is that JavaScript embedded in Markdown feels somewhat unusual anyway. Is there a specific need for you to embed JavaScript in Markdown? Couldn't you just keep it in separate files?

hmf commented 5 years ago

@jenshalm thanks for looking into this.

> I'll keep the ticket open and will tackle it at some point.

Ok.

> I guess one reason I never thought of it is that Markdown has this weird bastardised form of embedded HTML where text nodes can also contain Markdown (and that Markdown can contain HTML again, and so on).

I see. This explains a lot. Wonder how many people actually take advantage of that.

> Is there a specific need for you to embed JavaScript in Markdown?

I want to use MDoc and Laika to generate Jupyter-like notebooks and present the results of data analytics work based only on the JVM (thinking of calling it Satern 8-)). No need for Python, pip, etc., if I don't use Python libs for modeling. I also want to use this to document software with code snippet examples. This makes life much simpler for CI/CD: no need to install external software such as Ruby and Jekyll, and no need to add these commands to the CI/CD script.

However, I need to show plots. I have 2 options: static images or dynamic plots. The dynamic option is more in line with Jupyter and Matplotlib's interactive backend. After some experimentation I think plotly-scala is a better option because it basically provides a Scala API for the very mature Plotly.js library (not many mature and well-supported Scala plotting libraries exist).

> Couldn't you just keep it in separate files?

I may be able to do this, but in a roundabout way. To explain, here is the MDoc snippet that processes the data and generates the plot (as JavaScript):

'''scala mdoc:silent
    import plotly._, layout._, Plotly._

    val labels = Seq("Banana", "Banano", "Grapefruit")
    val valuesA = labels.map(_ => util.Random.nextGaussian())
    val valuesB = labels.map(_ => 0.5 + util.Random.nextGaussian())

    val traces = Seq(
      Bar(labels, valuesA, name = "A"),
      Bar(labels, valuesB, name = "B")
    )
    val playout = Layout()
    val jsSnippet = Plotly.jsSnippet("sample2", traces, playout)
'''

This is placed in the Markdown file that MDoc preprocesses before Laika runs. In this case silent means that the code stays in the Markdown file as a code snippet (code fence) example, and no evaluation output is added.

Next we add the following to the Markdown file:

'''scala mdoc:passthrough

val snippet1 = s"""
<div style="width:100%; height:100%;">
    <div id="myDiv4"
         style="margin:0 auto; height:500px; width:900px; border:1px solid black;">
    </div>
    <script>
      $jsSnippet
    </script>
    <p style="text-align:center; font-weight: bold;">
        Figure 4: Embedded bar plot example via a Plotly-Scala generated script 
    </p>
</div>
"""

println(snippet1)
'''

This prints the HTML content of snippet1 into the Markdown document. We have in effect generated a dynamic plot via embedded HTML using JavaScript.

So now I have to find a way to write the HTML snippet to a file and link to that file from the Markdown source. I also have to make sure that the file is written to the correct target directory so that Laika picks it up during the next phase. I will now look at MDoc's postmodifier to see if I can work around this issue.
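
The file-based workaround described above could look roughly like the following sketch. Everything named here is an assumption for illustration: the object `PlotFileWorkaround`, the function `writePlotPage`, its parameters, the iframe dimensions, and the Plotly CDN URL are not taken from MDoc, Laika, or plotly-scala. It writes the generated script into a standalone HTML page and returns an iframe tag that can be printed into the Markdown document instead of the script itself.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object PlotFileWorkaround {
  // Hypothetical helper: wraps the generated JavaScript in a standalone page,
  // writes it to targetDir, and returns the iframe markup that links to it.
  def writePlotPage(targetDir: Path, fileName: String, divId: String, jsSnippet: String): String = {
    val page = Seq(
      "<!DOCTYPE html>",
      "<html>",
      "<head>",
      "  <meta charset=\"utf-8\">",
      // Assumption: Plotly.js is loaded from the public CDN.
      "  <script src=\"https://cdn.plot.ly/plotly-latest.min.js\"></script>",
      "</head>",
      "<body>",
      s"""  <div id="$divId"></div>""",
      "  <script>",
      jsSnippet, // inserted verbatim, so quotes and apostrophes are untouched
      "  </script>",
      "</body>",
      "</html>"
    ).mkString("\n")

    Files.createDirectories(targetDir)
    Files.write(targetDir.resolve(fileName), page.getBytes(StandardCharsets.UTF_8))

    // No script element remains in the Markdown, only a tag with attributes.
    s"""<iframe src="$fileName" width="900" height="500" style="border:none;"></iframe>"""
  }
}
```

Inside an `mdoc:passthrough` block, one would then print the returned tag rather than the script. The target directory has to be one that Laika copies to its output, which depends on the build setup.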

In case you are curious here is the source for the above example (end of the file - for now).

Apologies for being so long-winded, but I think it explains the issue better.

jenshalm commented 5 years ago

Ah I see, the JS is generated, that explains it, thanks for describing your use case.

I think it makes sense to support script tags, even though you are the first on this planet who needs it. :-) If it just required enhancing the HTML renderer, it would be a trivial fix, maybe just one line, but sadly it also needs some tweaking of that messy parser for the weird HTML/Markdown mess, and that means it probably won't happen too soon.
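
For the renderer side only, the idea can be sketched as follows. This is not Laika's implementation: the node types `Text` and `Element`, the object `RawTextRendering`, and both functions are hypothetical, and the parser changes mentioned above are not covered. The sketch escapes ordinary text nodes but writes the content of script and style elements unchanged.

```scala
object RawTextRendering {
  // Hypothetical mini AST for illustration only; Laika's real node types differ.
  sealed trait Node
  final case class Text(content: String) extends Node
  final case class Element(name: String, children: List[Node]) extends Node

  // Elements whose text content must be written out without escaping.
  private val rawTextElements = Set("script", "style")

  def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '&'  => sb.append("&amp;")
      case '<'  => sb.append("&lt;")
      case '>'  => sb.append("&gt;")
      case '"'  => sb.append("&quot;")
      case '\'' => sb.append("&#39;")
      case c    => sb.append(c)
    }
    sb.toString
  }

  // `raw` is true only for direct children of a raw text element.
  def render(node: Node, raw: Boolean = false): String = node match {
    case Text(content) =>
      if (raw) content else escape(content)
    case Element(name, children) =>
      val childRaw = rawTextElements.contains(name.toLowerCase)
      children.map(render(_, childRaw)).mkString(s"<$name>", "", s"</$name>")
  }
}
```

With this special case, a script element keeps `'markers'` intact, while the same text inside a paragraph is still escaped.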

If I find some time in the coming weeks I might have a look whether I can find a way to show you how to install support as an extension yourself, but no promises as I'm not sure it'll work (due to subtleties around parser precedence etc.)

hmf commented 5 years ago

No problem. Time permitting I will work on the MDoc postmodifier. If this works I will let you know and we can close this.

jenshalm commented 5 years ago

As suspected, it's not possible to write a simple extension to support script tags. So I fixed it right in the core Markdown support, which turned out to be easier than I thought.

As long as you are working with the 0.11 release, you would still need to try the workaround you thought of, as this fix is only on master.