gettalong / kramdown

kramdown is a fast, pure Ruby Markdown superset converter, using a strict syntax definition and supporting several common extensions.
http://kramdown.gettalong.org

MD to HTML is adding spaces to code segments. #787

Closed MScalopez closed 1 year ago

MScalopez commented 1 year ago

The following issue was reported to us in a lab we deployed a few weeks ago: https://github.com/MicrosoftLearning/mslearn-fabric/issues/26. The user reported that every line in the rendered code segments starts with an extra leading space. When we looked at the Markdown source, we did not see those extra spaces (see the example below from https://raw.githubusercontent.com/MicrosoftLearning/mslearn-fabric/main/Instructions/Labs/02-analyze-spark.md).

1. With the notebook visible, expand the **Files** list and select the **orders** folder so that the CSV files are listed next to the notebook editor, like this:

    ![Screenshot of a notebook with a Files pane.](./Images/notebook-files.png)

2. In the **...** menu for **2019.csv**, select **Load data** > **Spark**. A new code cell containing the following code should be added to the notebook:

    ```python
    df = spark.read.format("csv").option("header","true").load("Files/orders/2019.csv")
    # df now is a Spark DataFrame containing CSV data from "Files/orders/2019.csv".
    display(df)
    ```

> **Tip**: You can hide the Lakehouse explorer panes on the left by using their **<<** icons. Doing so will help you focus on the notebook.

In this case the Python code section is indented, but it contains no 'extra' spaces. When I look at the HTML of the GitHub page (https://microsoftlearning.github.io/mslearn-fabric/Instructions/Labs/02-analyze-spark.html), this is what we get for those lines:

```HTML
<ol>
  <li>
    <p>With the notebook visible, expand the <strong>Files</strong> list and select the <strong>orders</strong> folder so that the CSV files are listed next to the notebook editor, like this:</p>

    <p><img src="/mslearn-fabric/Instructions/Labs/Images/notebook-files.png" alt="Screenshot of a notebook with a Files pane." /></p>
  </li>
  <li>
    <p>In the <strong>…</strong> menu for <strong>2019.csv</strong>, select <strong>Load data</strong> &gt; <strong>Spark</strong>. A new code cell containing the following code should be added to the notebook:</p>

    <pre><code class="language-python"> df = spark.read.format("csv").option("header","true").load("Files/orders/2019.csv")
 # df now is a Spark DataFrame containing CSV data from "Files/orders/2019.csv".
 display(df)
</code></pre>

    <blockquote>
      <p><strong>Tip</strong>: You can hide the Lakehouse explorer panes on the left by using their <strong>«</strong> icons. Doing so will help you focus on the notebook.</p>
    </blockquote>
  </li>
```

Look at the beginning of the HTML for the code section:

<pre><code class="language-python"> df = spark.read.format("csv")

Note how, when the Markdown was converted to HTML, a space was added between the `<code class="language-python">` tag and `df = spark`.

Notice below that a space is added at the beginning of each line in the code segment:

image

To make the issue easier to see, select the copy button, open Notepad, and paste the code there. It will look like this (without the 0123456789):

image

You should see that when kramdown converts Markdown to HTML and a code segment is indented, it adds an extra space at the beginning of every line of the code. Note that this did not happen in the past.
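
A rendered page can also be checked for this symptom with a short script instead of pasting into an editor. The sketch below is an illustration and not part of the original report: the function name `blocks_with_leading_space` is hypothetical, and it assumes code blocks are emitted as `<pre><code ...>`, as in the HTML above. It flags blocks in which every non-empty line starts with a space, so code that is legitimately indented throughout would be flagged too.

```python
import html
import re

# Matches <pre><code ...>...</code></pre> blocks, including multi-line bodies.
_CODE_BLOCK = re.compile(r"<pre><code[^>]*>(.*?)</code></pre>", re.DOTALL)


def blocks_with_leading_space(page):
    """Return the code blocks whose non-empty lines all start with a space."""
    flagged = []
    for match in _CODE_BLOCK.finditer(page):
        code = html.unescape(match.group(1))
        lines = [line for line in code.splitlines() if line.strip()]
        if lines and all(line.startswith(" ") for line in lines):
            flagged.append(code)
    return flagged


# Hypothetical sample mirroring the shape of the rendered output above.
sample = '<pre><code class="language-python"> x = 1\n display(x)\n</code></pre>'
print(len(blocks_with_leading_space(sample)))
```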

gettalong commented 1 year ago

I'm not quite sure what you are actually using to convert your Markdown document to HTML. It seems you are using kramdown with the GFM parser and HTML converter. If so, the output is expected since you don't use the correct amount of indentation:

1. This item uses *3* spaces for indentation
But the code block uses four spaces
of indentation
~~~
  1. Now with correct indentation

    But the code block uses four spaces
    of indentation

Output:

<ol>
  <li>
    <p>This item uses <em>3</em> spaces for indentation</p>

    <pre><code> But the code block uses four spaces
 of indentation
</code></pre>
  </li>
  <li>
    <p>Now with correct indentation</p>

    <pre><code>But the code block uses four spaces
of indentation
</code></pre>
  </li>
</ol>

Make sure you indent the code block by the right number of spaces.
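
The extra space in the first item can be explained with a simplified model. It assumes the converter removes the list item's content indentation (three spaces for `1. `) from each nested line and keeps whatever is left over. This is a sketch consistent with the output above, not kramdown's actual implementation, and `strip_item_indent` is a hypothetical helper.

```python
def strip_item_indent(lines, indent):
    """Remove up to `indent` leading spaces from each line.

    Simplified model of un-nesting list item content: any spaces
    beyond `indent` stay in the line.
    """
    result = []
    for line in lines:
        leading = len(line) - len(line.lstrip(" "))
        result.append(line[min(leading, indent):])
    return result


block = [
    "    But the code block uses four spaces",
    "    of indentation",
]

# "1. " puts the item content at an indentation of three spaces,
# so one of the four spaces survives on each line.
print(strip_item_indent(block, 3))

# Indenting the block by exactly three spaces leaves nothing behind.
print(strip_item_indent([line[1:] for line in block], 3))
```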

MScalopez commented 1 year ago

Any chance you can point me to the documentation that says to use 3 spaces for indentation, so I can send it to my team?

gettalong commented 1 year ago

Sure, see the syntax documentation for lists. Generally, you need to indent to the column of the first non-whitespace character in the line after the list item marker. Since you are using `1. something`, the 's' of 'something' is in the fourth column, so three spaces of indentation are needed.
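
The rule described above can be sketched as a small helper that returns how many spaces nested content needs under a given list item line. It is an illustration only: `content_indent` is a hypothetical name, the regular expression covers just simple ordered and bullet markers followed by spaces (no tabs), and it is not kramdown's parser.

```python
import re

# Optional leading spaces, a list marker ("1." or "*", "+", "-"),
# then the spaces that separate the marker from the item text.
_LIST_MARKER = re.compile(r"^ *(?:\d+\.|[*+-]) +")


def content_indent(first_line):
    """Return the number of spaces that align with the item's text.

    This is the zero-based column of the first non-space character
    after the list marker.
    """
    match = _LIST_MARKER.match(first_line)
    if match is None:
        raise ValueError("line does not start with a list marker")
    return match.end()


print(content_indent("1. something"))   # 3
print(content_indent("10. something"))  # 4
print(content_indent("* something"))    # 2
```

For `1. something` the result is 3, which matches the three spaces mentioned above. A two-digit marker such as `10.` shifts the text one column further, so nested content under it would need four spaces under this model.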

MScalopez commented 1 year ago

Interesting, thank you for your help.