TryGhost / migrate

MIT License
Error while migrating code block #806

Closed MasGaNo closed 1 year ago

MasGaNo commented 1 year ago

Hello team,

I'm facing an error while using the migrate tools to convert medium.com posts to Ghost posts:

migrate medium --pathToZip medium-export-test.zip

Screenshots (not reproduced here): the Medium.com code block, the actual Ghost.io code block, and the expected Ghost.io code block.

Medium.com exported HTML code:

<pre name="4a0a" id="4a0a" class="graf graf--pre graf-after--p">
  <code class="markup--code markup--pre-code">
    sudo apt-get update<br>sudo apt-get install \<br>    apt-transport-https \<br>    ca-certificates \<br>    curl \<br>    gnupg-agent \<br>    software-properties-common \<br>    certbot \<br>    python3-certbot-nginx
  </code>
</pre>
<pre name="ebeb" id="ebeb" class="graf graf--pre graf-after--pre">
  <code class="markup--code markup--pre-code">
    sudo apt-key adv --keyserver keys.gnupg.net --recv-keys 0xDE8B853FC155581D
  </code>
</pre>
<pre name="9a1f" id="9a1f" class="graf graf--pre graf-after--pre">
  <code class="markup--code markup--pre-code">
    echo &quot;deb https://download.passbolt.com/ce/debian buster stable&quot; |\ <br>sudo tee /etc/apt/sources.list.d/passbolt.list<br>sudo apt-get update<br>sudo apt-get install passbolt-ce-server
  </code>
</pre>

Ghost.io mobiledoc:

[
  "code",
  {
    "code": "sudo apt-get updatesudo apt-get install \\    apt-transport-https \\    ca-certificates \\    curl \\    gnupg-agent \\    software-properties-common \\    certbot \\    python3-certbot-nginx"
  }
],
[
  "code",
  {
    "code": "sudo apt-key adv --keyserver keys.gnupg.net --recv-keys 0xDE8B853FC155581D"
  }
],
[
  "code",
  {
    "code": "echo \"deb https://download.passbolt.com/ce/debian buster stable\" |\\ sudo tee /etc/apt/sources.list.d/passbolt.listsudo apt-get updatesudo apt-get install passbolt-ce-server"
  }
]

As you can see, all <br> tags were stripped during migration, so the separate lines of each code block were joined into one.

This probably happens somewhere in the convertPost function, which uses @tryghost/html-to-mobiledoc under the hood. That package in turn uses @tryghost/kg-parser-plugins to parse the HTML, and it removes all <br> tags because the allowBr option is not applied correctly.
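To illustrate the expected behaviour, here is a minimal sketch of a pre-processing step that turns <br> tags inside <pre> blocks into newline characters before the HTML reaches the converter. The helper name brToNewlineInPre is hypothetical and is not part of @tryghost/migrate or its dependencies; it is only one possible workaround, not the actual fix.

```javascript
// Hypothetical helper, not part of @tryghost/migrate:
// replace <br> tags with "\n", but only inside <pre>...</pre> blocks.
function brToNewlineInPre(html) {
    // Match each <pre ...>...</pre> block (non-greedy, spanning lines).
    return html.replace(/<pre\b[^>]*>[\s\S]*?<\/pre>/gi, (preBlock) => {
        // Within the block, turn <br>, <br/> and <br /> into a newline.
        return preBlock.replace(/<br\s*\/?>/gi, '\n');
    });
}

// Usage: the <br> inside <pre> becomes a newline, the one in <p> is untouched.
const sample = '<pre><code>sudo apt-get update<br>sudo apt-get install curl</code></pre><p>a<br>b</p>';
console.log(brToNewlineInPre(sample));
```

With line breaks preserved this way, the "code" value in the mobiledoc would be expected to contain "\n" between the commands instead of running them together.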

I'm using:

% migrate --version
0.31.1

Thank you.

MasGaNo commented 1 year ago

Another example:

Screenshots (not reproduced here): the code block on Medium.com and the resulting output on Ghost.io.

Medium.com HTML:

<p name="4722" id="4722" class="graf graf--p graf-after--h3">
  Well, now that’s out of the way, let’s get started! The first step is to add our <a href="https://github.com/..." data-href="https://github.com/..." class="markup--anchor markup--p-anchor" rel="noopener" target="_blank">repo for the Helm chart</a>:
</p>
<pre data-code-block-mode="2" spellcheck="false" data-code-block-lang="bash" name="2296" id="2296" class="graf graf--pre graf-after--p graf--preV2">
  <span class="pre--content">helm repo add passbolt-repo https://download.passbolt.com/charts/passbolt</span>
</pre>

Ghost.io mobiledoc:

[
  1,
  "p",
  [
    [
      0,
      [],
      0,
      "Well, now that’s out of the way, let’s get started! The first step is to add our "
    ],
    [
      0,
      [6],
      1,
      "repo for the Helm chart"
    ],
    [
      0,
      [],
      0,
      ":helm repo add passbolt-repo https://download.passbolt.com/charts/passbolt"
    ]
  ]
]

Here the whole <pre> tag was stripped and its content was merged into the preceding <p> tag.
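One possible workaround, sketched below, is to rewrite this newer Medium markup into the <pre><code> shape from the first example, which the converter did turn into a "code" card. The helper name wrapPreContentInCode is hypothetical, and the assumption that the span always carries exactly class="pre--content" is based only on the HTML shown above.

```javascript
// Hypothetical helper, not part of @tryghost/migrate:
// rewrite <pre><span class="pre--content">...</span></pre>
// into    <pre><code>...</code></pre>, keeping the <pre> attributes.
function wrapPreContentInCode(html) {
    const pattern = /(<pre\b[^>]*>)\s*<span class="pre--content">([\s\S]*?)<\/span>\s*(<\/pre>)/gi;
    return html.replace(pattern, (match, open, content, close) => {
        // Keep the original opening tag, swap the span wrapper for <code>.
        return `${open}<code>${content}</code>${close}`;
    });
}

// Usage with a shortened version of the exported HTML above.
const exported = '<pre data-code-block-lang="bash" class="graf graf--pre graf--preV2">\n' +
    '  <span class="pre--content">helm repo add passbolt-repo https://download.passbolt.com/charts/passbolt</span>\n' +
    '</pre>';
console.log(wrapPreContentInCode(exported));
```

Blocks that do not match the pattern, such as ordinary paragraphs, are returned unchanged.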

PaulAdamDavis commented 1 year ago

Hi @MasGaNo, thanks for reporting this and providing the source HTML! I'm working on a fix and will update this issue when that's done.

PaulAdamDavis commented 1 year ago

This should now be fixed and released in @tryghost/migrate@0.32.0. Thanks again for reporting!