alphagov / tech-docs-gem

Gem to distribute the tech docs project
https://tdt-documentation.london.cloudapps.digital/
MIT License
15 stars 38 forks source link

Switch from Redcarpet to Kramdown #208

Closed AlanGabbianelli closed 3 years ago

AlanGabbianelli commented 3 years ago

Switch from Redcarpet to Kramdown

We recently received a report from Digital Accessibility Centre (a specialist accessibility testing organisation based in Wales). This report uncovered some accessibility issues that we need to fix.

One of these issues is that some tables in the content do not have column row headers when needed - the first column on tables with 3 or more columns does not have the <th> element. This fails WCAG 2.1 success criterion 1.3.1 (Info and Relationships). Changing the markdown parser from Redcarpet to Kramdown fixes this issue.

In a previous tech-docs-template commit, we had made the opposite change and moved from Kramdown to Redcarpet because Kramdown didn't support fenced code blocks. This isn't an issue now because we are using the GitHub-flavoured markdown (enforced by the line input: "GFM"), which does support fenced code blocks.

Trello card: https://trello.com/c/vv0OR35g/630-march-update-tech-docs-template

Co-authored-by: Michael S. Walker michael.s.walker@digital.cabinet-office.gov.uk Co-authored-by: Alan Gabbianelli alan.gabbianelli@digital.cabinet-office.gov.uk

36degrees commented 3 years ago

Great to see the accessibility issues in the tech docs being addressed 🙌🏻

It's worth noting that we've come full circle with this – we moved from Kramdown to Redcarpet in https://github.com/alphagov/tech-docs-template/commit/3c0947c8cbc3bdfdda2181cc6ea63535f23978a9. It looks like we made that change because Kramdown didn't support fenced code blocks – do we know if that's changed?

Knowing how different the various markdown engines can be, this feels like it could be quite a big change and needs to be carefully considered. Could we look at building a few different tech doc 'instances' and comparing the HTML output to give us confidence that this isn't going to break anything?

Finally, the commit message is somewhat vague – because I can access the linked spreadsheet, I'm guessing that this might be to do with column or row headers in tables, but if someone's looking back to these commits or this PR in the future and doesn't have access to the spreadsheet, it's really not clear what problem this is trying to solve, or why switching markdown engine solves it. Even with the added context of the spreadsheet, I don't really know how to give this a proper review, as I don't know what the problem is.

barrucadu commented 3 years ago

It looks like we made that change because Kramdown didn't support fenced code blocks – do we know if that's changed?

Yes, the input: "GFM" line makes it use github-flavoured markdown, which does support fenced code blocks.

AlanGabbianelli commented 3 years ago

Thank you for your review @36degrees ⭐

As you suggested, before merging, we should be more confident that we are not fixing something just to then realize we're breaking other things, so we won't merge this yet.

I addressed your last point and the commit message should now be clearer on what we're trying to achieve and why and, hopefully, give enough context to the next developers that will work on this.

richardTowers commented 3 years ago

I built the example in this repo from both branches, and pushed some tags including the build outputs.

You can see the diff in output with:

git diff build-outputs-master build-outputs-kramdown -- example/*.html

Output ```diff diff --git a/example/build/a-proxied-page.html b/example/build/a-proxied-page.html index 738cf7c..e6f744d 100644 --- a/example/build/a-proxied-page.html +++ b/example/build/a-proxied-page.html @@ -312,7 +312,9 @@
-

Proxied page

This is a proxied page.

+

Proxied page

+ +

This is a proxied page.

diff --git a/example/build/api-path.html b/example/build/api-path.html index 4eafec5..0fbd139 100644 --- a/example/build/api-path.html +++ b/example/build/api-path.html @@ -313,6 +313,8 @@

API /Pets

+ +

/pets

get

diff --git a/example/build/child-of-expired-page.html b/example/build/child-of-expired-page.html index 43463b1..75b0ed9 100644 --- a/example/build/child-of-expired-page.html +++ b/example/build/child-of-expired-page.html @@ -312,7 +312,9 @@
-

This is a child of expired page

Expired page should highlight in the navigation.

+

This is a child of expired page

+ +

Expired page should highlight in the navigation.

diff --git a/example/build/code.html b/example/build/code.html index a79eb09..9432024 100644 --- a/example/build/code.html +++ b/example/build/code.html @@ -312,8 +312,12 @@
-

Code examples

A paragraph with a code element within it.

+

Code examples

+ +

A paragraph with a code element within it.

+

code element within a link

+

An example of a table with a code element within it.

@@ -342,7 +346,9 @@
+

An example of a code block with a long line length

+
RSpec.describe ContentItem do
   subject { described_class.new(base_path) }
   let(:base_path) { "/search/news-and-communications" }
@@ -357,7 +363,9 @@
     end
   end
 end
-

An example of a code block with a short length

+
+

An example of a code block with a short length

+
RSpec.describe ContentItem do
 end
 
diff --git a/example/build/core-layout-without-sidebar.html b/example/build/core-layout-without-sidebar.html index 807af0b..bb20bd2 100644 --- a/example/build/core-layout-without-sidebar.html +++ b/example/build/core-layout-without-sidebar.html @@ -92,7 +92,9 @@
-

Core layout, no sidebar

This page does not have a sidebar.

+

Core layout, no sidebar

+ +

This page does not have a sidebar.

diff --git a/example/build/core-layout.html b/example/build/core-layout.html index 851005b..0250dff 100644 --- a/example/build/core-layout.html +++ b/example/build/core-layout.html @@ -123,7 +123,10 @@
-

Core layout, alternative sidebar

This page has a configurable sidebar that is independent of the content.

+ +

Core layout, alternative sidebar

+ +

This page has a configurable sidebar that is independent of the content.

diff --git a/example/build/expired-page-with-owner.html b/example/build/expired-page-with-owner.html index 24cbf77..b7b8f3e 100644 --- a/example/build/expired-page-with-owner.html +++ b/example/build/expired-page-with-owner.html @@ -312,7 +312,9 @@
-

This is an expired page with owner

See the banner on this page.

+

This is an expired page with owner

+ +

See the banner on this page.

diff --git a/example/build/expired-page.html b/example/build/expired-page.html index 93a9f06..947318a 100644 --- a/example/build/expired-page.html +++ b/example/build/expired-page.html @@ -312,7 +312,9 @@
-

This is an expired page

See the banner on this page.

+

This is an expired page

+ +

See the banner on this page.

diff --git a/example/build/headings.html b/example/build/headings.html index cdee2d8..adab8d2 100644 --- a/example/build/headings.html +++ b/example/build/headings.html @@ -311,7 +311,14 @@
-

The title

A subheader

A h3 subheader

Another subheader

+

The title

+ +

A subheader

+ +

A h3 subheader

+ +

Another subheader

+
diff --git a/example/build/hidden-page.html b/example/build/hidden-page.html index 404b633..d1d75ce 100644 --- a/example/build/hidden-page.html +++ b/example/build/hidden-page.html @@ -312,8 +312,10 @@
-

This is a hidden page

Sometimes you don’t want a page to show up in main nagivation, for example, -when you want to link to it from footer. Setting hide_in_navigation: true +

This is a hidden page

+ +

Sometimes you don't want a page to show up in main nagivation, for example,
+when you want to link to it from footer. Setting hide_in_navigation: true
frontmatter will prevent a link being generated in the navigation.

diff --git a/example/build/index.html b/example/build/index.html index c8aaf22..45eea16 100644 --- a/example/build/index.html +++ b/example/build/index.html @@ -314,10 +314,18 @@
-

Hello, World!

Edit Me!

Open source/documentation/index.md in your favourite text editor and start editing!

-

You can write content in Markdown using all of the usual syntax that you’re used to!

-

To change the title of the page or include additional files you’ll need to edit source/index.html.md.erb.

+

Hello, World!

+ +

Edit Me!

+ +

Open source/documentation/index.md in your favourite text editor and start editing!

+ +

You can write content in Markdown using all of the usual syntax that you're used to!

+ +

To change the title of the page or include additional files you'll need to edit source/index.html.md.erb.

+

If you want slightly more control, you can always use HTML.

+

For more detail and troubleshooting, take a look at the README.md file in the root of this project.

@@ -328,6 +336,7 @@
+
diff --git a/example/build/nested-page/another-nested-nested-page/index.html b/example/build/nested-page/another-nested-nested-page/index.html index 1e8c6f5..5f4abfc 100644 --- a/example/build/nested-page/another-nested-nested-page/index.html +++ b/example/build/nested-page/another-nested-nested-page/index.html @@ -313,6 +313,7 @@

Another nested nested page

+
diff --git a/example/build/nested-page/another-nested-page/index.html b/example/build/nested-page/another-nested-page/index.html index 4b1e0fe..270eef4 100644 --- a/example/build/nested-page/another-nested-page/index.html +++ b/example/build/nested-page/another-nested-page/index.html @@ -313,6 +313,7 @@

Another nested page

+
diff --git a/example/build/nested-page/index.html b/example/build/nested-page/index.html index 5e67445..e9a88e3 100644 --- a/example/build/nested-page/index.html +++ b/example/build/nested-page/index.html @@ -313,6 +313,7 @@

Nested page

+
diff --git a/example/build/not-expired-page.html b/example/build/not-expired-page.html index fb70dae..15ea8bd 100644 --- a/example/build/not-expired-page.html +++ b/example/build/not-expired-page.html @@ -312,7 +312,9 @@
-

This is not an expired page

See the banner on this page.

+

This is not an expired page

+ +

See the banner on this page.

diff --git a/example/build/prevent-index-page.html b/example/build/prevent-index-page.html index 4380335..4b719c3 100644 --- a/example/build/prevent-index-page.html +++ b/example/build/prevent-index-page.html @@ -313,8 +313,10 @@
-

Un-indexed page

This page should not be indexed by search engines, because it contains a -<meta name="robots" content="noindex"> tag.

+

Un-indexed page

+ +

This page should not be indexed by search engines, because it contains a
+<meta name="robots" content="noindex"> tag.

```

Looks like it's whitespace only.

The example isn't that complicated though. I'll have a look at a more realistic example.

richardTowers commented 3 years ago

Tried the same thing on https://github.com/alphagov/paas-tech-docs, which is a more complicated example.

You can see the changes in this commit: https://github.com/alphagov/paas-tech-docs/commit/3775795f4b6384034be443341b8ae9a9bc124b44

There are a few bits in there which look like they might be breaking changes. For example links to URL fragments change a bit (#deploy-a-node-js-app becomes #deploy-a-nodejs-app), which might break incoming links.

I think that would be okay, so long as this is a major version upgrade and we're clear in the release notes what's changing.

36degrees commented 3 years ago

@richardTowers are you able to share the steps you used to test the changes in the PaaS tech docs? Happy to test them in our tech docs as well.

The only other change that jumps out at me is the fact that the text within list items is now wrapped in paragraphs. I don't think it's an issue, per se – it's just a little odd.

richardTowers commented 3 years ago

It was something like:

36degrees commented 3 years ago

I've done a comparison using the GOV.UK Frontend Docs repo.

I've stuck a diff of the changes between the two in a gist here: https://gist.github.com/36degrees/59edb6db381828010b7c491539ffcda3

Text in list items are wrapped in paragraphs

As mentioned in my previous comment, list items are now consistently wrapped in paragraphs:

For example:

<ol>
<li>Install GOV.UK Frontend.</li>
<li>Add the HTML for a component to your application.</li>
<li>Get the CSS working.</li>
<li>Get the font and images working.</li>
<li>Get the JavaScript working.</li>
</ol>

is now:

<ol>
  <li><p>Install GOV.UK Frontend.</p>
  </li>
  <li><p>Add the HTML for a component to your application.</p>
  </li>
  <li><p>Get the CSS working.</p>
  </li>
  <li><p>Get the font and images working.</p>
  </li>
  <li><p>Get the JavaScript working.</p>
  </li>
</ol>

With the current stylesheet, this means that there is more spacing between list items as paragraphs have the user-agent default margins applied to them.

Before:

Screenshot 2021-03-08 at 10 54 30

After:

Screenshot 2021-03-08 at 10 54 35

Hard-wrapped paragraphs are rendered with line breaks where they are wrapped.

Single line breaks within paragraphs are no longer being ignored.

This means that where paragraphs have been hard-wrapped (for example, to wrap consistently at 80 characters), those line breaks appear in the HTML as <br>s, affecting the way the text is displayed in the rendered HTML.

For example:

You will need to to include an [HTML5 shiv](https://github.com/aFarkas/html5shiv)
which allows the 'semantic' HTML elements introduced in HTML5 to be used in
older browsers which do not natively support them.

These elements include `article`, `aside`, `figcaption`, `figure`, `footer`,
`header`, `main`, `mark`, `nav`, `section`, and `time`.

To improve performance for users of modern browsers, you can wrap the shiv
include with conditional comments that target only the browsers that need it:

```html
<!--[if lt IE 9]>
  <script src="/path/tohtml5shiv.js"></script>
<![endif]-->

Note that some libraries such as Modernizr may already include html5shiv.


used to render like this:

![Screenshot 2021-03-08 at 10 54 07](https://user-images.githubusercontent.com/121939/110327681-2eed4380-8012-11eb-9c89-526e0010ac68.png)

but now renders like this:

<img width="792" alt="Screenshot 2021-03-08 at 10 54 15" src="https://user-images.githubusercontent.com/121939/110327699-34e32480-8012-11eb-94da-f7769f0871c3.png">

### Smart quotes

Smart quotes (straight quotes like `''` or `""` being automatically converted to their typographic equivalents `‘’` and `“”`) no longer seem to work.

Before:

Use ‘compatibility mode’ if you want to use GOV.UK Frontend components and an old framework together in a service.


After:

Use 'compatibility mode' if you want to use GOV.UK Frontend components and an old framework together in a service.

36degrees commented 3 years ago

The IDs on headings that appear multiple times are generated in a completely different way.

For example, given a heading structure like this:

## govuk-image-url-function
### default value

## govuk-font-url-function
### default value

The IDs for the two default value headings were previously generated by combining them with their parents, like govuk-image-url-function-default-value and govuk-font-url-function-default-value.

They're now generated using an incremental counter, so they might be e.g. default-value-3 and default-value-4 instead.

Aside from breaking existing anchors, this makes it a lot less likely that IDs will remain associated with the correct heading as the page changes. If, for example, a new entry was added between govuk-image-url-function govuk-font-url-function with another default value heading, the IDs would be remapped and default-value-4 would end up pointing to the default value heading within the new entry.

Given the tech docs template is often used for API documentation, where this sort of heading structure is quite common, I think this probably needs addressing.

richardTowers commented 3 years ago

Hmm, quite a few annoying edge cases. I wonder if it's worth revisiting fixing the accessibility issues in Redcarpet instead of trying to make kramdown behave like Redcarpet does.

I don't think kramdown fixes the other accessibility issue (scrollable code snippets aren't keyboard accessible).

Is the only thing this major change buys us is changing cells in table headings from <td>s to <th>s?

richardTowers commented 3 years ago

I found this commit in redcarpet from like 8 days ago 🤔

https://github.com/vmg/redcarpet/commit/27dfb2a738a23aadd286ac9e7ecd61c4545d29de

richardTowers commented 3 years ago

Here's a (hacky) approach to implementing row-level headings without switching the markdown processor: https://github.com/alphagov/tech-docs-gem/pull/214

richardTowers commented 3 years ago

This looks like it would have quite a high chance of introducing breaking changes. Particularly the things Ollie spotted around hard-wrapped paragraphs and IDs on headings.

I think we've managed to fix the accessibility issues that motivated this in the first place in #214 and #215, so let's close this.

richardTowers commented 3 years ago

In updating the govuk-developer-docs, I noticed that we already override the markdown processor to kramdown, and that there's a hard_wrap: false setting in the config. So that would fix one of the issues that prevented us from merging this.

The heading IDs issue still seems like a roadblock, but it might be possible to work around that.