alphagov / tech-docs-gem

Gem to distribute the tech docs project
https://tdt-documentation.london.cloudapps.digital/
MIT License
15 stars 38 forks source link

Accept last_reviewed_on field as a string input #302

Open NickWalt01 opened 2 years ago

NickWalt01 commented 2 years ago

What should change

Accept the use of last_reviewed_on field as a string input in any string variation; 2021/02/23, March 2027 23, January 13 2020, rather than a Date data type only. Whilst updating dependencies in the repositories listed in the next section I have found exceptions regarding the last_reviewed_on field as shown here. It appears that middleman expects the Date to be 2021/02/23 as highlighted on this page but the embedded ruby in this repository expects it to be 2021-02-23 as a Date object. As this field causes an exception it makes the title field fail causing the real error to be hidden by middleman asking for H1 tags when <%= current_page.data.title %> is used in the html file.

It currently fails here and the fix is to accept the string that is converted to a Date object:

def format_string_date(string_date)
   Date.parse(string_date)
end

User need

At MoJ we are using the tech-docs-gem in this repository to publish our template documents based on this template repository. We have many html pages that are utilising the tech-docs-gem to generate the html sites.

I have modified the tech-docs-gem ruby code and tested it works with both a string and a Date object.

NickWalt01 commented 2 years ago

Hi @laurahiles any idea how long it will take to implement the change? I can share the code and test if it helps

richardTowers commented 1 year ago

Thanks for raising this Nick. I've done a bit of digging, and to summarise the issue as I understand it:

I have an untested hypothesis that this is fixed in middleman 4.4.3, because they switched to ::YAML.safe_load(..., permitted_classes: [Date, ...], ...) which should mean formats that parse as dates are allowed now (e.g. 1970-01-01).

You could still make a strong argument that this gem should support both strings and dates, since middleman's docs clearly suggest that weird stringy date formats are permitted.

richardTowers commented 1 year ago

@LeePorte reports that the issue is still present with middleman 4.4.3 however.

NickWalt01 commented 1 year ago

Hello @richardTowers thanks for having a look at this.

Your summary make perfect sense, I do remember the load_file to unsafe_load_file change causing an issue here.

The repo is using Middleman 4.4.2 at the moment, Gemfile.lock, so if it does get fixed in that dependency I can bump the version but we do have some other dependency issues in the Gemfile. The repo is using govuk_tech_docs (3.2.1). Due to these issues we paused updating the repo.

I do not have my branch with the fix anymore. Is it likely a fix will be applied?

richardTowers commented 1 year ago

It's at least on our radar now, I can't promise we'll get to it immediately though.

richardTowers commented 1 year ago

I've raised https://github.com/middleman/middleman/issues/2614 with middleman. We should still consider supporting strings, irrespective of whether they decide to fix this.

lfdebrux commented 1 year ago

Ah, so has this broken building docs completely now?

lfdebrux commented 1 year ago

You could still make a strong argument that this gem should support both strings and dates, since middleman's docs clearly suggest that weird stringy date formats are permitted.

Could you point me towards where this is documented?

NickWalt01 commented 1 year ago

Hi @lfdebrux my notes on the topic are here. I am not sure where the documentation is for Middleman. There is this page that states: "If you want, you can specify a full date and time as a date entry in the front matter, to help with ordering multiple posts from the same day. " and has the date as date: 2011/10/18

lfdebrux commented 1 year ago

Hi @lfdebrux my notes on the topic are here. I am not sure where the documentation is for Middleman. There is this page that states: "If you want, you can specify a full date and time as a date entry in the front matter, to help with ordering multiple posts from the same day. " and has the date as date: 2011/10/18

Cool, thanks for the added context. Apologies, when I first looked at this ticket I thought it was solely a feature request, I didn't realise that the issue was preventing the docs from building the site even if the date was formatted correctly.

My main concern in allowing a string for the last_reviewed_on field is that it might mean we get the whole issue with ambiguously written dates; Ruby's Date.parse seems a little too flexible [1], it's not clear to me whether the result of parsing say 04/02/2020 is guaranteed to always be 4th February 2020, or if it is locale dependent. I also noticed that it says that 'if string does not specify a valid date, the result is unpredictable', which is worrying!

I think it might be safer to use Date.strptime(string, '%Y-%m-%d') instead, restricting the kinds of dates accepted by last_reviewed_date; does that sound reasonable @NickWalt01 @richardTowers?

NickWalt01 commented 1 year ago

Hi @lfdebrux

Date.strptime("2022-08-22", '%Y-%m-%d') is accepted as a valid date, but Date.strptime("2022/08/22", '%Y-%m-%d') is not valid, so for me yes that is acceptable and I agree that it is better to restrict the date format that is used in the doc files to YYYY-MM-DD

richardTowers commented 1 year ago

I'd be fine with restricting dates to '%Y-%m-%d', but I think there's a tricky interaction here between ruby / yaml / middleman / tech docs gem.

The issue is that if you put a date in your frontmatter like:

---
date: 1970-01-01
---

Middleman will parse this YAML using YAML.load, which will now error in ruby >=3.1 because of the switch to safe_load. That's a bug in middleman in my opinion. Users of middleman that don't use the tech docs gem can work around this by using any format which ruby doesn't interpret as a date, but the tech docs gem actually wants a date so our users are between a rock and a hard place.

I don't think the tech docs gem gets involved early enough in the loading of a template to fiddle with the YAML frontmatter before it's parsed (but maybe it does?).

I guess we could also consider monkey patching middleman while we wait for a release... 🙈

NickWalt01 commented 1 year ago

Looking at the Middleman commit history and the last tag release date, I can't see a new release soon, happy to wait for a new one as the docs work with the older version of Ruby, but 2.7 will be phased out soon:

Ruby 2.7 status: security maintenance release date: 2019-12-25

Ruby 2.6 status: eol release date: 2018-12-25 EOL date: 2022-04-12

lfdebrux commented 1 year ago

Right, thanks for spelling it out for me @richardTowers. How frustrating.

richardTowers commented 1 year ago

There's now a release of Middleman with support for Ruby 3.1 - v4.5.0. In theory, this bug should be fixed on middleman 4.5.0, but I haven't tested it yet.

NickWalt01 commented 1 year ago

Hi @richardTowers thanks for the update, nice, we will give it a try with the new versions

stevenjmesser commented 1 year ago

There's now a release of Middleman with support for Ruby 3.1 - v4.5.0. In theory, this bug should be fixed on middleman 4.5.0, but I haven't tested it yet.

I ran bundle update on design-system-team-docs and it seems to have fixed the problem using YYYY-MM-DD. I had to lock the contracts gem at 0.16.1 because 0.17 errored, but all seems present and correct. Relevant changes in this PR.