Drop Ruby as a build dependency

ffrank commented 5 months ago

We have been using Ruby for two things

a test script to find non-conforming YAML files (which has not worked in a long time and had been disabled)
the make yamlfmt target, which is a very hacky approach to having gofmt but for YAML, which is also an odd proposition, given that whitespace has semantic meaning in YAML

This change adds the Python based yamllint to our build dependencies. It has a much smaller footprint compared to bringing in Ruby, especially for folks who want to build mgmt on their regular work/play machines.

The test script now works and gives actionable error output.

The yamlfmt target essentially does the same thing now, though. No automatic "fixing" of YAML formats anymore, but then we probably also don't expect to write that much YAML any longer.

Linter settings were chose so that few changes are needed in existing YAML. It's another benefit of this approach that we can now curate our linter configuration.

ffrank commented 5 months ago

I have no idea why the file-mode.sh test suddenly fails in github CI. Works On My Machine.

purpleidea commented 5 months ago

I have no idea why the file-mode.sh test suddenly fails in github CI. Works On My Machine.

Race bugs in testing env. I will disable it.

purpleidea commented 5 months ago

then we probably also don't expect to write that much YAML any longer.

Quite true. I expect a few incoming config file things, but in general it's not as much of an issue.

ffrank commented 5 months ago

In github CI I realized we have two more ruby users: mdl and fpm. fpm is not so painful because it doesn't need to be part of the build, and if we want to run it in CI, github actions can still just run ruby setup actions.

However, mdl is an entire different story. I tried substituting it with pymarkdownlnt, which has a limitation around tables (I can work around), but also has an annoying bug that occurs when indented blocks appear in enumerations or bullet point lists, and TABs are used for indentation. This begs the questions:

Why do we insist on tabs in markdown? It's not best practice and we explicitly disable the MD010 rule that would warn us about this. Linting life would be much easier without tabs. Would spaces be so bad here?

(This mostly happens in docs/language-guide.md.)

purpleidea commented 5 months ago

Why do we insist on tabs

I love my tabs. For many reasons unfortunately.

Accessibility
Not having to hammer my space bar to indent
Easy of use for when you want to see more or less indents

...

purpleidea commented 5 months ago

(And in theory we should be adding more markdown docs as time goes on, so we definitely want to keep the validator.)

ffrank commented 5 months ago

To illustrate the issue in the previous comment:

$ pymarkdownlnt scan docs/language-guide.md 

Unexpected Error(BadTokenizationError): An unhandled error occurred processing the document.

The issue is a failed assertion in this parser. I believe that it's caused by a bug, and I have been looking into it, but debugging someone else's parser/lexer code is...challenging.

purpleidea commented 5 months ago

While I love the idea of dropping the use of any ruby, it being there is not personally blocking me atm.

But if you're really keen on this issue, maybe this would help though: https://github.com/gomarkdown/markdown ?

ffrank commented 5 months ago

Oh I definitely want to be sure we keep linting the markdown. We should even work towards disabling fewer rules imo.

So yeah I filed https://github.com/jackdewinter/pymarkdown/issues/1015 and will also look at the gomarkdown thingy.

ffrank commented 5 months ago

The tip of this now has a shape like I could imagine it. Let me know your thoughts please.

If you all find this an acceptable way forward, I will squash it down.

Note that I went for pipenv over just jamming pip packages into everyone's systems (except pipenv itself...apparently it's best practice to install it through pip). I hope this will be more sustainable (and not more fragile).

purpleidea commented 4 months ago

Wow, lots of neat work, thanks Felix!

I've just had a short read through as a diff of the whole thing, rather than each commit separately. There are clearly a bunch of great changes I can see already.

There are a few unknowns I'd need to dig into first. My first question is that if we're moving from ruby to python, is this a major improvement? It definitely seems to be from a packaging point of view. I'd like to rule out gomarkdown though first, at least for the mdl alternative.

Lastly, how did you want to squash? (I do not recommend squashing all of it into a single commit.) If there are a few stand-alone / easy commits to squash and/or merge early, maybe I can make this easier to review by merging those first?

I have a very large feature branch that I'll be merging shortly, and my goal is to merge this right after that. I don't think there are would be any major conflicts, but will do it in that order just to be safe.

Cheers!

ffrank commented 4 months ago

There are a few unknowns I'd need to dig into first. My first question is that if we're moving from ruby to python, is this a major improvement? It definitely seems to be from a packaging point of view. I'd like to rule out gomarkdown though first, at least for the mdl alternative.

Python is already a dependency, and no modern Linux box can function without python. It's a lot less noise on contributor's machines in this way. Ruby is rather obscure. Even Puppet users don't install it anymore, since Puppet vendors its own Ruby (go figure).

I will try replacing pymarkdown with gomarkdown before I proceed.

Lastly, how did you want to squash? (I do not recommend squashing all of it into a single commit.) If there are a few stand-alone / easy commits to squash and/or merge early, maybe I can make this easier to review by merging those first?

Mainly I want to get rid of some intermediary steps like ruby-yaml-fix -> yamllint -> yamlfix. Also if I go for gomarkdown, any mention of pymarkdown needs not be in our history.

But yes, otherwise I'm no fan of single-commit branches either. Some history can be helpful in archaeology down the line.

ffrank commented 4 months ago

So gomarkdown does not come with linting functionality. There is gomarkdown/mdfmt, but it's unmaintained and go build fails.

A slightly better maintained version is at https://github.com/shurcooL/markdownfmt/ and it does build, but it's not really customizable, and at a glance, it seems to do the opposite of what we want, e.g. joining our nice and carefully crafted 80 character line paragraphs into single long lines.

purpleidea commented 4 months ago

haha, okay! I'm okay with the python route for now if you think that's preferable. I'm a bit concerned about the open issue and wouldn't mind waiting until that's fixed before we swap it though.

Would you be able to rebase/squash your commits into a few separate ones based on logical functionality that you think makes sense and I'll review and merge those?

purpleidea commented 4 months ago

Or I could pick off the easy ones first if you'd like. LMK

ffrank commented 4 months ago

I will clean up and ping you.

ffrank commented 4 months ago

Okay, go.

ffrank commented 4 months ago

@purpleidea this is ready for review.

ffrank commented 3 months ago

Update, I'm in communication with @jackdewinter in https://github.com/jackdewinter/pymarkdown/issues/1015 in order to solve the limitation around tab characters. Once the fix works, it will allow us to remove the workaround from test-markdownlint.sh.

But I feel that we can merge as is now, and remove the workaround later, so this needs not be blocked.

purpleidea commented 3 months ago

I've cherry-picked a few of these commits. I think we should hold off on the markdown/yaml thing for the moment. I need to have a deeper look into it all. Thanks for the fun commits!

purpleidea / mgmt

Drop Ruby as a build dependency #740