Add markdownlint Automatic Auditing Feature

ywh555hhh commented 3 months ago

This is a Feature Request

What would you like to be added

I would like to add an automatic markdownlint auditing feature to our website project. This feature would automatically check and highlight potential formatting issues and errors in Markdown files during each commit or edit, ensuring consistency and adherence to formatting standards.

Why is this needed Introducing this feature has several important benefits for our project:

Improves the quality and consistency of documentation, reducing the manual review workload.
Prevents common Markdown formatting errors, such as missing headers and inconsistent formatting.
Enhances project maintenance efficiency, allowing team members to focus more on content creation and updates.

Comments We could consider using the markdownlint tool, which can be integrated into our CI/CD process to ensure automatic auditing on every commit. We can refer to official documentation and example projects for specific configuration and integration methods to implement this feature quickly.

Here are the three main concerns discussed regarding the introduction of the markdownlint automatic auditing feature:

Three Major Concerns

Extensive Updates Required for Existing Documents:
- Many current documents have been around without enforcing Markdown linting, leading to numerous formatting issues that would need significant updates to comply with new checks. This process is not only time and effort-intensive but also requires involvement from other team members who would need to review these updates to ensure nothing is accidentally broken during the process.
Impact on Contributor Experience:
- Many contributors might be accustomed to writing Markdown that looks fine locally but might not be entirely spec-compliant. This could lead to situations where their submissions appear fine locally, but build jobs fail on their PRs due to formatting issues, which can be frustrating for both occasional and regular contributors. Their efforts might be rejected due to seemingly trivial issues.
Technical Implementation and Integration Challenges:
- While introducing markdownlint as a tool is a great idea, the Kubernetes community currently uses Prow instead of GitHub Actions for automation processes. This presents a technical challenge on how to effectively run Markdownlint as a Prow job and have the results commented on the GitHub PR. This requires additional configuration and integration efforts, possibly involving the development of new tools or adapters to bridge these technological gaps.

k8s-ci-robot commented 3 months ago

This issue is currently awaiting triage.

SIG Docs takes a lead on issue triage for this website, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

ywh555hhh commented 3 months ago

Can you assign it to me?

stmcginnis commented 3 months ago

Having been someone that has added this to multiple projects in the past, and someone that is a big proponent of well strucutred and "correct" markdown, I'm a little mixed on this proposal.

There are a couple concerns I have for adding this to k/website.

The current docs have been around for quite some time without enforcing markdown linting. This means there are a lot of issues everywhere in the content that will need to be updated. This will cause a lot of churn and failing tests that will take some effort and attention. Not just for you to make the updates, but for the folks that will need to review those updates and verify nothing accidentally gets broken.

The other concern is the contributor experience. A lot of people are very used to writing markdown that works but is not quite fully spec compliant. I've found it can be very frustrating for casual (and even regular) contributors to write some docs that render just fine locally, only to have jobs fail on their PR for what really ends up being a trivial issue.

So no doubt this would improve the overall quality of our markdown. And it likely would uncover a few cases where there are rendering issues. But as a whole it needs to be decided if this is worth the effort, and if so, how to go about getting all of the existing content in a state that will pass linting without being too disruptive.

Okabe-Junya commented 3 months ago

I agree with your opinion, stmcginnis. While Markdownlint is an excellent tool and the effort to introduce it is a great idea, there will likely be some challenges.

Another concern is that the Kubernetes community uses Prow instead of GitHub Actions. Is there a good way to run Markdownlint as a Prow Job and have the results commented on the GitHub PR?

sftim commented 3 months ago

A halfway house is to have the job, and have it fail when new changes make a page subjectively “worse“, but still allow a merge.

ywh555hhh commented 3 months ago

I also believe that the concerns raised by the seniors are worth discussing, and I'd like to share my thoughts on the three concerns:

Indeed, this requires attention, and I suggest we could start by limiting the scope of affected markdown files. Perhaps we can initially focus only on README documents and later consider extending this to other markdown documents.
As for this point, markdownlint itself can help fix some formatting issues. Could we possibly delegate this task to a robot, or incorporate automatic fixes into the CI process?
I am not very familiar with this aspect yet, but I will make an effort to learn more about it in the future.

Okabe-Junya commented 3 months ago

@ywh555hhh

I was checking out some discussions on CI and automation. Some helpful PRs / discussions here:

https://kubernetes.slack.com/archives/C01672LSZL0/p1724091591242469?thread_ts=1724089319.657699&cid=C01672LSZL0
- Comment about GHA permissions from GitHub Admins. It seems they do not allow GHA with permission: write. (This is due to a past incident, but I am not familiar with this details.)
https://github.com/kubernetes/test-infra/pull/33333
- PR to introduce Markdownlint to etcd-io/website. I believe moving this PR forward will lead to its introduction in k/website as well.

Okabe-Junya commented 3 months ago

A halfway house is to have the job, and have it fail when new changes make a page subjectively “worse“, but still allow a merge.

Completely Agree. Linting the entire codebase from the first step will cause many CI errors and confusion.

ywh555hhh commented 3 months ago

@ywh555hhh

I was checking out some discussions on CI and automation. Some helpful PRs / discussions here:

https://kubernetes.slack.com/archives/C01672LSZL0/p1724091591242469?thread_ts=1724089319.657699&cid=C01672LSZL0

Comment about GHA permissions from GitHub Admins. It seems they do not allow GHA with permission: write. (This is due to a past incident, but I am not familiar with this details.)

etcd: Add etcd-website lint presubmit job test-infra#33333

PR to introduce Markdownlint to etcd-io/website. I believe moving this PR forward will lead to its introduction in k/website as well.

Thank you. I'll take the time to look at it as soon as possible

ivanvc commented 3 months ago

Hi, team. I see that my PR (kubernetes/test-infra#33333) was referenced from this issue. This issue may be related to the work we've been doing at the etcd website. We didn't have a Markdown linter. The main challenge is that enabling it across all the documents on the website will keep CI runs failing, as we don't have a plan/bandwidth to address the warnings across all files. However, we decided to do an incremental check, and we'll be checking (and returning an error) if there are violations in the changes from the PR. Otherwise, it will just raise a warning.

Please take a look at our issues and PRs. They may help set a path/decision on how to do it for k/website:

k8s-triage-robot commented 4 days ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue as fresh with /remove-lifecycle stale
Close this issue with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Okabe-Junya commented 4 days ago

folks, is anyone working on this issue?

ywh555hhh commented 4 days ago

Very busy recently

kubernetes / website

Add markdownlint Automatic Auditing Feature #47418