tcort / markdown-link-check

checks all of the hyperlinks in a markdown text to determine if they are alive or dead
ISC License

Add JSON output option #152

Open Dzordzu opened 3 years ago

Dzordzu commented 3 years ago

Depends on

Proposal

Add a new flag (e.g. --porcelain, as in git) that produces easy-to-parse output.

Reason

It's quite hard to parse any output from this amazing tool.

Suggested output format

$ERROR_CODE;$PATH_TO_FILE;$PATH_TO_LINK

Example Output

404;./people/mq/01_people.md;.../auth/01_general_concepts.md
403;./people/mq/02_people_address.md;../localization/01_general_concepts.md
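A minimal sketch of how a script might consume the proposed format, assuming one `code;file;link` record per line and no semicolons inside the paths. The helper name `dead_links_by_status` is hypothetical, and the sample records are copied from the example output above.

```shell
#!/bin/sh
# Hypothetical helper: print "file -> link" for every record whose status
# code equals $1. Reads semicolon-separated records from standard input.
dead_links_by_status() {
    awk -F';' -v code="$1" '$1 == code { print $2 " -> " $3 }'
}

# Sample records copied from the example output above.
printf '%s\n' \
    '404;./people/mq/01_people.md;.../auth/01_general_concepts.md' \
    '403;./people/mq/02_people_address.md;../localization/01_general_concepts.md' |
    dead_links_by_status 404
```

With a fixed field order like this, filtering by status stays a one-line awk call, which is the main appeal of a line-oriented format.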

Details

Name Value
Urgent No
Complexity ?
OP Support No (at least until 2021.06)

NicolasMassart commented 3 years ago

Two comments:

Git

I don't think we should copy Git's naming for these options. Git is not the software I would use as a UX example: we get used to it, but it's not intuitive. That's my own opinion, and I have used it almost since it first existed.

Output format

If we output a machine-readable format, why not use an already established one? I suggest outputting JSON directly, as it can be handled in most languages, and even in shell scripts jq or similar tools can be used. The option could then be -j, --json. Outputting JSON from JS software also feels very natural.

See, for instance, what the Heroku CLI does: https://devcenter.heroku.com/articles/heroku-cli-commands

This is also explained in https://clig.dev/#output, and I think we should follow these guidelines:

Display output as formatted JSON if --json is passed. JSON allows for more structure than plain text, so it makes it much easier to output and handle complex data structures. jq is a common tool for working with JSON on the command-line, and there is now a whole ecosystem of tools that output and manipulate JSON.
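As a rough illustration of the jq workflow the guideline describes, here is a sketch under stated assumptions: the tool has no JSON output in this thread, so the report below, its field names (`file`, `link`, `status`) and the helper `list_by_status` are all hypothetical.

```shell
#!/bin/sh
# Hypothetical JSON report. The array-of-objects shape and the field names
# "file", "link" and "status" are assumptions made for this sketch only.
report='[
  {"file": "./docs/a.md", "link": "https://example.com/missing", "status": 404},
  {"file": "./docs/b.md", "link": "../private.md", "status": 403}
]'

# Hypothetical helper: print "file;link" for every entry whose status is $1.
list_by_status() {
    jq -r --argjson status "$1" \
        '.[] | select(.status == $status) | "\(.file);\(.link)"'
}

printf '%s\n' "$report" | list_by_status 404
```

Because the filtering happens on named fields, adding more fields to the report later would not break a script written this way.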

Dzordzu commented 3 years ago
  1. I mentioned git only as an example, to illustrate what I meant. Any other flag would be fine (for example -j, as you suggested).
  2. Good point. JSON would be much better
coanor commented 1 year ago

I need this feature for a large document repository.

JSON is pretty; the current output format is hard to parse using simple grep/sed/awk skills. The following shell command would extract the 404 errors, but it does not give precisely the result I need:

sed -n '/^FILE: /,/[✖]/p' /path/to/deadlink-output.log
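One possible refinement of the command above, as a sketch only. It assumes the log contains a `FILE: <path>` line before each file's results and that each dead link appears on a line marked with `✖`, followed by the link and optionally a trailing `→ Status: <code>` note. That layout is an assumption about the log rather than a documented format, and `extract_dead_links` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: turn a saved log into "file;link" records.
extract_dead_links() {
    awk '
        /^FILE: / { file = substr($0, 7); next }  # remember the current file
        /✖/ {
            link = $0
            sub(/^.*✖[^ ]* /, "", link)  # drop everything up to the marker
            sub(/ → .*$/, "", link)      # drop a trailing status note, if any
            print file ";" link
        }
    '
}

# Made-up sample log that follows the assumed layout.
extract_dead_links <<'EOF'
FILE: ./docs/a.md
  [✓] https://example.com/ok
  [✖] https://example.com/missing → Status: 404
FILE: ./docs/b.md
  [✖] ../gone.md → Status: 400
EOF
```

If the log repeats dead links in a summary section, the same record may be printed twice; piping the result through `sort -u` would collapse such repeats.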