duo-labs / markflow

Make your Markdown sparkle!
Apache License 2.0

MarkFlow

Welcome to MarkFlow. This tool automatically reformats your Markdown so that files are consistently formatted and the source looks similar to the HTML that would be generated from it.

Quickstart

To use this tool, install it with pip then run markflow:

pip install markflow
markflow SOMETHING.md

To install from source, assuming you already have poetry installed, run the following from the project directory:

poetry install
poetry run markflow

Just want to see if there will be any changes? Use the --check flag:

markflow --check $PATH_TO_MARKDOWN_FILE

For a full list of features, see the built-in help:

markflow --help

Enforced Rules

The tool enforces the following rules for each type of Markdown section. For all sections, trailing spaces on each line are removed. It also ensures that Markdown files end with a single newline and that all line endings are '\n'.
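
The whitespace rules above can be sketched in a few lines of Python. This is only an illustration of the rule, not MarkFlow's implementation, and `normalize_whitespace` is a hypothetical name.

```python
def normalize_whitespace(text: str) -> str:
    """Hypothetical helper: strip trailing whitespace and normalize newlines."""
    # splitlines() splits on \n, \r\n and \r (and a few rarer separators),
    # so rejoining with "\n" converts every line ending in one pass.
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines at the end so exactly one newline terminates the file.
    while lines and not lines[-1]:
        lines.pop()
    return "\n".join(lines) + "\n"


print(repr(normalize_whitespace("a  \r\nb\t\n\n\n")))
```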

This tool uses the Markdown standard defined by CommonMark 0.29. It is expected to evolve with the standard and this section will be updated as support is added. If you notice any discrepancies, please open an issue.

Block Quotes

Block quotes are fixed up with proper indentation markers for indented quotes, quote indicators have any space between them removed, and unescaped > that could be confused with quote markers are escaped. e.g.:

>
> > Text >
> >
>
> > Ice Cream \> 0O0>
>

becomes:

>
>> Text \>
>>
>
>> Ice Cream \>  0O0>
>
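
The marker clean-up shown above can be approximated as follows. This is a minimal sketch under simplifying assumptions: `collapse_quote_markers` is a hypothetical name, it only collapses the leading quote markers, and it deliberately leaves out the escaping of stray > characters in the text.

```python
import re

# One or more ">" markers at the start of the line, each optionally
# preceded by spaces or tabs.
_MARKERS = re.compile(r"^(?:[ \t]*>)+")


def collapse_quote_markers(line: str) -> str:
    """Hypothetical helper: rewrite '> > text' as '>> text'."""
    match = _MARKERS.match(line)
    if match is None:
        return line  # Not a block quote line; leave it alone.
    depth = match.group().count(">")
    rest = line[match.end():].strip()
    # Empty quote lines keep only their markers.
    return ">" * depth + (f" {rest}" if rest else "")


for sample in [">", "> > Text", "> >"]:
    print(collapse_quote_markers(sample))
```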

Code Blocks

Fenced code blocks have any whitespace stripped from their markers and are otherwise printed unchanged. e.g.:

``` markdown
# Markdown code
```

becomes

```markdown
# Markdown code
```

Indented code blocks simply have their trailing whitespace removed.

Footnotes (or Link Reference Definitions)

Footnotes will have their whitespace corrected and their titles wrapped. The tool will however respect what line URLs should appear on, even if they overflow. For example, the next two examples would be unchanged.

[really_really_really_long_link_that_could_go_on_a_new_line]: /but/doesnt/because/the/tool/understands/that/you/may/not/want/that
[short_link]:
/that/stays/on/separate/lines
'Even if title would fit'

Titles will be kept on whatever line you write them on, as long as they fit on that line without wrapping. e.g.:

[really_really_really_long_link_that_could_go_on_a_new_line]: /but/doesnt/because/the/tool/understands/that/you/may/not/want/that "But the title is moved to the next line and itself is wrapped because it is also really long."

becomes:

[really_really_really_long_link_that_could_go_on_a_new_line]: /but/doesnt/because/the/tool/understands/that/you/may/not/want/that
"But the title is moved to the next line and itself is wrapped because it is also really
long."
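
The title handling in this example can be sketched with the standard library's `textwrap`. This is an assumption-laden illustration rather than the tool's code: `place_title` is a hypothetical name, and it only covers the case where the title was written on the same line as the URL.

```python
import textwrap


def place_title(definition: str, title: str, width: int = 88) -> str:
    """Hypothetical helper: keep the title inline if it fits, else wrap it below."""
    one_line = f"{definition} {title}"
    if len(one_line) <= width:
        return one_line
    # The definition line is never split, even if it overflows on its own;
    # only the title is wrapped onto the following lines.
    wrapped = textwrap.wrap(
        title, width=width, break_long_words=False, break_on_hyphens=False
    )
    return "\n".join([definition, *wrapped])


definition = (
    "[really_really_really_long_link_that_could_go_on_a_new_line]: "
    "/but/doesnt/because/the/tool/understands/that/you/may/not/want/that"
)
title = (
    '"But the title is moved to the next line and itself is wrapped '
    'because it is also really long."'
)
print(place_title(definition, title))
```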

Headings

Heading lines begin and end with no whitespace. If you're using ATX headings (leading #s), the tool will also correct missing or extra spaces between the octothorpes and the heading text.

#Non-Standard Heading

becomes

# Non-Standard Heading
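
A rough sketch of the ATX correction, assuming a single regular expression is enough for the simple case. `fix_atx_heading` is a hypothetical name, and optional closing #s are not handled.

```python
import re

# Up to six "#" characters (not followed by a seventh), then the heading text.
_ATX = re.compile(r"^\s*(#{1,6})(?!#)\s*(.*?)\s*$")


def fix_atx_heading(line: str) -> str:
    """Hypothetical helper: put exactly one space between the #s and the text."""
    match = _ATX.match(line)
    if match is None:
        return line  # Not an ATX heading; leave it untouched.
    hashes, text = match.groups()
    return f"{hashes} {text}" if text else hashes


print(fix_atx_heading("#Non-Standard Heading"))
```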

If you are using setext headings (i.e., underlined headings), they will automatically be fixed to ensure underlining matches the heading length. e.g.:

Heading 1
--

becomes

Heading 1
---------

If you have a heading that extends beyond an entire line, MarkFlow will wrap it for you.

This is a really long heading that I had to make up so that it would be at least 88 characters long
--

becomes

This is a really long heading that I had to make up so that it would be at least 88
characters long
-----------------------------------------------------------------------------------
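
Both setext examples follow the same pattern: wrap the heading text, then underline to the length of the longest resulting line. A minimal sketch, assuming `textwrap` wraps closely enough to the tool for illustration (`format_setext_heading` is a hypothetical name):

```python
import textwrap


def format_setext_heading(text: str, underline_char: str = "-", width: int = 88) -> str:
    """Hypothetical helper: wrap a setext heading and size its underline."""
    lines = textwrap.wrap(
        " ".join(text.split()),
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )
    if not lines:
        raise ValueError("heading text must not be empty")
    # The underline matches the longest line of the (possibly wrapped) heading.
    underline = underline_char * max(len(line) for line in lines)
    return "\n".join([*lines, underline])


print(format_setext_heading("Heading 1"))
```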

Lists

Lists will be corrected to proper indentation. In addition, ordered lists will be properly numbered and bullet lists will be reformatted to use consistent bullets. Line lengths are also enforced. e.g.:

2. One
    * Asterisk
  - Dash
1. Two
5. Three

becomes

2. One
  * Asterisk
  * Dash
3. Two
4. Three
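
The renumbering in this example keeps the first number and counts up from there. The sketch below illustrates only that part, under the assumption that top-level items start in column 0; `renumber_top_level` is a hypothetical name, and nested items, bullets, and indentation are passed through untouched.

```python
import re

# A top-level ordered item: digits, "." or ")", whitespace, then the text.
_ORDERED = re.compile(r"^(\d+)([.)])\s+(.*)$")


def renumber_top_level(lines):
    """Hypothetical helper: renumber top-level ordered items sequentially."""
    result, next_number = [], None
    for line in lines:
        match = _ORDERED.match(line)
        if match is None:
            result.append(line)  # Nested or non-list line: unchanged here.
            continue
        number, delimiter, text = match.groups()
        if next_number is None:
            next_number = int(number)  # The first item sets the start.
        result.append(f"{next_number}{delimiter} {text}")
        next_number += 1
    return result


print("\n".join(renumber_top_level(["2. One", "1. Two", "5. Three"])))
```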

CommonMark doesn't allow lists to start with 0. That's not really a big deal for this tool, so we are OK with that. If this causes you issues, please let us know by opening an issue.

Paragraphs

Paragraphs are reformatted to ensure they fit the line length. URLs and footnotes are properly split across lines. Inline code is always kept on a single line. e.g. (assuming a line length of 1):

test `test =
1` [url](http://example.com)

becomes:

test
`test = 1`
[url](
http://example.com)
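
The inline-code part of this rule can be sketched as a greedy wrapper that treats each code span as one unbreakable token. This is a simplified illustration with a hypothetical name, `wrap_keeping_code`; unlike the tool, it does not split links, so the URL in the example stays on one line.

```python
import re


def wrap_keeping_code(text: str, width: int = 88) -> str:
    """Hypothetical helper: greedy wrap that never splits inline code spans."""
    # Code spans are matched first so they become single tokens; a lone
    # unmatched backtick falls through to the last alternative.
    tokens = re.findall(r"`[^`]*`|[^\s`]+|`", text)
    # Collapse newlines and repeated spaces inside each token.
    tokens = [" ".join(token.split()) for token in tokens]
    lines, current = [], ""
    for token in tokens:
        candidate = f"{current} {token}" if current else token
        if current and len(candidate) > width:
            lines.append(current)
            current = token
        else:
            current = candidate
    if current:
        lines.append(current)
    return "\n".join(lines)


print(wrap_keeping_code("test `test =\n1` [url](http://example.com)", width=1))
```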

Separators

Separating lines (i.e., blank lines) contain only newlines; any horizontal whitespace is removed.

Tables

Tables are reformatted to ensure consistent column widths, headings are centered, and every cell has at least one space between its contents and the column separators. Alignment is supported too! e.g.:

|L|C|R|N|
|:--|:-:|--:|---|
|a|a|a|a|
|aa|aa|aa|aa|
|abcde|abcde|abcde|abcde|

becomes:

| L     |   C   |     R |   N   |
|:------|:-----:|------:|-------|
| a     |   a   |     a | a     |
| aa    |  aa   |    aa | aa    |
| abcde | abcde | abcde | abcde |
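
A sketch of how the output above could be produced, assuming simple tables without escaped pipes. `format_table` and `_pad` are hypothetical names. In this sketch, as in the example, a column's alignment applies to its heading too, and headings are centered only in columns without one.

```python
def _pad(text: str, width: int, align: str) -> str:
    extra = width - len(text)
    if align == "right":
        return " " * extra + text
    if align == "center":
        left = extra // 2  # Any odd leftover space goes on the right.
        return " " * left + text + " " * (extra - left)
    return text + " " * extra


def format_table(lines):
    """Hypothetical helper: pad and align a simple pipe table."""
    rows = [[c.strip() for c in line.strip().strip("|").split("|")] for line in lines]
    header, delimiter, body = rows[0], rows[1], rows[2:]
    aligns = []
    for spec in delimiter:
        left, right = spec.startswith(":"), spec.endswith(":")
        aligns.append("center" if left and right else "left" if left else "right" if right else None)
    widths = [max(len(row[i]) for row in [header, *body]) for i in range(len(header))]

    def render(row, default):
        cells = [_pad(c, w, a or default) for c, w, a in zip(row, widths, aligns)]
        return "| " + " | ".join(cells) + " |"

    rules = {"left": ":{}-", "center": ":{}:", "right": "-{}:", None: "-{}-"}
    rule = "|" + "|".join(rules[a].format("-" * w) for a, w in zip(aligns, widths)) + "|"
    return [render(header, "center"), rule, *(render(row, "left") for row in body)]


table = ["|L|C|R|N|", "|:--|:-:|--:|---|", "|a|a|a|a|", "|aa|aa|aa|aa|", "|abcde|abcde|abcde|abcde|"]
print("\n".join(format_table(table)))
```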

Thematic Breaks

Thematic breaks are extended or reduced to match the document's line length. If the line length is set to infinity, the tool instead uses 3 of the separating character, which must be one of -, _, or *.

-- - -

becomes:

----------------------------------------------------------------------------------------
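
As a small sketch of this rule, assuming an infinite line length is represented by `math.inf` (an assumption of this example, as is the name `format_thematic_break`):

```python
import math


def format_thematic_break(line: str, width: float = 88) -> str:
    """Hypothetical helper: stretch a thematic break to the line length."""
    stripped = line.replace(" ", "").replace("\t", "")
    chars = set(stripped)
    # A thematic break is three or more of a single -, _ or * character.
    if len(stripped) < 3 or len(chars) != 1 or not chars <= {"-", "_", "*"}:
        raise ValueError(f"not a thematic break: {line!r}")
    char = chars.pop()
    # With no finite line length, fall back to the minimal three characters.
    return char * (3 if math.isinf(width) else int(width))


print(format_thematic_break("-- - -"))
```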

API Reference

The tool also provides a function to reformat Markdown strings yourself.

from markflow import reformat_markdown_text

markdown = "   # Header 1"
nice_markdown = reformat_markdown_text(markdown, width=88)

Contributing

To contribute to this project, check out our contributing guide.

Issues

If you run into an issue while formatting a Markdown file, feel free to open an issue. Including the faulting file makes it much easier to debug.

This script can help in anonymizing your file if you have any confidential information in it.

#!/usr/bin/env python3
""" Anonymize file XXXX.md and output it to XXXX.out.md """
import pathlib
import random
import string

FILE_NAME = "XXXX.md"
input_path = pathlib.Path(FILE_NAME)
# Insert ".out" before the extension: "XXXX.md" -> "XXXX.out.md"
output_path = pathlib.Path(".out.".join(FILE_NAME.rsplit(".", maxsplit=1)))
text = input_path.read_text()
output = ""

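# Swap each ASCII letter for a random one of the same case; digits,
# punctuation, and whitespace are kept so the Markdown structure survives.
for char in text: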
    if char in string.ascii_lowercase:
        char = random.choice(string.ascii_lowercase)
    elif char in string.ascii_uppercase:
        char = random.choice(string.ascii_uppercase)
    output += char
output_path.write_text(output)

Implementation

To read more about how the tool works, check out the implementation outline.

Credits

This tool was inspired by a coworker not enjoying having to manually reformat Markdown files. He wanted a tool that would enforce formatting the way black does for Python code. That is why the line length default is 88.

A Bonus Note on Block Quote Formatting

Escaping > is especially important for the tool itself, as otherwise reformatted block quotes could end up nested too deeply. For instance, incorrect wrapping here could result in an extra level of block quote.

> Please don't wrap after this period. >
> Because I don't want to be a double quote.

becomes:

> Please don't wrap after this period.
> > Because I don't want to be a
> double quote.

which would format to:

> Please don't wrap after this period.
> > Because I don't want to be a
> > double quote.

Of course, if the tool tried that, it would throw an exception since it double checks that if it were to be rerun the output would not change, at which point, hopefully, dear reader, you would open an issue. But I get it if you don't want to. I've been there.
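
The double check described above is an idempotence test: format once, format the result again, and fail if the two differ. A minimal sketch, where `reformat_checked` and the `formatter` parameter are hypothetical names rather than part of MarkFlow's API:

```python
from typing import Callable


def reformat_checked(text: str, formatter: Callable[[str], str]) -> str:
    """Hypothetical helper: format text and verify the result is stable."""
    once = formatter(text)
    twice = formatter(once)
    if once != twice:
        # Rerunning the formatter changed its own output, so it is not idempotent.
        raise RuntimeError("formatter output changed when rerun")
    return once


print(repr(reformat_checked("  stable  ", str.strip)))
```

Any function from string to string can stand in for `formatter` here, which makes the check easy to try out on small examples.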