unifiedjs / unified

☔️ interface for parsing, inspecting, transforming, and serializing content through syntax trees
https://unifiedjs.com
MIT License

Rethink how `*-stringify` should work #197

Closed JounQin closed 2 years ago

JounQin commented 2 years ago

Personally, I love the idea of remark-prettier.

Because even with https://github.com/remarkjs/remark-preset-prettier, remark-stringify's default formatting, and the few options it offers, can still change the input.

For example, there is no option to preserve a hard break written with two trailing spaces:

Line.  <!-- two spaces here -->
Next line

It will always be formatted as:

Line.\
Next line

This is very unexpected for prettier users.

Originally posted by @JounQin in https://github.com/unifiedjs/unified/issues/196#issuecomment-1219284778

JounQin commented 2 years ago

I'd like to have a stringifier which preserves the original formatting as much as possible, except for what linters fix.

wooorm commented 2 years ago

This issue has a very broad title. In English, “rethink”, kinda means throw everything away, and start from scratch. I don’t think that’s needed? If you believe it is, some more information, perhaps in a discussion might be better suited? Your issue doesn’t explain what’s wrong with rehype-stringify for example. And it only lists one example you don’t like in remark-stringify.

For the point you mention: a\\nb vs a␠␠\nb, I have three opinions/points:

a) I believe your style is worse, because 1) editors trim trailing spaces away, 2) they are invisible, 3) backslashed line endings work in many other programming languages, 4) the support for them is great since CommonMark (more than 5 years)

b) like prettier, remark is also a formatter. It doesn’t give you an option on this (prettier doesn’t either). That’s the goal of these projects! To not overwhelm with options and pick a good, reasonable style

c) This seems like a small thing, that you can open an issue about in mdast-util-to-markdown? Or are there other things you are wondering about?

I'd like to have a stringifier which preserves the original formatting

This is neither prettier nor remark-stringify or rehype-stringify. I don’t think this is a good idea. As a programming community, we seem to be moving away from that towards opinionated tools that format things. rustfmt for example works the same. Or dprint. And prettier has saved millions of hours by choosing one style. I think that’s a good thing?

ChristianMurphy commented 2 years ago

I agree with @wooorm's point that the AST + stringifier already make remark a bit of a formatter. Adding on some more, ASTs in general will always lose some formatting. There are CSTs as well, which can preserve all formatting. Unified already can support CSTs, but there isn't a markdown CST project currently.
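To make the AST vs CST point concrete, here is a minimal sketch. It is a toy paragraph parser, not remark's parser, and the `keepRaw` flag and the `raw` field are hypothetical names used only for illustration. The AST-like node records only that a hard break exists, so the serializer has to pick one style; the CST-like node also keeps the exact source text, so the input can be echoed back.

```javascript
// Toy paragraph parser (an assumption for illustration, not remark's code).
// It splits on hard breaks written as two or more trailing spaces or as a
// backslash. `keepRaw` is a hypothetical switch between AST-like and
// CST-like output.
function parse(paragraph, keepRaw) {
  const nodes = []
  for (const part of paragraph.split(/( {2,}\n|\\\n)/)) {
    if (/^(?: {2,}|\\)\n$/.test(part)) {
      // AST-like: only the fact that there is a break.
      // CST-like: also the exact characters that were written.
      nodes.push(keepRaw ? {type: 'break', raw: part} : {type: 'break'})
    } else if (part !== '') {
      nodes.push({type: 'text', value: part})
    }
  }
  return nodes
}

// Without `raw`, the serializer must choose one style for every break.
// Here it picks the backslash form.
function stringify(nodes) {
  return nodes
    .map((node) => (node.type === 'break' ? (node.raw ?? '\\\n') : node.value))
    .join('')
}

const sample = 'Line.  \nNext line'
console.log(JSON.stringify(stringify(parse(sample, false))))
console.log(JSON.stringify(stringify(parse(sample, true))))
```

With `keepRaw` off, the two trailing spaces come back as a backslash break; with it on, the input is returned unchanged. The cost is that every tool working on the tree would have to understand the extra `raw` data, which is roughly why a CST would not be compatible with existing plugins.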

There has been some interest over the years (related discussions: 1, 2, 3, 4). We'd probably want to establish a new project/organization for that, as it would have a significantly different structure than remark does and would have no plugin compatibility.

wooorm commented 2 years ago

Also, more broadly, I definitely think remark-stringify / mdast-util-to-markdown can be improved! A lot, probably. There’s some open issues on my ideas for that: https://github.com/syntax-tree/mdast-util-to-markdown/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc. With https://github.com/syntax-tree/mdast-util-to-markdown/commit/52c18b4c59cd8e1580a6b837b22a8fa890fe7654, I added the basis for line wrapping at a certain size. The core is there but I haven’t had time to expand on it yet except for JSX, to choose whether to print all attributes on one line, or each on their own line: https://github.com/syntax-tree/mdast-util-mdx-jsx/commit/7eede29ea517a3152db158faaed34fbb3d4f1076.

JounQin commented 2 years ago

In English, “rethink”, kinda means throw everything away, and start from scratch. I don’t think that’s needed?

I'm sorry if it is confusing, I'm not a native English speaker.

a) I believe your style is worse

I agree. However, the source code/docs were not written by me, and it's a matter of personal taste. If one style should never be used then it shouldn't exist at all. The fact is that it is a spec-documented style.

b) like prettier, remark is also a formatter. It doesn’t give you an option on this (prettier doesn’t either). That’s the goal of these projects! To not overwhelm with options and pick a good, reasonable style

remark itself is not a formatter, but remark-stringify is? So maybe we should create a new formatter to preserve the styles as much as possible, except for what linters report?

c) This seems like a small thing, that you can open an issue about in mdast-util-to-markdown? Or are there other things you are wondering about?

The problem is I just want to fix the error reported by lint rule A, but the whole document gets formatted, and the result can be very unexpected.

This is neither prettier nor remark-stringify or rehype-stringify. I don’t think this is a good idea. As a programming community, we seem to be moving away from that towards opinionated tools that format things.

I think we have an X-Y problem here now.

My actual question X is how to fix only the lint errors that are reported, but I asked question Y, whether we should have a new stringifier. 🤣🤣

My bad. Any idea how to fix problem X? Problem X occurs when I integrate remark-lint within eslint-mdx: I only want to fix the lint issues that are reported, but the whole document gets formatted, and I don't know how to fix that easily.

wooorm commented 2 years ago

If one style should never be used then it shouldn't exist at all.

I don’t understand this. There are many things in JavaScript, say, eval, extending prototypes, creating string object with new String("stuff"), millions of spaces, etc, that shouldn’t be used, while being valid?

the source code/docs were not written by me,

From your previous comments, I thought you liked Prettier. But, Prettier also formats almost everything into a single consistent style, regardless of what was written, with only a few options. So it seems like you don’t like prettier either?

Why are you using remark-stringify if you don’t want to format markdown?

So maybe we should create a new formatter to preserve the styles as much as possible, except for what linters report?

Please read https://github.com/unifiedjs/unified/issues/197#issuecomment-1219786162.

remark itself is not a formatter, but remark-stringify is?

remark-stringify formats an AST. remark as a whole can be used with remark-stringify to format an AST.

The problem is I just want to fix the error reported by lint rule A, but the whole document gets formatted, and the result can be very unexpected.

Then you are only interested in a tiny part of unified: remark-parse + remark-lint. That is fine: don’t use the rest of it. But remark is more powerful than that, I recommend benefitting from it. I recommend using remark-stringify to format your markdown. Similar to how you use ESLint + prettier.

but the whole document is formatted and I don't know how to fix it easily.

Then don’t use remark-stringify. Please see the diagram here: https://github.com/unifiedjs/unified#overview. You can use the parse and run phases without using the stringify phase. And then you use the messages with the original file to format yourself.
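As a rough sketch of what "parse and run without stringify" means, the toy code below stands in for the two phases. None of it is unified's actual API: `parse`, `checkListMarkers`, and the message shape are hypothetical, and the rule itself (warn about `*` list markers) is only an example. The point is that checking produces messages that refer to positions in the original text, and the text itself is never re-serialized.

```javascript
// Toy stand-in for the parse phase (assumption, not unified's API):
// one node per line, remembering which line it came from.
function parse(source) {
  return source
    .split('\n')
    .map((value, index) => ({type: 'line', value, line: index + 1}))
}

// Toy stand-in for a lint plugin in the run phase (hypothetical rule):
// warn about `*` list markers. It only reads the tree and reports.
function checkListMarkers(tree, messages) {
  for (const node of tree) {
    if (node.value.startsWith('* ')) {
      messages.push({line: node.line, column: 1, reason: 'Use `-` as list marker'})
    }
  }
}

const source = 'Line.  \nNext line\n\n* item\n'
const messages = []
checkListMarkers(parse(source), messages)

// No stringify phase ran, so `source` still has its two trailing spaces.
for (const message of messages) {
  console.log(`${message.line}:${message.column} ${message.reason}`)
}
```

What is left to build is the last step: turning each message into an edit of the original text, which only works if the message says precisely what to change and where.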

JounQin commented 2 years ago

I don’t understand this. There are many things in JavaScript, say, eval, extending prototypes, creating string object with new String("stuff"), millions of spaces, etc, that shouldn’t be used, while being valid?

So a␠␠\nb should be allowed, and not changed. Of course, it was my mistake to choose remark-stringify previously.

So it seems like you don’t like prettier either? Why are you using remark-stringify if you don’t want to format markdown?

I like prettier. 😂

I mean, when I contribute to another project, I want to introduce remark-lint with automatic fixing. The original maintainer prefers a␠␠\nb, but when I use remark . --quiet --frail --output, it uses remark-stringify by default. Then every a␠␠\nb is reformatted as a\\nb, which is very unexpected. I just want to fix the incorrect formatting reported by remark-lint plugins.

Please read https://github.com/unifiedjs/unified/issues/197#issuecomment-1219786162.

Thanks, I read it, and I understand this is a limitation of the AST. I'm just pointing out that --output with the default remark-stringify can be unexpected.

Then don’t use remark-stringify.

Right, that's what I'm saying for remark-prettier. 🤣

You can use the parse and run phases without using the stringify phase. And then you use the messages with the original file to format yourself.

It's a hard task to fix all lint messages from remark-lint at once, although I haven't tried. And remark-cli with --output does not have this support, right?

wooorm commented 2 years ago

when I contribute to another project, I want to introduce remark-lint with automatic fixing

If you add prettier to a project, it will reformat files too. I don’t understand why you like prettier but don’t like remark if they both format files.

If you only want to add remark-lint, then do not use --output?

that's what I'm saying for remark-prettier.

We don’t need to invent Gulp/Grunt/npm scripts again to support two tools working after each other:

remark . -qfo && prettier . -w --loglevel warn

And remark-cli with --output does not have this support, right?

You were talking about using unified in eslint-mdx before. I don’t think you use remark-cli inside eslint-mdx?

JounQin commented 2 years ago

If you add prettier to a project, it will reformat files too. I don’t understand why you like prettier but don’t like remark if they both format files.

prettier will not change a␠␠\nb to a\\nb

If you only want to add remark-lint, then do not use --output?

How can I fix the remark-lint errors quickly?

You were talking about using unified in eslint-mdx before. I don’t think you use remark-cli inside eslint-mdx?

It's the same thing for remark --output vs auto fixing by eslint-mdx? I'm not using remark-cli inside eslint-mdx.

The problem is I want to fix remark-lint errors quickly without formatting all the unrelated content.

wooorm commented 2 years ago

prettier will not change X

Right. It is different. If you don’t like remark, use prettier, that’s fine.

How can I fix the remark-lint errors quickly?

You can’t.

It's the same thing for [...]

I don’t understand this question.

The problem is I want to fix remark-lint errors quickly without formatting all the unrelated content.

You can’t.

JounQin commented 2 years ago

Right. It is different. If you don’t like remark, use prettier, that’s fine.

I like remark-lint + prettier. 😂

I don’t understand this question.

Hmm... I'm not quite sure, but I thought remark . -qfo and eslint . --fix (with .remarkrc and eslint-mdx) give the same result. Please correct me if I'm wrong.

You can’t.

That's the question. 😂

wooorm commented 2 years ago

I like remark-lint + prettier. 😂

Then don’t use remark-stringify.

But I recommend you start liking remark-stringify. It can be improved. But I believe it is better at markdown than Prettier. Also, Prettier is using remark inside it. So almost everything you like about Prettier we can do too.

That's the question. 😂

give the same result.

Not really. They can do similar things. But ESLint/prettier are each one single step. unified is all the tools.

It is very important to understand these diagrams I posted before.

ESLint/prettier are one step:

|     TS     |     babel   |    eslint    |    prettier   |
|            |             |              |               |
|    tree    |     tree    |     tree     |     tree      |
|   /    \   |    /    \   |    /    \    |    /    \     |
|  /      \  |   /      \  |   /      \   |   /      \    |
|string    string        string        string       string|

unified:

|                       unified                           |
|   parse    |  transform  |     check    |    format     |
|            |             |              |               |
|    tree -------- tree -------- tree --------- tree      |
|   /                                               \     |
|  /                                                 \    |
|string                                             string|

unified (this package) is not like eslint. unified is not like prettier. It supports plugins that do things with the AST. It supports plugins that check things, like eslint. It supports plugins that format things, like Prettier. Plugins that inject a table of contents (remark-toc). Plugins that turn HTML into markdown (rehype-remark). It is impossible for remark-lint (one small step in the whole chain) to do fixes if the AST represents a completely different file than the input file.

You can build something on top of unified that does these things! But unified (this package) itself does not do that. You can use unified inside prettier (they already do). You can use unified inside eslint (they already do in eslint-markdown, you already do in eslint-mdx).


This original issue was about *-stringify in this package, so the core unified package. stringify in this package does many things currently well:

I don’t think this package needs to change. Of course, if you want to improve other packages, feel free to raise issues there.

JounQin commented 2 years ago

Then don’t use remark-stringify.

But I recommend you start liking remark-stringify. It can be improved. But I believe it is better at markdown than Prettier. Also, Prettier is using remark inside it. So almost everything you like about Prettier we can do too.

I personally don't have an opinion on how markdown is stringified. I'm working on a project in cooperation with others, they may prefer a␠␠\nb over a\\nb, and I don't want to judge. The gap between remark-cli and prettier is remark-lint. prettier is not a linter, so it does not handle remark-lint reports. remark-cli is a linter + formatter, but its formatter changes the input unexpectedly.

Not really. They can do similar things. But ESLint/prettier are each one single step. unified is all the tools. unified (this package) is not like eslint. unified is not like prettier.

I understand your position. That's OK, and that's why I'm raising this issue for discussion.

I don’t think this package needs to change. Of course, if you want to improve other packages, feel free to raise issues there.

I think I'm going to write another remark plugin doing what you proposed previously without remark-stringify:

You can use the parse and run phases without using the stringify phase. And then you use the messages with the original file to format yourself.


Thanks for your patience and long time guidance here! Cheers! 🍺

Closing with ❤️

github-actions[bot] commented 2 years ago

Hi! This was closed. Team: If this was fixed, please add phase/solved. Otherwise, please add one of the no/* labels.

github-actions[bot] commented 2 years ago

Hi team! Could you describe why this has been marked as external?

Thanks, — bb

wooorm commented 2 years ago

I marked this issue as external because this does not need a change in this package. Some parts of the discussion can be solved by a) adding an option to remark-stringify to use hard breaks with spaces instead of an escape, b) adding actual and expected to remark-lint rules. That way, like other messages, you can collect those and patch them in the original document.
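Point b) could look roughly like the sketch below. The message shape is an assumption made for illustration: `start` and `end` are offsets into the original text, and `actual` and `expected` are plain strings, which may not match how real remark-lint messages expose these fields. `applyPatches` is a hypothetical helper, not an existing API.

```javascript
// Hypothetical helper: applies fixes described by lint messages directly to
// the original text, leaving everything else byte-for-byte as written.
// Assumed message shape: {start, end, actual, expected}, with offsets into
// `source`. Overlapping patches are not handled in this sketch.
function applyPatches(source, messages) {
  const patches = messages
    // Skip messages without a fix, or whose `actual` no longer matches.
    .filter(
      (message) =>
        typeof message.expected === 'string' &&
        source.slice(message.start, message.end) === message.actual
    )
    // Patch from the end so that earlier offsets stay valid.
    .sort((a, b) => b.start - a.start)

  let result = source
  for (const patch of patches) {
    result =
      result.slice(0, patch.start) + patch.expected + result.slice(patch.end)
  }
  return result
}

const original = 'Line.  \nNext line\n\n* item\n'
const markerOffset = original.indexOf('* item')
const fixed = applyPatches(original, [
  {
    start: markerOffset,
    end: markerOffset + 1,
    actual: '*',
    expected: '-',
    reason: 'Use `-` as list marker'
  }
])
console.log(JSON.stringify(fixed))
```

Only the list marker changes here; the hard break written with two trailing spaces stays as it was, because the document is never run through a stringifier.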