Closed fgarcia closed 1 year ago
Hi!
Normally I would expect most ASTs to preserve the original content before any explicit manipulation.
I don’t know of any AST tool that behaves as you describe. ASTs are by definition lossy. They’re abstract.
Syntax trees come in two flavors:
- concrete syntax trees: structures that represent every detail (such as white-space in white-space insensitive languages)
- abstract syntax trees: structures that only represent details relating to the syntactic structure of code (such as ignoring whether a double or single quote was used in languages that support both, such as JavaScript).
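To make the distinction concrete, here is a toy sketch in plain JavaScript. It is not how remark or the mdast utilities work internally; `toConcrete`, `toAbstract`, and `serialize` are hypothetical names invented for this illustration. One tree keeps the whitespace as nodes, the other drops it, and only the first can reproduce the input exactly.

```javascript
// Toy illustration only: these helpers are hypothetical and unrelated to
// the real mdast utilities.

// "Concrete" tree: keeps every character, including runs of whitespace.
function toConcrete(source) {
  return source
    .split(/(\s+)/)
    .filter(Boolean)
    .map((value) => ({
      type: /^\s+$/.test(value) ? 'whitespace' : 'word',
      value
    }))
}

// "Abstract" tree: keeps only the words; the whitespace is discarded.
function toAbstract(source) {
  return toConcrete(source).filter((node) => node.type === 'word')
}

// Serialize either tree. Abstract trees have no whitespace nodes, so a
// single space is inserted between adjacent words.
function serialize(nodes) {
  let result = ''
  let previous
  for (const node of nodes) {
    if (previous && previous.type === 'word' && node.type === 'word') {
      result += ' '
    }
    result += node.value
    previous = node
  }
  return result
}

const before = 'one\n  two'
const viaConcrete = serialize(toConcrete(before)) // identical to `before`
const viaAbstract = serialize(toAbstract(before)) // 'one two': layout is lost
```

In this sketch the abstract round trip still contains the same two words, but the line break and indentation cannot be recovered because they were never stored.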
So this is impossible. You can find more by searching the organization: https://github.com/search?q=org%3Asyntax-tree+cst&type=issues. Here’s a search that looks through our other organizations too: https://github.com/search?q=CST+user%3Awooorm+org%3Amdx-js+org%3Amicromark+org%3Aremarkjs+org%3Arehypejs+org%3Aretextjs+org%3Avfile+org%3Asyntax-tree+org%3Aunifiedjs&type=issues.
Hi! This was closed. Team: If this was fixed, please add phase/solved. Otherwise, please add one of the no/* labels.
You might also be running into an XY problem. See our support docs for more info: https://github.com/syntax-tree/.github/blob/main/support.md#asking-quality-questions. Perhaps you can share more about your actual problem: why do you need superfluous whitespace to exist?
I wanted to manipulate a Markdown file and modify keywords only in the section/header lines. That is very easy to do by exploring the AST, but when converting back I started to notice that other parts of the document were affected too. Mostly I wanted to modify Markdown and write back to Markdown, not convert to HTML.
In the past I did some small JS codemods manipulating the syntax tree and I was lucky never to get unexpected side effects, or even worse, maybe I just never noticed :worried:
You shouldn’t see side effects that actually do something: the whitespace does nothing. If you see things that do affect something, let me know.
JS codemods
Codemods typically work differently. And you can do that with our tools too. They often don’t serialize an AST but change a string. You first need to figure out where things are: our AST gives you positional info for that. Then you can pass that info, and what you want to replace, to something like: https://github.com/Rich-Harris/magic-string.
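A rough, dependency-free sketch of that approach follows. The heading positions are written by hand in the shape of the positional info the AST carries (`position.start.offset` and `position.end.offset`); in real use they would come from the parsed tree. `applyPatches` is a hypothetical helper standing in for what magic-string does more robustly, and the sample document is made up for illustration.

```javascript
// Sketch only: `applyPatches` is a hypothetical helper, not part of any
// library. It edits the original string and leaves everything else untouched.
function applyPatches(source, patches) {
  // Apply from the end of the document backwards so that earlier offsets
  // stay valid while later text changes length.
  const ordered = [...patches].sort((a, b) => b.start - a.start)
  let result = source
  for (const {start, end, value} of ordered) {
    result = result.slice(0, start) + value + result.slice(end)
  }
  return result
}

const source = '# Old title\n\none\n  two\n\n## Old section\n'

// Hand-written stand-ins for heading nodes; a real tree would supply
// `position.start.offset` and `position.end.offset` for each node.
const headings = [
  {type: 'heading', position: {start: {offset: 0}, end: {offset: 11}}},
  {type: 'heading', position: {start: {offset: 24}, end: {offset: 38}}}
]

// Build one patch per heading: change the keyword only inside that range.
const patches = headings.map((node) => {
  const start = node.position.start.offset
  const end = node.position.end.offset
  return {start, end, value: source.slice(start, end).replace('Old', 'New')}
})

const after = applyPatches(source, patches)
// The indented "  two" line is character-for-character the same as in `source`.
```

Because nothing is serialized from a tree here, text outside the patched ranges, including the indentation on the second paragraph line, is copied through unchanged.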
You may be interested in using remark-cli / remark-language-server / remark for VSCode in combination with unified-consistency.
Initial checklist
Affected packages and versions
1.3.0
Link to runnable example
No response
Steps to reproduce
Expected behavior
Normally I would expect most ASTs to preserve the original content before any explicit manipulation.
In the code above I was counting on the text before and after the conversion (Text -> AST -> Text) to be the same. However, the result above trims the indentation of the second line. I know that when Markdown is converted to HTML those spaces are ignored, but I would not expect the parser to manipulate the original content in advance.
I expected
before === after
Actual behavior
Currently
before !== after
The value in after drops the spaces that followed the line break, leaving "one\ntwo".
Affected runtime and version
node@18.15
Affected package manager and version
No response
Affected OS and version
No response
Build and bundle tools
No response