Open Yokozuna59 opened 1 year ago
This is exciting @Yokozuna59 ! Once you have a working parser, can you create a PR like https://github.com/mermaid-js/mermaid/pull/3432, so that we can see how much the size would increase with Langium?
I'd love to see the existing parsers modernised!
Once you have a working parser, can you create a PR like #3432, so that we can see how much the size would increase with Langium?
Sure! I'm currently planning to support the pie chart since it's the easiest one. But I think I should support the common syntax and configuration first: comments, directives, accessible titles and descriptions, etc.
I've got to improve the current comments and directives regexes since they aren't covering some weird cases; e.g., in #1645, if text has been wrapped inside {}
it won't be skipped and shown as text instead.
classDiagram
class Cat {
+int age %% not skipped comment
}
@sidharthv96 It doesn't look like Langium or Chevrotain are customizable enough in our use case.
I was trying to match the diagram title (and accessible title and description in the same way), so I wanted to match everything after title:
and set the value of it as title, but it doesn't seem that there is a clean way using Langium or Chevrotain from this discussion.
I think I was able to do that in chevrotain.
I think I was able to do that in chevrotain.
Although not in the same way, I was able to do it too.
I've finished accTitle
and single and multi-line accDesc
with their unit tests. But I haven't committed yet; some minor changes might be needed. Then I'm planning to support directives, which might take a lot of time because of the weird style and number of valid values.
I think accTitle
and accDescr
are done with their test cases. (Some weird test cases that are not handled have .todo
).
Can some verify I haven't forgotten anything? link
Then I'm planning to support directives, which might take a lot of time because of the weird style and number of valid values.
Hmmmm, this is kinda a hack, so sticking it langium is probably the ideal way of doing this, but if you're feeling lazy, I wonder if it's easier to do this a pre-proccess step, e.g. something like:
const diagram = `%%{init: { **insert argument here**}}%%
graph TD
A-->B`;
const directiveRegex = /%%(\{[\w\W]*\})%%/gmU; // use U-flag to make `*` lazy
const directives = diagram.matchAll(directiveRegex);
const diagramWithoutDirectives = diagram.replace(directiveRegex, "");
const parsedDiagram = yourLangiumParser.parse(diagramWithoutDirectives);
It seems like directives are more like a metadata thing that applies to all diagrams, rather than something part of each diagram, so it might make sense to split it away from the diagram parser :shrug:
Edit:
Just thought of something else, all mermaid diagram definitions support using a YAML front-matter before the diagram, e.g. like:
```mermaid
---
# some YAML here
title: "Hi"
---
pie
"Rats" : 15
The way this is currently implemented is by doing a pre-processing step on all diagrams, so doing the same thing with directives wouldn't be too weird.
> Can some verify I haven't forgotten anything? [link](https://github.com/Yokozuna59/mermaid-lint/tree/master/packages/mermaid-parser/tests/common)
From a quick look, it looks great! Great work :+1:! The `it.todo` tests are probably the most important unfortunately, otherwise it would break existing diagrams.
I agree with @aloisklink, yaml and init directives can be handled externally for now as they aren't super important for our core usecase which is formatting and parsing the actual diagrams.
Hmmmm, this is kinda a hack, so sticking it langium is probably the ideal way of doing this...
I can create a general directive parser (still not easy), but it won't be strictly typed. So it won't suggest anything if the user tries to write some directives, unlike the strictly which would suggest according to what the user has typed using LSP. For example:
...but if you're feeling lazy, I wonder if it's easier to do this a pre-proccess step, e.g. something like:
I tried tbh, but it would hard to preform with my experience with langium. It can be removed using hidden terminal rule:
hidden terminal DIRECTIVE: "%%{" -> "}%%\r?\n";
It seems like directives are more like a metadata thing that applies to all diagrams, rather than something part of each diagram, so it might make sense to split it away from the diagram parser shrug
Not really sure tbh, but I won't think about it for now and will focus on other stuff.
Just thought of something else, all mermaid diagram definitions support using a YAML front-matter before the diagram, e.g. like:
--- # some YAML here title: "Hi" --- pie "Rats" : 15
The way this is currently implemented is by doing a pre-processing step on all diagrams, so doing the same thing with directives wouldn't be too weird.
I think it won't be that hard to implement this since it's wrapped differently from comments (%%) and only appears at the top of the file, but I'm not really sure what possible values can be there since I couldn't find them in the docs.
From a quick look, it looks great! Great work +1! The
it.todo
tests are probably the most important unfortunately, otherwise it would break existing diagrams.
I was trying to create a general regex that make sure that there is no accTitle
or accDescr
before title but appears that it won't be easy. You can try it here and please inform if you could come up with one. https://regex101.com/r/rGuOBM/1
I figured out the solution for those it.todo
test cases; it appears that I was using Langium in the wrong way. However, there is a limitation on the Langium side that I couldn't solve completely. If this PR merged langium/langium#1067, then I would be able to continue working on the parser.
@Yokozuna59 will be developing this package in their repo, and we'll move it into mermaid once it's stable for production use. This will replace mermaid's current JISON parser once the team decides.
I wonder if there's any way to verify that the new parser is backwards compatible with the syntax supported by the old parser?
I wonder if there's any way to verify that the new parser is backwards compatible with the syntax supported by the old parser?
I'm currently creating unit tests for common stuff, for example, title, accTitle, accDescr, etc. And will add unit tests for each diagram. Currently, only pie chart and the above-listed common stuff have test cases. And I add more test cases for each edge case and valid syntax while developing. I could take the test cases for the current parser and add them to mine to validate that they're working fine.
What do you suggest?
Should I support multiline comments #1281 with it? I came across it while a ago. Please mention me for similar stuff if you could since there isn't a parser
label in mermaid issues. I would suggest adding one if you can! Maybe adding other label like renderer
and lsp
would be helpful for other contributors.
@Yokozuna59, let's keep your implementation as close to the current parser as possible. We can add new features and fix bugs once we make the switch. Otherwise it'll be difficult to test.
Regarding testing & verification, unit tests and visual tests will definitely be required and helpful, but they won't cover all the usecases.
Two ideas I had (which seems a bit over engineered) were,
Anything else?
If your pie implementation is ready, can you raise a PR like https://github.com/mermaid-js/mermaid/pull/3432, so we can test the size increase?
@Yokozuna59, let's keep your implementation as close to the current parser as possible. We can add new features and fix bugs once we make the switch. Otherwise it'll be difficult to test.
Okay, I'll remove the multi line comment for now.
Regarding testing & verification, unit tests and visual tests will definitely be required and helpful, but they won't cover all the usecases.
Two ideas I had (which seems a bit over engineered) were,
- Fuzzing the current parsers to find many valid edge case unit tests, and then running that suite on the new parser.
- Verifying syntax diagrams. But this requires syntax diagrams to be generated from the current jison files, which might be difficult.
Anything else?
I'm not really sure how I can help with that. Before implementing the parser in langium
, I tried looking at those jison
files, but I wasn't able to understand much or get edge cases. If you point them out I would be more than happy to verify and test them.
If your pie implementation is ready, can you raise a PR like #3432, so we can test the size increase?
The Langium PR haven't been merged yet, so the title and accessibilities would give wrong result:
- awesome title
+ title awesome title
There is an alternative way to implement it; thanks to @msujew for pointing it out, I'll implement it then create a PR.
@Yokozuna59 you can focus on langium for now. Don't worry about the verification part. I was sharing that to get input from the community.
@sidharthv96 I have updated the above issue and this is the results for the current test cases:
Most common cases are handled, should I create a PR? Or figure out how to solve those issues? Some of them appears aren't from my side.
Never mind, all of the above issues were solved in mermaid-parser@0.2.0
.
I guess Pie Chart
is now completely supported (without directives or yaml).
@sidharthv96 I would like to convert the Git Graph parser to use Langium, can I tag along this issue or should I create a new one? Perhaps we can add a checklist of the remaining parsers and make this an epic of sorts
@NicolasCwy you can start with a draft PR directly, no need to create an issue. :)
The flow what we've seen till now is usually
Stage 1 and 2 can be separate PRs, so we can get stuff in easily.
Few reference PRs
From this conversion. It would be great to reimplement the parser in langium instead of jison the following reasons:
And much more!
I'm currently working to create a separate package using it here Yokozuna59/mermaid-lint. Anyone is welcome to contribute and help me! Even thoughts would be fine.
Here is an example I've built using Langium for pie charts (toggle the icon next to the link to show the AST). playground
related issues:
4
873
2523
3261
3694
3100
2566
2208
1810
3530