Open xoofx opened 7 years ago
So I have fixed a few cases with list for better handling loose list.
I have added support for escape characters.
I have also started to add NormalizeOptions
to control a bit the markdown output.
The limit text width will require a bit more infrastructure work on the TextRendererBase
, still nothing done for this (wip)
I have added support for HtmlBlock renderer. The normalize is able to process the CommonMark specs and the output is almost identical to the original which is already a good start.
TaskLists
are marked as done but normalise actually strips the task syntax:
static class Program
{
public static MarkdownDocument Parse(string markdown) =>
Markdown.Parse(
markdown,
new MarkdownPipelineBuilder()
.UseTaskLists()
.Build());
public static string ToText(this MarkdownObject o)
{
using var stringWriter = new StringWriter();
new NormalizeRenderer(stringWriter)
.Render(o);
return stringWriter.ToString();
}
static void Main(string[] args) =>
Console.WriteLine(
Program.Parse("- [ ] example task")
.ToText());
}
prints
- example task
Ah, I see, have to use the same pipeline:
static class Program
{
public static MarkdownPipeline Pipeline =>
new MarkdownPipelineBuilder()
.UseTaskLists()
.Build();
public static MarkdownDocument Parse(string markdown) =>
Markdown.Parse(
markdown,
Program.Pipeline);
public static string ToText(this MarkdownObject o)
{
using var stringWriter = new StringWriter();
var renderer = new NormalizeRenderer(stringWriter);
Program.Pipeline.Setup(renderer);
renderer.Render(o);
return stringWriter.ToString();
}
static void Main(string[] args) =>
Console.WriteLine(
Program.Parse("- [ ] example task")
.ToText());
}
@duncanawoods so your issue is resolved?
Yep - thanks :)
My feedback is - the renderer API could be clearer e.g. taking the pipeline as an argument. The only way I found out I need to setup a pipeline is reading the internals looking to create a PR!
I do not know how @xoofx generally handles it here, maybe discuss the API change ahead of time in a separate issue. I think PRs are generally welcome, also here in the normalizer topic.
I think this issue was meant for tracking the progress of the Normalisation rendering implementation.
Issue proposals and PRs are of course welcome!
I add this case here as it looks similar to the one posted in the comments:
Code:
using var stringWriter = new StringWriter();
var renderer = new NormalizeRenderer(stringWriter);
GetPipeline().Setup(renderer);
renderer.Render(doc);
Console.WriteLine(stringWriter.ToString());
Input string:
# File number 2
Hello, world
<!-- comment 1 -->
<!--
{
"key0":[
"A"
],
"key1":[
"x", "y", "z", "t"
],
"Key3": "key 3"
}
-->
Output string:
# File number 2
Hello, world
[]:
The error is of course the []:
at the end of the output string.
My goal is:
Since I just started working with this library, I may be doing something wrong ...
TIA
The error is of course the
[]:
at the end of the output string.
Yeah... the problem is that the NormalizeRenderer
has never been finished. It's probably several hours of work to fix the remaining errors just for core CommonMark (without extensions). Don't have time nor a personal interest for this feature at the moment, so unless someone is willing to do that work, the NormalizeRenderer
is not really usable.
Hi @xoofx nice to see you here as well instead of some random city around the globe! :)
I am not really interested in NormalizeRenderer
specifically, but just getting back the text for the markdown that I have modified by removing a block.
How can I get back the text of the modified document?
Thank you
Hi @xoofx nice to see you here as well instead of some random city around the globe! :)
Oops, sorry, should have started by saying hello Raffaele! 😉
How can I get back the text of the modified document?
That's what the NormalizeRenderer
was supposed to do (not only normalize but "roundtrip" even if this is a much longer way to support full rountrip)
:)
That's what the NormalizeRenderer was supposed to do ...
argh, this was unexpected. I gave the roundtrip for granted when I started looking for a markdown library :-/
Do you have any suggestion to get rid of the problem for "just" going back to the text? Looping on blocks and their descendants recursively? TIA
argh, this was unexpected. I gave the roundtrip for granted when I started looking for a markdown library :-/
Do you have any suggestion to get rid of the problem for "just" going back to the text? Looping on blocks and their descendants recursively?
That's exactly what NormalizeRenderer
is supposed to do: go through the nodes recursively and print the equivalent Markdown, there is no magic in it.
For the specific error you have, it's probably a small fix, but I can't have a look at it.
@xoofx sorry that I left this project mid-status done. Work life increased a lot and my private project use case vanished.
/cc @xoofx
I just found the bug is on the Parser
and not on the Normalizer.
The parser incorrectly add a LinkReferenceDefinitionGroup at the end of the document
Later on, the Normalizer renders (empty) the []:
tags which stand for the link and do not derive from the html comments.
Given the instability of the parser, unfortunately I have no choice but searching for a new library :(
I just found the bug is on the
Parser
and not on the Normalizer.
I don't think it's a bug in the parser. I believe it is because the LinkReferenceDefinitionGroupRenderer
was incorrectly added in the first place (while there is no renderer like this for the Html renderer).
@xoofx I don't get why the renderer should be involved at that time. The screenshot is what I see on the document right after parsing the markdown. I didn't render anything at that moment.
The screenshot is what I see on the document right after parsing the markdown.
I know and it is "correct". The LinkReferenceDefinitionGroup
is added automatically by the system. Don't know why there is an item in it (that part is maybe a bug here), but LinkReferenceDefinitionGroup
is used for book-keeping links (backward and forward) and made accessible through the AST. In the Html renderer, this element is not rendered and it should not be rendered also in normalize. That's more the actual bug.
As a workaround, without modifying the renderer, you could remove LinkReferenceDefinitionGroup
after the document is parsed..
Thanks anyway @xoofx I was already trying another solution.
I have to test more cases, but apparently the following if
that I added in LinkReferenceDefinitionRenderer
resolves the problem:
protected override void Write(MdRenderer renderer, LinkReferenceDefinition linkDef)
{
if (string.IsNullOrEmpty(linkDef.Url))
{
return;
}
// ...
Note: MdRenderer
is the class I used to replace the renderer.
Note: MdRenderer is the class I used to replace the renderer.
It won't work when you will have links in your document, because LinkReferenceDefinitionGroup
is still rendered and the links in it, while it should not.
You can remove the LinkReferenceDefinitionGroupRenderer
from the normalize renderer directly which is probably a more correct workaround: normalizeRenderer.ObjectRenderers.TryRemove<LinkReferenceDefinitionGroupRenderer>();
You can remove the LinkReferenceDefinitionGroupRenderer
You mean LinkReferenceDefinitionRenderer
, right? ... I did it and it works. Not sure if this change may have side-effects.
BTW LinkReferenceDefinitionGroupRenderer
didn't work.
You mean LinkReferenceDefinitionRenderer, right? ... I did it and it works. Not sure if this change may have side-effects. BTW LinkReferenceDefinitionGroupRenderer didn't work.
Oh right, actually both LinkReferenceDefinitionRenderer
and LinkReferenceDefinitionGroupRenderer
should be removed, otherwise RendererBase is still visiting the children of LinkReferenceDefinitionGroup
(which is a ContainerBlock
)
Forgot that LinkReferenceDefinition
is also not used for rendering, I added them by mistake in the NormalizeRenderer back in the days.
I added them by mistake in the NormalizeRenderer back in the days.
Never mind, it seems that it was added on purpose for cases like this:
This is a [link][MyLink]
[MyLink]: http://company.com
In HTML they are not rendered, but in Markdown, they should be rendered.
So the problem is likely the empty link that is put in the LinkReferenceDefinitionGroup
I haven't touched this code for the past 3 years, so yeah... it's old, and my memory cannot keep up at some point 😅
So, at the end, the if
statement I added as for this comment should solve the problem.
So, at the end, the
if
statement I added as for this comment should solve the problem.
Yeah, it's enough to workaround.
Followup of #17 and #32 and #154
This is an issue to track the remaining work to handle full markdown document round-trip
GFM
markdown, then the rest)GFM
)GFM
)cc: @tthiery