jelovirt / org.lwdita

LwDITA parser for DITA-OT
http://lwdita.org/
Apache License 2.0
Re-consider the use of square brackets to signify keyrefs #225

Open raducoravu opened 3 months ago

raducoravu commented 3 months ago

One of our clients used square brackets in their Markdown files as plain text content, like:

* abc:  **[float, float16]**
* def:  test,shape = [m, n]

The problem is that when the Markdown file is interpreted as DITA XML, the parser tries to convert the content in the square brackets to key references, which the person writing the Markdown did not intend. So the Markdown can no longer be used as it was written: someone needs to manually escape the square brackets with \ when the file is used with a DITA map.
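As an illustration of the manual escaping described above, here is a minimal Python sketch that backslash-escapes bracketed text that does not look like link syntax. The function name `escape_literal_brackets` and the `keep` parameter are hypothetical and not part of org.lwdita or DITA-OT. The regular expression is a rough approximation: it does not skip code spans or fenced code blocks.

```python
import re

# A bracket pair that is not already escaped, is not the second half of a
# full reference link ([text][ref]), and is not followed by "(", "[" or ":"
# (inline link, full reference link, link reference definition).
BRACKETED = re.compile(r"(?<![\\\]])\[([^\[\]\\]+)\](?![(\[:])")


def escape_literal_brackets(line, keep=frozenset()):
    """Backslash-escape bracketed text that should stay literal.

    `keep` lists labels that are real keys or link labels and must be
    left untouched (an assumption: the caller knows which ones those are).
    """
    def replace(match):
        text = match.group(1)
        if text in keep:
            return match.group(0)
        return rf"\[{text}\]"

    return BRACKETED.sub(replace, line)


print(escape_literal_brackets("* def:  test,shape = [m, n]"))
```

Running it twice on the same line leaves the result unchanged, because already-escaped brackets are not matched.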

raducoravu commented 3 months ago

Another person experiencing a similar problem here:

https://github.com/dita-ot/dita-ot/issues/4304#issuecomment-2130695590

My maps include Markdown source that used [] instead of backticks for what are essentially code phrases.

raducoravu commented 3 months ago

How about if the content inside [...] is not a valid key name or cannot be mapped to an existing key name, the brackets and their content could be preserved exactly as they are instead of being converted to:

 <xref keyref="float, float16"/>

?

jelovirt commented 3 months ago

Keyrefs use the shortcut reference link syntax, which is part of CommonMark. The only reason [float, float16] is not interpreted as a link by the Markdown preview they use is that there is no link reference definition for that link label in the page.
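To make the point about link reference definitions concrete, here is a simplified Python sketch of how a plain Markdown renderer treats a shortcut reference: the bracketed label becomes a link only when a matching definition exists in the same document, and is otherwise output as is. This is a regex approximation for illustration, not a CommonMark-compliant parser, and `resolve_shortcuts` is a hypothetical name.

```python
import re

# A link reference definition such as "[here]: https://www.google.com".
DEFINITION = re.compile(r"^ {0,3}\[([^\]]+)\]:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
# An unescaped "[label]" not followed by "(", "[" or ":".
SHORTCUT = re.compile(r"(?<!\\)\[([^\[\]]+)\](?![\[(:])")


def normalize(label):
    # Labels are matched case-insensitively with whitespace collapsed.
    return " ".join(label.split()).casefold()


def resolve_shortcuts(markdown):
    definitions = {
        normalize(m.group(1)): m.group(2) for m in DEFINITION.finditer(markdown)
    }
    body = DEFINITION.sub("", markdown)

    def replace(match):
        target = definitions.get(normalize(match.group(1)))
        if target is None:
            # No definition in the document: fall back to the literal text.
            return match.group(0)
        return f'<a href="{target}">{match.group(1)}</a>'

    return SHORTCUT.sub(replace, body).strip()


print(resolve_shortcuts("shape = [m, n]"))
print(resolve_shortcuts("Click [here] for more info.\n\n[here]: https://www.google.com"))
```

In a DITA map context the key definitions live outside the Markdown file, so this per-document lookup is not available to the parser.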

jelovirt commented 3 months ago

How about if the content inside [...] is not a valid key name or cannot be mapped to an existing key name…

A valid key check is something that could be done, but checking for an existing key name is not possible because these key references are in topics and the parser doesn't have access to all key definitions at parsing time.

raducoravu commented 3 months ago

I understand, maybe we could pass some kind of interface to the parser through which it could ask some basic key-related questions? Also if the value inside the "[]" brackets is not a valid key name, we could just keep that value as it is?

jelovirt commented 3 months ago

The problem I see is that if the user has e.g.

* a = [m, n]
* b = [m]

Here the first item is not a valid key name, so it would come out as [m, n], but the second is a valid key name and would be converted to a keyref m. Markdown itself is not consistent, because it tries to recover from every error: it creates a link if the link definition exists and otherwise falls back to outputting the text as is.
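The inconsistency can be sketched in Python as follows. The validity test assumes the DITA rule that key names must not contain "{", "}", "[", "]", "/", "#", "?" or whitespace; it does not check the further requirement that the characters be legal in a URI. The names `is_valid_key_name` and `convert` are hypothetical and do not reflect the actual org.lwdita implementation.

```python
import re

# Characters assumed to be prohibited in DITA key names, plus whitespace.
PROHIBITED = re.compile(r"[{}\[\]/#?\s]")
# An unescaped "[label]" with no nested brackets.
SHORTCUT = re.compile(r"(?<!\\)\[([^\[\]]+)\]")


def is_valid_key_name(text):
    return bool(text) and PROHIBITED.search(text) is None


def convert(line):
    """Turn "[label]" into a keyref only when the label is a valid key name."""
    def replace(match):
        label = match.group(1)
        if is_valid_key_name(label):
            return f'<xref keyref="{label}"/>'
        # Invalid key name: keep the brackets and their content as written.
        return match.group(0)

    return SHORTCUT.sub(replace, line)


print(convert("* a = [m, n]"))
print(convert("* b = [m]"))
```

Two list items that look alike to the author end up with different output, which is the confusion described above.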

Having a different fallback in Markdown and Markdown DITA would make it even more confusing.

The best option would probably be to add a feature to disable shortcut reference links as key references. You can either have

The question is, are these users willing and/or able to configure the parser so that they can disable this feature?

raducoravu commented 3 months ago

Probably most users will not be able to configure the parser unless this setting becomes some sort of DITA OT parameter. I would see the most correct Markdown-aligned approach as trying to see if that referenced key name actually exists.

jelovirt commented 3 months ago

For DITA-OT, verifying whether the key exists will work only for preprocess2 for topics, but for maps even that will not work without resolving the shortcut reference after key spaces are collected. The other problem is that the user may explicitly want to have a key without a definition, which is a valid use case in DITA content.

grettke commented 3 months ago

Probably most users will not be able to configure the parser unless this setting becomes some sort of DITA OT parameter. I would see the most correct Markdown-aligned approach as trying to see if that referenced key name actually exists.

Is this the kind of situation where at the end of the build you might notify the user with a "NOTICE: You seem to be using feature-xxx. Because this works differently in Markdown and DITA-OT, if you get un-preferred results then..." That could point to details of "Here is how to format it to work as you expect."

The reason I ask is that another way to look at the build, which I feel like you are alluding to, is that the user is either doing a "MARKDOWN ONLY BASED BUILD" or "DITA-OT BUILD THAT HAPPENS TO USE MARKDOWN-DITA." In the former, a notice would be pleasant-enough. In the latter, the user would have a good chance at fixing it.

Disclaimer: FWIW I only study DITA and am not using it so this is all theory.

michael-nok commented 2 months ago

Unescaped square brackets in Markdown indicate a reference link, regardless of whether DITA-OT is involved.

https://www.markdownguide.org/basic-syntax/#reference-style-links

Click [here] for more info.

[here]: https://www.google.com

Output:

Click here for more info.

Therefore, the correct Markdown format requires the user to escape the square brackets as such:

* abc:  **\[float, float16\]**
* def:  test,shape = \[m, n\]

Output: