Python-Markdown / markdown

A Python implementation of John Gruber’s Markdown with Extension support.
https://python-markdown.github.io/
BSD 3-Clause "New" or "Revised" License

Ignore LaTeX blocks for abbr #379

Closed tijptjik closed 9 years ago

tijptjik commented 9 years ago

The abbr extension currently also matches strings within $ ... $ and $$ ... $$. This is not desirable because it prevents MathJax from rendering the LaTeX.

For example, having the following math block

$$\begin{align}
    \text{AIC}=-2 log L(\hat{\beta}) + 2p
\end{align}$$

with the following abbr expansion

*[AIC]: Akaike Information Criterion

will break.

If this can't be the default behaviour, having it configurable would also be helpful.
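The conflict can be modeled in a few lines. The sketch below is a simplified stand-in for the abbr extension, assuming abbreviations are matched with a plain word-boundary search over the text. `expand_abbr` is a hypothetical helper, not Python-Markdown's code.

```python
import re

def expand_abbr(text, abbr, title):
    """Wrap every whole-word occurrence of `abbr` in an <abbr> tag."""
    pattern = re.compile(r"\b" + re.escape(abbr) + r"\b")
    # A function replacement keeps backslashes in the output literal.
    return pattern.sub(
        lambda m: f'<abbr title="{title}">{m.group(0)}</abbr>', text
    )

source = r"$$\text{AIC}=-2 log L(\hat{\beta}) + 2p$$"
print(expand_abbr(source, "AIC", "Akaike Information Criterion"))
```

The tag lands inside the math delimiters, so MathJax later sees HTML markup where it expects LaTeX.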

waylan commented 9 years ago

Is there a LaTeX or MathJax extension you are using? If so, then that extension should be running before anything that may create a conflict. In other words, this is a bug in that extension. We can't provide exceptions for every third party extension out there. Of course, some adjustments to make it generally easier for third party extensions are welcome.

However, if you are just passing the LaTeX through, then you may need to use markdown's escaping (backslash) to ensure that it is not parsed as markdown. Although, I'm not sure how that would work with abbreviations. I'll have to think about that.

tijptjik commented 9 years ago

Thanks for your response.

I am using MathJax through the sublimetext-markdown-preview package, which loads in MathJax as a javascript library after the markdown has been rendered (in my case with the abbr extension).

So, instead of explicitly ignoring the default MathJax blocks ($...$ and $$...$$), would it be possible to have a configurable option to pass to the abbr extension which allows us to specify which blocks to ignore?

I can imagine that this is an edge-case request and that you wouldn't want to complicate the codebase just to support it. Thanks for your consideration, though.

waylan commented 9 years ago

Special casing MathJax blocks within the Abbr extension is a nonstarter. As stated previously, the out-of-the-box solution is for you to backslash escape your MathJax. Of course that can become rather tedious. Therefore a third party extension has been created that finds all MathJax blocks and tells the entire (not just abbr) Markdown parser to ignore them. You can find the extension here:

https://github.com/mayoff/python-markdown-mathjax

I cannot recommend this extension at this time. See my comment below.

You may find a few more third party extensions which work similarly listed on the wiki:

https://github.com/waylan/Python-Markdown/wiki/Third-Party-Extensions

The existing parser has no knowledge of mathjax and/or LaTeX, and I have no intention of changing that. However, our extension API makes it possible for third party extensions to add such support (as they already have). That being the case, I see no need to make any changes to Python-Markdown and am closing this issue. If you need any help with the third party extensions out there, please contact the authors of those extensions directly.

mitya57 commented 9 years ago

> I am using MathJax through the sublimetext-markdown-preview package, which loads in MathJax as a javascript library after the markdown has been rendered (in my case with the abbr extension).

As @waylan stated, this is wrong. It should parse $...$ and $$...$$ blocks itself, and make sure no other extensions / inline patterns can touch them. (Based on your description, I believe that you also have problems if you use _underlines_ in your math formulas).

I consider my own implementation of MathJax support very good. That code is under 3-clause BSD license, feel free to reuse it.
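The general idea of parsing the math first so that nothing else can touch it can be sketched as below. This is a minimal stand-alone illustration, assuming math is delimited only by $...$ and $$...$$. The function names and the placeholder format are hypothetical and are not taken from any of the extensions mentioned in this thread.

```python
import re

# $$...$$ (may span lines) is tried first, then single-line $...$.
MATH_RE = re.compile(r"\$\$.+?\$\$|\$[^$\n]+\$", re.DOTALL)
PLACEHOLDER_RE = re.compile("\x02math:(\\d+)\x03")

def protect_math(text):
    """Swap each math span for an opaque placeholder; return text and stash."""
    stash = []

    def _store(match):
        stash.append(match.group(0))
        return f"\x02math:{len(stash) - 1}\x03"

    return MATH_RE.sub(_store, text), stash

def restore_math(text, stash):
    """Put the original math spans back in place of their placeholders."""
    return PLACEHOLDER_RE.sub(lambda m: stash[int(m.group(1))], text)

text = r"$$\text{AIC}=2p$$ The AIC is lower."
protected, stash = protect_math(text)
# Stand-in for any later processing step, here a naive abbreviation pass.
processed = re.sub(r"\bAIC\b", "<abbr>AIC</abbr>", protected)
print(restore_math(processed, stash))
```

Only the AIC in the running text is wrapped. The one inside the math block is hidden behind a placeholder while the later step runs, and comes back unchanged afterwards.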

waylan commented 9 years ago

Thanks @mitya57 for linking to your MathJax implementation. I knew I had seen one before but couldn't remember where. Just yesterday I was looking at the "mayoff/python-markdown-mathjax" project and it has all sorts of issues. I filed two bug reports and can't recommend it until they are fixed along with two existing reports. I wish I had never linked to it above. Have you considered breaking yours out into a standalone extension? I've been seeing more and more requests for a good MathJax extension lately. I don't have any use for it personally, so I'm not interested in maintaining one myself, but a good extension would be a welcome addition to the community.

mitya57 commented 9 years ago

Looks like @mayoff deleted most of the issues from his repository, including both of the ones you filed :-/

I have now pushed my code to https://github.com/mitya57/python-markdown-math and added that to the wiki. (I used just math as a name, as in future it can support other backends, such as KaTeX).

waylan commented 9 years ago

Cool. Although, third party extensions should never be copied to .../markdown/extensions/. They should be installed as any other python package at the root of your python path (for example /usr/lib/python3/dist-packages/). Preferably, a simple setup.py file would be provided. In fact this was one of the issues I reported to @mayoff.
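For illustration, a simple setup.py along these lines might look like the sketch below. It assumes setuptools is available; the project name and the module name `mathext` are hypothetical placeholders, not the names used by any extension in this thread.

```python
# setup.py: a minimal sketch; the project and module names are hypothetical.
from setuptools import setup

setup(
    name="markdown-mathext",
    version="0.1",
    py_modules=["mathext"],         # installs mathext.py as a top-level module
    install_requires=["Markdown"],  # the PyPI name of Python-Markdown
)
```

Installed this way, the module is importable from the Python path like any other package, and nothing is copied into .../markdown/extensions/.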

mitya57 commented 9 years ago

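@waylan Sorry for the delay in responding. I don't quite understand what you mean:

  • Installing extensions as mdx_foo in root namespace is deprecated.
  • Installing a file named markdown.extensions.foo.py in root namespace is impossible.

What I currently suggest in my README is copying the file to the directory where Python-Markdown was installed.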

waylan commented 9 years ago

> @waylan Sorry for the delay in responding. I don't quite understand what you mean:
>
>   • Installing extensions as mdx_foo in root namespace is deprecated.

Years ago, before I was involved in the project there was no directory of extensions. At that time, the only way to include an extension was to name it mdx_somename.py and then from Python pass into Markdown the string "somename". Behind the scenes, Markdown would append the "mdx_" prefix and import the module mdx_somename which had to be installed on the root of the PYTHONPATH.

After other, better ways were added to the API, that old way was discouraged but still supported. As of the next release (v2.6) it is deprecated, and with the release after that (v2.7) it will raise an error. At that time, users will need to change the string to "mdx_somename" to avoid the error, or the extension author will need to rename the extension so that it no longer uses the "mdx_" prefix. I expect that many extension authors will take no action (meaning all their users will have to), but they should at least make sure their users are aware of the pending change. To avoid confusion, I discourage the use of the "mdx_" prefix altogether for new extensions.
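The legacy lookup described above can be sketched roughly as follows. This only illustrates the naming rule (prefix the short name, then import from the root of the path); it is not Python-Markdown's actual loader, and both function names are hypothetical.

```python
import importlib

LEGACY_PREFIX = "mdx_"

def legacy_module_name(ext_name):
    """Map a short name such as "somename" to its module name."""
    return LEGACY_PREFIX + ext_name

def load_legacy_extension(ext_name):
    """Import the prefixed module, which must sit at the root of the path."""
    return importlib.import_module(legacy_module_name(ext_name))

print(legacy_module_name("somename"))  # mdx_somename
```

Once the prefix is no longer added automatically, the caller has to pass the full module name for the same import to succeed.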

>   • Installing a file named markdown.extensions.foo.py in root namespace is impossible.

You are correct. However, installing a file named foo.py (or yourmodule\foo.py) in the root namespace is quite simple and what I expect extensions to do.

> What I currently suggest in my README is copying the file to the directory where Python-Markdown was installed.

Never ever do that. How does the user do this if Markdown was installed as an egg? What happens when Markdown is updated? Will the user need to install your extension again? This is just generally bad Python practice. No module should install itself within the namespace of a separate module. The markdown.extensions namespace is reserved for extensions which ship with Python-Markdown only. If in the future I make a change which breaks third party extensions copied to markdown.extensions, I will offer no provisions to ease users through the change. And I will never recommend to others an extension which offers this as a way to install it.

Some recent additions to the tutorial on the wiki better explain how I expect extensions to be installed. Perhaps that will be helpful.

mitya57 commented 9 years ago

@waylan OK, I figured it out and added a setup.py which installs a top-level mdx_math Python module. Users of new Python-Markdown versions should now add mdx_math to their list of extensions; users of older versions can just add math.

Of course, I could drop the mdx_ prefix altogether, but I can't come up with a name that is not too generic.