justinvh / Markdown-LaTeX

This Markdown extension adds inline LaTeX support without the need for external images.

best way to escape dollar sign? #7

Open 404pnf opened 12 years ago

404pnf commented 12 years ago

I am typesetting articles on pricing and economics. They contain dollar signs in normal text. If I don't escape them, Python complains with

  UnicodeEncodeError: 'ascii' codec can't encode character u'\ufb00'

or something like that, with different u'\' characters. It's really misleading! I thought there were some characters that Python couldn't handle, or that the file was encoded with a BOM.

Meanwhile, the LaTeX tmp.log stops after \usepackage and then \end{article}, with nothing in between.

Only by chance did I get to see a real LaTeX tmp log, containing:

  ! Missing $ inserted.
 <inserted text> 
              $

Then I knew it was something inside an equation, or a missing $. Of course, there are dollar signs all over the article.

Here is a sample:

  ## DEFINITION

  $ \displaystyle Value_a - Price_a \geq Value_b - Price_b $

  Rearranging gives:

  $ \displaystyle Price_a \leq (Value_a - Value_b) + Price_a Price_b \leq Differentiation Value_b + Price_b $

  - This suggests that the EVC of a server with the software is  $6,000 + $6,800 = $12,800. 

  - item 2 is  $6,000 + $6,800 = $12,800. 

Currently my way of dealing with this is to manually escape each dollar sign, because wrapping the text inline as % $6,000 + $6,800 = $12,800. % won't work.

My question is: what's the best way to deal with it?
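As a rough illustration of automating that manual escaping, here is a minimal sketch. It assumes (my assumption, not something the plugin does) that a currency amount is an unescaped $ immediately followed by a digit, while math in the sample above always starts with "$ " followed by a space. The names CURRENCY and escape_currency are hypothetical.

```python
import re

# Heuristic assumption: "$" directly followed by a digit is a currency sign.
# The lookbehind skips dollar signs that are already escaped.
CURRENCY = re.compile(r'(?<!\\)\$(?=\d)')

def escape_currency(text):
    """Prefix currency-looking dollar signs with a backslash."""
    # In the replacement template, "\\" yields one literal backslash.
    return CURRENCY.sub(r'\\$', text)

line = 'the software is $6,000 + $6,800 = $12,800.'
escaped = escape_currency(line)
```

This would wrongly escape math that starts with a digit, such as $2x$, so it is only usable when the document follows the "space after the opening $" habit.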

Thanks in advance!

404pnf commented 12 years ago

Escaping the dollar sign with \$ keeps it from being processed by LaTeX, but the HTML output looks strange:

  <li>This suggests that the EVC of a server with the software is \$6,000 + \$6,800 = \$12,800. </li>

justinvh commented 12 years ago

You're welcome to change the script's RE to be something like \[ \] or \( \), etc.

404pnf commented 12 years ago

Using \[ \] is a possible solution, but it brings other difficulties. Consider the following Markdown:

 \[fd\]

  \(fasdf\)

Rendering it with markdown testmarkdown.md, we get

[fd]

(fasdf)

Markdown strips the backslash. It's the correct behavior. Since any markdown extension runs after markdown itself, we need some way to ask markdown engine not to touch \[ \].

Other Markdown-with-LaTeX solutions do use \[ \] and ask the writer to put math in a code block or inline code to get around the above-mentioned trouble.

E.g.

    inline math ` \[ some math \]`

             \[use indent to 
               adfa \]

This is not good semantics. I also tried those approaches, and they are not as natural as your current solution.

I propose a minor code change. Consider these facts:

  1. markdown runs first, markdown-latex runs after markdown
  2. markdown won't strip \ from \$ and \%

Ask markdown-latex to substitute any \$ and \% outside math mode with $ and %. This must happen only outside math mode, because in math mode one needs to escape $ and % to get the literal characters:

   $ E_d = {\text{\% change in quantity demanded} \over \text{\% change in price} } = $

How about asking writers upfront to escape % and $ in their documents? This adds some effort, but it is worth it, because 1) it is stated beforehand, rather than authors going through what I went through; 2) people are used to escaping special characters, and there are only two reserved characters here; 3) a unified rule makes it easier for people to cooperate and exchange documents.
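To make the proposal concrete, here is a minimal sketch of the substitution step. It is not the plugin's code: MATH_SPAN, ESCAPED and unescape_outside_math are hypothetical names, and I am assuming that math spans are delimited by unescaped $ ... $ and that the step runs late enough that the unescaped characters are not re-read as delimiters.

```python
import re

# Assumption: a math span runs from one unescaped "$" to the next unescaped "$".
MATH_SPAN = re.compile(r'(?<!\\)\$.+?(?<!\\)\$', re.DOTALL)
# A backslash followed by one of the two reserved characters.
ESCAPED = re.compile(r'\\([$%])')

def unescape_outside_math(text):
    """Turn \\$ and \\% into $ and %, leaving math spans untouched."""
    out = []
    pos = 0
    for m in MATH_SPAN.finditer(text):
        # Plain text before this math span: drop the escaping backslash.
        out.append(ESCAPED.sub(r'\1', text[pos:m.start()]))
        # The math span itself is copied verbatim, escapes included.
        out.append(m.group(0))
        pos = m.end()
    out.append(ESCAPED.sub(r'\1', text[pos:]))
    return ''.join(out)

sample = r'Costs \$6,000, up 5\%. $ \text{\% change} = x $ and \$1.'
result = unescape_outside_math(sample)
```

In the real extension the math has presumably already been replaced by images at the postprocess stage, so the "skip math spans" part might not be needed there; it is kept here so the sketch stands alone.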

justinvh commented 12 years ago

Markdown strips the backslash. It's the correct behavior. Since any markdown extension runs after markdown itself, we need some way to ask markdown engine not to touch \[ \].

This isn't true. The plugin is both a preprocessor and a postprocessor. I changed the regular expression to search for \[ \] and it worked fine. I output the preprocess result and the postprocess result:

Regular expression for a small TeX mode:

TEX_MODE = re.compile(r'(?=(?<!\\)\\\[).(.+?)(?<!\\)\\\]', re.MULTILINE | re.DOTALL) 

Running a small test:

⇝ markdown_py -x latex tests/simple.markdown
This is a simple test: \[y = mx + b\]. This should all be inlined.
<style>img.latex-inline { vertical-align: middle; }</style>
<p>This is a simple test: <img class='latex-inline math-false' alt='ymxb' id='ymxb'     src='data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFIAAAAOBAMAAABOTlYkAAAAMFBMVEX///8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAv3aB7AAAAD3RSTlMAVGaZ3SLvMnariUTNELswgTKkAAAA6UlEQVQoz2NgIAAawOQ+BwaCYAKY5JhArEoeAWJV8l9gYGBUZmBBltGSlpigZMjAoDhdqwGhcpuiKUNgGEMZkDnvHRAYMDCwF6UyODPsYWBiWMOMpNKGQZMhoJNhMZKRrJwLGL4ydDKwMrwA8jjfvct79xjIcGRoYWDwZjiIbDtbAbcDwyMg4zuyO/8yzGTg/QCkELYzxDNwGfD+DWBgT+C9AFfJmsDwG0gwoYSqHgOLANODhj3MB5gQZvI6cAFNWSGzAFmlITAoWA0v7FRfpItku4YG2GUB+MLTAM66X7CKgTjAJSFAnEIASh86gTnFaqsAAAAASUVORK5CYII='>. This should all be inlined.</p>%
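The delimiter match can also be checked in isolation with just the re module. In the sketch below, TEX_MODE is the expression as quoted above; TEX_MODE_BODY is a hypothetical variant (not from the plugin) that consumes \[ directly, so the capture group holds only the formula body. With the quoted expression, the "." after the lookahead consumes the backslash, so group 1 starts with the opening bracket.

```python
import re

# Expression as quoted above: the lookahead finds "\[", then "." consumes
# only the backslash, so the capture group begins at "[".
TEX_MODE = re.compile(r'(?=(?<!\\)\\\[).(.+?)(?<!\\)\\\]',
                      re.MULTILINE | re.DOTALL)

# Hypothetical variant: match the whole "\[" opener before the group.
TEX_MODE_BODY = re.compile(r'(?<!\\)\\\[(.+?)(?<!\\)\\\]', re.DOTALL)

sample = r'This is a simple test: \[y = mx + b\]. This should all be inlined.'

quoted_capture = TEX_MODE.search(sample).group(1)     # '[y = mx + b'
body_capture = TEX_MODE_BODY.search(sample).group(1)  # 'y = mx + b'
```

Whether the leading bracket matters depends on how the plugin uses the group afterwards, which this sketch does not cover.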

justinvh commented 12 years ago

Anyhow, the best solution, as you stated, is to just escape the % and $ if you use them in a document.

404pnf commented 12 years ago

Would you implement this in your code?