DRMF / texvcjs

A LaTeX validator/translator for TeX strings embedded in wikitext

Add parsing for double backslashes #47

Open notjagan opened 8 years ago

notjagan commented 8 years ago

As of right now, the pegJS code does not handle double backslashes (\\) within LaTeX, which should act similarly to carriage returns.

physikerwelt commented 8 years ago

@notjagan thank you for using github issues to record this problem

HowardCohl commented 8 years ago

Although this is a nice comment, I don't think this moves us any closer to a solution.

Apparently @ClaudeZou doesn't know how to fix this. Also, @notjagan doesn't know how to fix this.

@physikerwelt Do you have any ideas how to fix this?

physikerwelt commented 8 years ago

"I don't think this moves us any closer to a solution"

That's the next step. First, we should be more formal and create a test case. Please provide a test tuple with input and expected output.

notjagan commented 8 years ago

Example input: \frac12\\2

Expected output: \frac{1}{2}\\2

Of course, at this point the pegJS code would also consume whitespace such as the newline in the expression, but for now the issue is that it doesn't accept the double backslash. I will also create a test case for this in the all.js test cases once I add more test cases for parseSty.js.
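
A minimal sketch of how that tuple could be recorded and checked. The `check` argument stands in for whatever validator is under test; its `{ status, output }` return shape, the '+' success status, and the `runCases` helper are assumptions for illustration, not the actual layout of all.js.

```javascript
// Hypothetical test table: each entry pairs an input with its expected output.
const cases = [
  { input: '\\frac12\\\\2', output: '\\frac{1}{2}\\\\2' },
];

// Run every case through a validator function and collect the failures.
// `check` is assumed to return { status, output }, with status '+' on success.
function runCases(check, cases) {
  const failures = [];
  for (const c of cases) {
    const result = check(c.input);
    if (result.status !== '+' || result.output !== c.output) {
      failures.push({ input: c.input, expected: c.output, got: result });
    }
  }
  return failures;
}
```

An empty array from `runCases` would mean every tuple round-trips as expected; any entry in it carries the input, the expected output, and what the validator actually returned.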

physikerwelt commented 8 years ago

@notjagan Can you confirm that the current result is S?

physikerwelt commented 8 years ago

Note that you can parse, for example:

  \begin{pmatrix} 
    a_{11} & a_{12} & a_{13} \\ 
    a_{21} & a_{22} & a_{23} 
  \end{pmatrix}

physikerwelt commented 8 years ago

@notjagan I have to admit that I do not really understand the example. See the rendering of your example in standard LaTeX.

HowardCohl commented 8 years ago

Here's another example

\begin{cases}
\dfrac{\gamma^2}{4m^2-1}, &  \mbox{$n-m$ even},
\\
- \dfrac{\gamma^2}{(2m-1)(2m-3)}, & \mbox{$n-m$ odd}.
\end{cases}

notjagan commented 8 years ago

@physikerwelt Running \frac12\\2 returns S, whereas \frac12 2 returns the expected output. Oddly enough, however, double backslashes do seem to work within pmatrices, so I'm not quite sure what's happening.

physikerwelt commented 8 years ago

@notjagan I think texvc works as expected. The grammar allows \\ where it makes sense: https://github.com/wikimedia/texvcjs/blob/master/lib/parser.pegjs#L300

HowardCohl commented 8 years ago

We have to fix cases in texvcjs.

Does it support eqnarray?

notjagan commented 8 years ago

However, there are examples where the grammar does not allow \\ where it seems it should. I have recently been working in texer with KLSadd.tex, which contains usages of \\ that are not accepted by the grammar. For example, the math within

\begin{multline*}
F_4\left(a,b;b,b;\frac{-x}{(1-x)(1-y)},\frac{-y}{(1-x)(1-y)}\right)\\
=\left(\frac{(1-x)(1-y)}{1+xy}\right)^a\,
\hyp21{\thalf a,\thalf(a+1)}b{\frac{4xy}{(1+xy)^2}}.
\end{multline*}

is not accepted because of the double backslash.

HowardCohl commented 8 years ago

@physikerwelt In LaTeX, there are several ways to enter math mode, and depending on how you entered math mode, you get a different rendered output.

For example, compare \begin{multline}...\end{multline} with \begin{equation}...\end{equation}. In a {multline} environment \\ does something, but in an {equation} environment it does nothing. So in this case the \\ does make sense, and texvcjs should not fail.
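
To make the distinction concrete, here is a rough sketch that reports, for each \\ in an input, which environment encloses it and whether the break would have an effect there. The `findLineBreaks` helper and the `LINE_BREAK_ENVS` list are hypothetical and only cover environments mentioned in this thread; none of this is taken from the texvcjs grammar.

```javascript
// Environments in which \\ is assumed to start a new row or line.
// This list is illustrative, not taken from the texvcjs grammar.
const LINE_BREAK_ENVS = new Set(['multline', 'multline*', 'align', 'cases', 'pmatrix']);

// Walk the input and report, for each \\, the innermost enclosing environment
// (or null at top level) and whether the break has an effect there.
function findLineBreaks(tex) {
  const token = /\\begin\{([^}]*)\}|\\end\{([^}]*)\}|\\\\/g;
  const stack = [];
  const found = [];
  let m;
  while ((m = token.exec(tex)) !== null) {
    if (m[1] !== undefined) {
      stack.push(m[1]);           // entering an environment
    } else if (m[2] !== undefined) {
      stack.pop();                // leaving the innermost environment
    } else {
      const env = stack.length ? stack[stack.length - 1] : null;
      found.push({
        index: m.index,
        env,
        effective: env !== null && LINE_BREAK_ENVS.has(env),
      });
    }
  }
  return found;
}
```

Under these assumptions a \\ inside {multline} is reported as effective, while one inside {equation} or at top level is reported as having no effect, which is different from being a syntax error.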

physikerwelt commented 8 years ago

Historically, $$ was used for rendering. Thus, you had to use a macro such as \textstyle to get inline rendering. However, we could add a parameter to pass the rendering mode to texvcjs.

(In reply to HowardCohl's comment of Aug 9, 2016, 9:19 PM.)

notjagan commented 8 years ago

@physikerwelt In texer, where I currently have a prototype of parsing text mode, I already have code that I can transfer over that can easily pass the rendering mode. However, I'm not sure where to accept the parameter nor how to parse the double backslashes. As I see it, the steps to completing this issue are as follows:

physikerwelt commented 8 years ago

I would take a different approach.

physikerwelt commented 8 years ago

@HowardCohl I deleted my comments above that were incorrect. The {cases} environment works. See https://en.wikipedia.org/wiki/User:Physikerwelt/cases

HowardCohl commented 8 years ago

@notjagan especially {align} is relevant here!

The {align} environment is a way of entering math mode, exactly like {multline}. I think this is where to search for implementing {multline}.

HowardCohl commented 8 years ago

@physikerwelt Something is very strange, though. In LaTeX, the \begin{align} command is only encountered in text mode. However, in texvc, \begin{align} is encountered in math mode, i.e. after <math> is encountered.

notjagan commented 8 years ago

It turns out that double backslashes should be accepted in any environment, but are simply ignored during rendering in some. At the moment, the parser accepts double backslashes in \begin{align} (also in multline and multline* in #48), but they need to be accepted everywhere. Additionally, there needs to be support for an optional square bracket argument (e.g. \\[2cm]), but this is not as big of a change and is not difficult once double backslashes are universal.

notjagan commented 8 years ago

@physikerwelt Do you have any ideas on how double backslashes should be made environment-independent? Something similar to the current system for multline is possible to implement, but I'm not sure what method would fit best into the pegJS as it is now.

HowardCohl commented 8 years ago

Also we need to figure out how to make the double backslash complete, namely \\[*][extra-space] where the allowed spacing commands can be found at https://en.wikibooks.org/wiki/LaTeX/Lengths
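
As a rough illustration of that full form, the sketch below recognizes \\ followed by an optional * and an optional bracketed length. The `parseLineBreak` helper is hypothetical, and the unit list is a small assumed subset of the lengths on the linked page; a real implementation would live in the pegJS grammar rather than in a regular expression.

```javascript
// Match \\ at a fixed position, followed by an optional * and an optional [length].
// The accepted units are a small assumed subset, not the full LaTeX list.
const LINE_BREAK = /\\\\(\*)?(?:\[\s*(-?\d*\.?\d+)\s*(pt|mm|cm|in|ex|em)\s*\])?/y;

// Returns null if there is no \\ at `pos`; otherwise describes the command
// and gives the index just past it.
function parseLineBreak(tex, pos = 0) {
  LINE_BREAK.lastIndex = pos; // sticky flag: match must start exactly at pos
  const m = LINE_BREAK.exec(tex);
  if (m === null) return null;
  return {
    star: m[1] !== undefined,
    extraSpace: m[2] !== undefined ? { value: Number(m[2]), unit: m[3] } : null,
    end: pos + m[0].length,
  };
}
```

With this sketch, a bracket that does not hold a recognized length is simply left unconsumed, so the caller would see a bare \\ followed by ordinary input.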

physikerwelt commented 8 years ago

We need to get back to a fact-based and focused discussion. As a first step we should collect some test cases. So far I haven't seen a convincing example that does not work on https://en.wikipedia.org/wiki/User:Physikerwelt/cases ;-) Howard's proposal \\[*][extra-space] is a nice and well-defined extension that could be implemented based on the current codebase. But I think we should create a separate issue for that.

HowardCohl commented 8 years ago

This is really what a double backslash is ... adding the parameters for the double backslash command.

physikerwelt commented 8 years ago

I tried \\[2cm] and it seems to work fine; see https://en.wikipedia.org/w/index.php?title=User%3APhysikerwelt%2Fcases&type=revision&diff=734812510&oldid=734810713

HowardCohl commented 8 years ago

@notjagan What do you think about @physikerwelt's comment above?

notjagan commented 8 years ago

@physikerwelt I was mistaken when I thought that \\[ didn't work. The issue is actually not that \\[ causes an error, but that \\ causes an error without the inclusion of \begin{matrix} or something similar. I want to try to make \\ universal, but I want to know beforehand if there could be any adverse effects that I did not foresee.

notjagan commented 8 years ago

@physikerwelt For example, this can be a problem with using other_literals3. If you assign something to be \\[2in] or anything of that kind, the pegJS tries to validate what it is assigned to. However, since that doesn't have an environment like \begin{align} around it, the parser thinks \\[2in] is invalid and produces an error.

HowardCohl commented 8 years ago

@physikerwelt @notjagan reworded above to mention the produced error.

physikerwelt commented 8 years ago

@notjagan As far as I understand your last comment, that's the correct and expected behaviour. Could you please create an example that works in regular LaTeX but not in texvc?

HowardCohl commented 8 years ago

@physikerwelt It's not the correct behavior. In this case, if you are in an {align} environment, when it encounters the double backslash, the pegjs produces an error.

HowardCohl commented 8 years ago

@physikerwelt However, in this case, the double backslash is produced by a \newcommand and the pegjs at that stage doesn't realize it's in an {align} environment.

notjagan commented 8 years ago

@physikerwelt A stronger example is the following:

<math>
\frac12\\
\frac21
</math>

In LaTeX, this is accepted, although the \\ doesn't render. However, this does not work in texvc.

notjagan commented 8 years ago

A case of this can be seen on https://en.wikipedia.org/wiki/User:Physikerwelt/cases.

physikerwelt commented 8 years ago

@HowardCohl I don't understand what you mean by newcommand. texvc does not support newcommands by design.

@notjagan The fact that \\ does not do anything in the $$ math mode does not convince me. I think here the texvcjs behaviour is better, because it clearly indicates that something is wrong with the expression. The link you provide below does not work.

I think we should close this issue.

HowardCohl commented 8 years ago

@physikerwelt Well, you need to see his texer implementation of newcommand to understand this then. This issue should be closed I suppose and reopened in the texer repo.

physikerwelt commented 8 years ago

I still don't understand/see a problem. Maybe we can discuss that on the phone.