dbmrq / vim-bucky

:leaves: Ventilated prose

Indent display equation within a sentence #4

Closed kiryph closed 6 years ago

kiryph commented 6 years ago

Equations are part of sentences. For example, see this screenshot from Feller on Probability Theory:

[Screenshot: screen shot 2018-08-27 at 15 19 23]

This holds for any book or article that uses equations.

The TeX source could look like this:

Thus
   \begin{equation*}
     P\{A_1 \cup A_2\} = \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}.
   \end{equation*}

The probability $P\{A_1 \cup A_2\cup \cdots A_n\}$ of the realization of at
   least one among $n$ events can be computed by a formula analogous to (7.4),
   derived in IV,1.
Here we note only that the argument leading to (7.3) applies to any number of
   terms.
Thus \emph{for arbitrary events $A_1, A_2, \ldots$ the inequality}
   \begin{equation}
     P\{A_1 \cup A_2\cup \cdots \} \leq P\{A_1\}+P\{A_2\}+\ldots
   \end{equation}
   holds.
In the special case where the events $A_1, A_2, \ldots$ are mutually

Current Behavior

Thus
\begin{equation*}
   P\{A_1 \cup A_2\} = \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}.
\end{equation*}

The probability $P\{A_1 \cup A_2\cup \cdots A_n\}$ of the realization of at
   least one among $n$  events can be computed by a formula analogous to (7.4),
   derived in IV,1.
Here we note only that the argument leading to (7.3) applies to any number of
   terms.
Thus \emph{for arbitrary events $A_1, A_2, \ldots$ the inequality}
\begin{equation}
   P\{A_1 \cup A_2\cup \cdots \} \leq P\{A_1\}+P\{A_2\}+\ldots
\end{equation}
holds.
In the special case where the events $A_1, A_2, \ldots$ are mutually

Any chance vim-bucky could also handle these? I guess the more difficult case is when the period is inside the math environment.

dbmrq commented 6 years ago

There you go.

I actually study philosophy and rarely use math in my documents, so I didn't consider those cases. Let me know if there's anything else. :)

kiryph commented 6 years ago

Also thanks for this update.

Sorry to bother you with something you do not need.

I noticed the following: as long as the \begin{}/\end{} commands are already on separate lines, it works perfectly. However, if the TeX code is arbitrarily formatted, e.g. after joining the lines of each paragraph with vipJ, they are not moved to separate lines.

I do not need this (i.e. no issue from my side): when I type them, I always place them on separate lines. But if the formatexpr is supposed to reach the same state from different starting points, one could argue this is missing. latexindent.pl handles this quite well and has corresponding options; see section 6.6 of its help, modifyLineBreaks for environments:

modifyLineBreaks:
    environments:
        BeginStartsOnOwnLine: 1
        BodyStartsOnOwnLine: 1
        EndStartsOnOwnLine: 1
        EndFinishesWithLineBreak: 1
        equation*:
            BeginStartsOnOwnLine: 1
            BodyStartsOnOwnLine: 1
            EndStartsOnOwnLine: 1
            EndFinishesWithLineBreak: 1
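
To illustrate the "same state from different starting points" idea, here is a minimal Python sketch (not vim-bucky's or latexindent.pl's code; the function name is hypothetical). It only moves \begin{...} and \end{...} onto their own lines and, as a simplification, ignores indentation, wrapping and paragraph breaks:

```python
import re

# Matches \begin{name} or \end{name}, including starred names such as equation*.
ENV_DELIM = re.compile(r"\\(?:begin|end)\{[^{}]+\}")


def split_environment_delimiters(text: str) -> list[str]:
    """Put every begin/end delimiter on a line of its own (hypothetical helper)."""
    # Surround each delimiter with newlines, then drop the empty pieces.
    # Assumption: blank lines are discarded, so paragraph breaks are not preserved.
    marked = ENV_DELIM.sub(lambda m: "\n" + m.group(0) + "\n", text)
    return [piece.strip() for piece in marked.splitlines() if piece.strip()]
```

Joined input (as after vipJ) and input that already has the delimiters on separate lines should then produce the same list of lines.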

dbmrq commented 6 years ago

Ok, I think it's working properly now.

The only detail is that, when there's text after the \end command and that text ends a sentence, it's kept on the same line:

Thus 
  \begin{equation*}
    P\{A_1 \cup A_2\} = \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}. 
  \end{equation*}

The probability $P\{A_1 \cup A_2\cup \cdots A_n\}$ of the realization of at
  least one among $n$ events can be computed by a formula analogous to (7.4),
  derived in IV,1.
Here we note only that the argument leading to (7.3) applies to any number of
  terms.
Thus \emph{for arbitrary events $A_1, A_2, \ldots$ the inequality} 
  \begin{equation}
    P\{A_1 \cup A_2\cup \cdots \} \leq P\{A_1\}+P\{A_2\}+\ldots
  \end{equation} holds.
In the special case where the events $A_1, A_2, \ldots$ are mutually

You can see that on the line that reads \end{equation} holds.

I made it like this because I use a lot of inline enumerations. So I often do something like this:

The author says three things:
\begin{enumerate*}
  \item number one,
  \item number two and
  \item number three
\end{enumerate*}.

To produce something like this:

The author says three things: (1) number one, (2) number two and (3) number three.

The period is not part of the last item, but it ends the whole sentence, so I like to put it after the environment ends. At the same time, I wouldn't want a line with a single period in it. So the current version leaves the text after the \end command where it is.

I think this is fine for now. If somebody else shows up and has any strong opinions against it, I'll see about adding an option to customize this behavior.

kiryph commented 6 years ago

What do you do if your sentence does not end after your inline list?

The author says three things: (1) number one, (2) number two and (3) number three which are all good points.

I would format this as follows:

The author says three things:
  \begin{enumerate*}
    \item number one,
    \item number two and
    \item number three
  \end{enumerate*}
  which are all good points.

What about adding a special case for a single period?

dbmrq commented 6 years ago

When there's text after the list it's usually very short, so most of the time I end up leaving it on the same line. Otherwise I just break it manually, and in those cases the plugin wouldn't join it back, so it was fine.

However, the behavior you described does make more sense for most people. So I just pushed an update and the text after the environment will only stay on the same line if there aren't any alphabetic characters in it.
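As a rough illustration of that rule, here is a Python sketch (the plugin itself is Vim script; the function name is hypothetical and this is not the plugin's actual code):

```python
def stays_on_end_line(trailing: str) -> bool:
    """Return True if the text after an end command may stay on that line.

    Assumption: "alphabetic" is read here as Python's str.isalpha(), so a lone
    period or other punctuation stays, while any word moves to a new line.
    """
    return not any(ch.isalpha() for ch in trailing)
```

With this reading, the "." after \end{enumerate*} stays where it is, while " holds." or " which are all good points." would move to the next line.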

Also I think I don't like to have those environments indented. Since we started indenting the equation environment it would make sense to indent something like enumerate* too, but it just feels weird to me. Instead of sticking to the strict definition of a sentence in the resulting document, I think the indentation should focus more on something like "measurable semantic units". A new environment is a new block of code, so it feels to me like it starts a new unit and shouldn't be indented as if it were just any other part of a sentence.

I'm not 100% sure about this though. Especially for something small like the equation in your example, it does seem to make sense to indent the environment.

So for now I decided to take the safest route and create an option for this: g:bucky_sentence_environments.

Users should set this variable to the names of the environments they want indented in the middle of sentences, with the names separated by \|. An example would be let g:bucky_sentence_environments = 'equation\|enumerate'.
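In a Vim pattern, \| means "or", so the value above lists alternatives. The Python sketch below shows how such a value could be matched against a \begin line. It is only an illustration: the function name is hypothetical, and both the literal treatment of the names and the optional trailing star (so that equation also covers equation*) are assumptions, not documented plugin behavior.

```python
import re


def begins_sentence_environment(line: str, setting: str) -> bool:
    """Check whether a line opens one of the environments listed in the setting."""
    # Split the Vim-style alternation on the two characters backslash and bar.
    names = [name for name in setting.split(r"\|") if name]
    if not names:
        return False
    alternation = "|".join(re.escape(name) for name in names)
    # Assumption: a trailing star is accepted, e.g. equation also matches equation*.
    pattern = re.compile(r"\\begin\{(?:" + alternation + r")\*?\}")
    return pattern.search(line) is not None
```

For the setting 'equation\|enumerate', a line containing \begin{enumerate*} would count as a sentence environment under these assumptions, while \begin{itemize} would not.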

Let me know what you think. :)

kiryph commented 6 years ago

A plugin that formats code might need a few options to suit different tastes 😄. Of course, keeping the number of options small is very advisable.

Linewise yank (with custom text-objects)

One reason why I place begin/end on separate lines is that I can yank the equation linewise more easily. It would also be possible to yank characterwise, most conveniently with a dedicated text object, e.g. the ones provided by vimtex (yae) or vim-sandwich (yase). Since I prefer to yank an environment linewise, I work around this with Vaey. Unfortunately, vim-sandwich switches back to characterwise visual mode when typing Vase 😒.

Not all plugins that define LaTeX text objects for environments behave like vimtex or vim-sandwich; some actually do it the way I prefer:

Should environments within a sentence be indented?

I see your point regarding 'measurable semantic units' and new block. However, I would also argue your inline lists should not be too long. If the individual items become long, I would consider starting a new sentence for each item. Otherwise the sentence might become as long as a single paragraph.

The author has provided three reasons which will be discussed in the following individually. (i) first reason with discussion which consists of several sentences. (ii) second (also several sentences) (iii) third (also several sentences)

I like the new option 😄 which allows the writer/user to decide how to handle this.

Thanks also for your interest and for the time you spend making your plugin useful for others as well 💯x 👍.

kiryph commented 6 years ago

An issue occurs when there is whitespace between the period and the \end{}:

Thus \begin{equation*} P\{A_1 \cup A_2\} = \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}. \end{equation*}

Thus \begin{equation*} P\{A_1 \cup A_2\} = \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}.\end{equation*}

This will be formatted to

Thus 
  \begin{equation*}

    Thus 
      \begin{equation*}
        P\{A_1 \cup A_2\} =
          \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}.
      \end{equation*}
                P\{A_1 \cup A_2\} = \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}.

  \end{equation*}

Thus 
  \begin{equation*}
    P\{A_1 \cup A_2\} = \frac{1}{2}+\frac{1}{2}-\frac{1}{4}=\frac{3}{4}.
  \end{equation*}

dbmrq commented 6 years ago

I think it's fixed now. :)

Thank you for your help. It's really difficult to consider all possible use cases. I always type the \begin and \end commands on their own lines anyway, so I never noticed any of this.

kiryph commented 6 years ago

Yes, this fixed my last example.

However, I have two more situations which are not formatted as expected (sorry):

1.

Xxx xxx xxxxxxxxxxxx xxxx, xxx xxxxxxxxxxxx xxxxxxxx xx xxx \emph{xxxxx'x xxx}
  xxx xxxxxxxxxx xxxxx xx $E$ xxxxxxx xxxxxxxxxx
  \begin{equation}
    E=mc^2
  \end{equation}
  xxxx xxx xxxxxxxxx \xxxx{xxxxxx xxxxxx} $\gls{xxxxxx_xxxx}_{xx}$, xxx
\emph{xxxxxxxxx/xxxxxxxxxx xxxxxx} $\gls{xxxxxxxxx_xxxx}_{xxxx}$, xxx xxxxxxx.
%
Xxx xxx xxxxxxxxxxxx xxxx, xxx xxxxxxxxxxxx xxxxxxxx xx xxx \emph{xxxxx'x xxx}
  xxx xxxxxxxxxx xxxxx xx $E$ xxxxxxx xxxxxxxxxx xxxx xxx xxxxxxxxx
  \emph{xxxxxxxxx/xxxxxxxxxx xxxxxx} $\gls{xxxxxxxxx_xxxx}_{xxxx}$, xxx xxxxxxx.

% vim tw=80

IMHO this is weird. I suspected that the \emph{} at the beginning of the line was the issue. But after removing the equation, the line starting with \emph{} is indented correctly (see the second sentence).

  2. Command at the beginning of a sentence:
    
    \Cref{xxxx} xxxxxxxxxxxx xxxx, xxx xxxxxxxxxxxx xxxxxxxx xx xxx \emph{xxxxx'x}
    xxx xxxxxxxxxx xxxxx xx $E$ xxxxxxx xxxxxxxxxx xxxx xxx xxxxxxxxx
    \cref{xxxxxxxxx/xxxxxxxxxx xxxxxx} $\gls{xxxxxxxxx_xxxx}_{xxxx}$, xxx xxxxxxx.

% vim tw=80


The package cleveref provides the command `\Cref{}` specifically to use at the beginning of a sentence:

> As it is very difficult for LaTeX to determine whether a cross-reference appears at the beginning of a sentence or not, a beginning-of-sentence variant exists: `\Cref{⟨label⟩}`. By default, this typesets the cross-reference with the first letter capitalised, and without using an abbreviation in those cases where the standard variant does use one.

However, other commands might occur at the beginning of a sentence as well, e.g. `\emph{}`.

dbmrq commented 6 years ago

Thanks! I fixed one thing and broke another. I think it's ok now.

It would be a good idea to create a file with many different examples showing the expected behavior. Then we could write a script to copy the reference file, format the copy and compare it with the original to see if everything is still working as expected after each update. If you want to get started with that, feel free to open a new issue to keep track of those examples.
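The comparison step could look roughly like the Python sketch below. It assumes the reference text is already formatted the way the plugin should leave it. The formatter is passed in as a plain function because how Vim would be invoked on the copy is left open here; check_reference is a hypothetical name.

```python
import difflib
from typing import Callable


def check_reference(reference: str, formatter: Callable[[str], str]) -> list[str]:
    """Format a copy of the reference text and return a unified diff.

    An empty list means the formatter left the reference unchanged, i.e. the
    expected behavior still holds.
    """
    formatted = formatter(reference)
    return list(
        difflib.unified_diff(
            reference.splitlines(),
            formatted.splitlines(),
            fromfile="reference",
            tofile="formatted",
            lineterm="",
        )
    )
```

A test run would then fail whenever the returned list is non-empty, and the diff lines show exactly which examples changed.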

kiryph commented 6 years ago

Thanks for the fix. Yes, both situations are now formatted as expected.

I agree a formatting tool should use some kind of unit testing.

Vimtex, for example, uses vader.vim to test its formatexpr and its other functionality. Have a look at its test file for the formatexpr: https://github.com/lervag/vimtex/blob/master/test/vader/format.vader

To run the tests automatically, there is a file run and a minivimrc.

Alternatives to vader.vim

I do not have experience with any of these. Vader.vim seems to be the most popular one.

BTW https://github.com/cmhughes/latexindent.pl has done this as well. Its test suite seems to be very comprehensive.

dbmrq commented 6 years ago

Wow, latexindent's tests might actually be a little too comprehensive, haha. But they'll definitely help. Thanks.

I've heard of Vader, but in this case I don't think it has many advantages over a simple script. Just using a shell script would actually make it easier to add a continuous integration server like Travis. I'll look into that when I have some time. :)