alvinwan / TexSoup

fault-tolerant Python3 package for searching, navigating, and modifying LaTeX documents
https://texsoup.alvinwan.com
BSD 2-Clause "Simplified" License

texsoup tabular does not support pre-defined commands #84

Closed: jajupmochi closed this issue 4 years ago

jajupmochi commented 4 years ago

Hi, I am trying to parse code like this:

    from TexSoup import TexSoup

    soup = TexSoup(r'''
    \begin{document}

    \section{Hello \textit{world}.}

    \subsection{Watermelon}

    (n.) A sacred fruit. Also known as:

    \begin{itemize}
    \item red lemon
    \item life
    \end{itemize}

    Here is the prevalence of each synonym.

    \begin{tabular}{l*{12}{c}}
      \toprule
      \multirow{3}[4]{*}{Kernels} & \mc{3}{\mr{1}{Substructures} } & \mc{4}{Labeling} & \mrc{3}{-0.8ex}{Directed} & \mrc{3}{-0.8ex}{Edge\\Weighted} & \mrc{3}{-0.8ex}{Computational\\Complexity \\ (Gram Matrix)} & \mrc{3}{-0.8ex}{Explicit\\Representation}             & \mrc{3}{-0.8ex}{Weighting}                             \\
      \cmidrule(lr){2-4}\cmidrule(lr){5-8}
                                  & \mr{2}{linear} & \mr{2}{non-linear} & \mr{2}{cyclic} & \mc{2}{symbolic} & \mc{2}{non-symbolic} &   &    &       &  &         \\
      \cmidrule(lr){5-6}\cmidrule(lr){7-8}
                                   &      &     &     & vertices & edges & vertices & edges &      &      &                                  &      &          \\
      \midrule
      Common walk                  & \yes & \no & \no & \yes     & \yes  & \no      & \no   & \yes & \no  & $\bigO(N^2n^6)$                  & \no  & a priori \\
      Marginalized                 & \yes & \no & \no & \yes     & \yes  & \no      & \no   & \yes & \no  & $\bigO(N^2rn^4)$                 & \no  & \no      \\
      Sylvester equation           & \yes & \no & \no & \no      & \no   & \no      & \no   & \yes & \yes & $\bigO(N^2n^3)$                  & \no  & a priori \\
      Conjugate gradient           & \yes & \no & \no & \yes     & \yes  & \yes     & \yes  & \yes & \yes & $\bigO(N^2rn^4)$                 & \no  & a priori \\
      Fixed-point iterations       & \yes & \no & \no & \yes     & \yes  & \yes     & \yes  & \yes & \yes & $\bigO(N^2rn^4)$                 & \no  & a priori \\
      Spectral decomposition       & \yes & \no & \no & \no      & \no   & \no      & \no   & \yes & \yes & $\bigO(N^2n^2+Nn^3)$             & \no  & a priori \\
      Shortest path                & \yes & \no & \no & \yes     & \no   & \yes     & \no   & \yes & \yes & $\bigO(N^2n^4)$                  & \no  & \no      \\
      Structural shortest path     & \yes & \no & \no & \yes     & \yes  & \yes     & \yes  & \yes & \no  & $\bigO(h N^2n^4 + N^2nm))$ & \no  & \no      \\
      Path kernel up to length $h$ & \yes & \no & \no & \yes     & \yes  & \no      & \no   & \yes & \no  & $\bigO(N^2h^2n^2d^{2h})$                  & \yes & \yes     \\ 
      \rowcolor{gray!10}
      Treelet  & \yes & \yes & \no & \yes     & \yes  & \no      & \no   & \yes & \no  & $\bigO(N^2nd^{5})$                  & \yes & \yes     \\
      \rowcolor{gray!10}
      Weisfeiler-Lehman (WL) subtree                  & \yes & \yes & \no & \yes     & \no  & \no      & \no   & \yes & \no  & $\bigO(Nhm+N^2hn)$                  & \yes & \no     \\
      \bottomrule
    \end{tabular}

    \end{document}
    ''')
    for child in soup.tabular.contents:
        print(child)
        print('--------------')

However, the tabular is not parsed row by row; instead it is split into pieces like this:

    \toprule
    --------------
    \multirow{3}[4]{*}{Kernels}
    --------------
     & 
    --------------
    \mc{3}{\mr{1}{Substructures} }
    --------------
     & 
    --------------
    \mc{4}{Labeling}
    --------------
     & 
    --------------
    \mrc{3}{-0.8ex}{Directed}
    --------------
     & 
    --------------
    \mrc{3}{-0.8ex}{Edge\\Weighted}
    --------------
     & 
    --------------
    \mrc{3}{-0.8ex}{Computational\\Complexity \\ (Gram Matrix)}
    --------------
     & 
    --------------
    \mrc{3}{-0.8ex}{Explicit\\Representation}
    --------------
                 & 
    --------------
    \mrc{3}{-0.8ex}{Weighting}
    --------------
                                 \\

    --------------
    \cmidrule
    --------------
    (lr)
    --------------
    {2-4}

Is this because TexSoup does not support pre-defined commands (like \mc), or am I doing something wrong? How can I fix it? Thanks!

alvinwan commented 4 years ago

@jajupmochi Thanks for reporting. As you've noticed, TexSoup doesn't currently support pre-defined commands. There was some discussion with another contributor at #59 about hard-coding different commands along with their number of optional and required arguments. Supporting pre-defined commands would move us one step closer to that.

I'll mark this as an enhancement for now and perhaps revisit at a later date.
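For anyone who needs row-by-row access in the meantime, one possible workaround is to split the raw tabular body yourself instead of iterating over the parsed pieces. The sketch below is plain Python and does not use any TexSoup API. The helper names `split_top_level` and `tabular_rows` are made up for this example, and it assumes you already have the text between `\begin{tabular}{...}` and `\end{tabular}` as a string. It only splits on `\\` and `&` that appear outside braces, so a cell like `\mrc{3}{-0.8ex}{Edge\\Weighted}` stays in one piece. It does not handle `%` comments or verbatim content.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring any sep nested inside {...} groups."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(text):
        if depth == 0 and text.startswith(sep, i):
            parts.append(text[start:i])
            i += len(sep)
            start = i
        elif text[i] == "\\":
            # Skip the backslash and the character after it, so that
            # \{, \}, \& and a nested \\ are never treated as structure.
            i += 2
        else:
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
    parts.append(text[start:])
    return parts


def tabular_rows(body):
    """Return the tabular body as a list of rows, each a list of cell strings."""
    rows = []
    for row in split_top_level(body, "\\\\"):
        cells = [cell.strip() for cell in split_top_level(row, "&")]
        if any(cells):  # drop the empty chunk after the final \\
            rows.append(cells)
    return rows


body = r"""
\mrc{3}{-0.8ex}{Edge\\Weighted} & \mc{2}{symbolic} \\
Common walk & \yes & $\bigO(N^2n^6)$ \\
"""
for row in tabular_rows(body):
    print(row)
```

Rule commands such as `\toprule` or `\midrule` are not followed by `\\`, so with this approach they end up attached to the first cell of the next row and may need to be stripped separately.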

alvinwan commented 4 years ago

Closing, as custom commands are now supported. This is not officially documented yet, but the command parser now takes each command's number of required and optional arguments into account.
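To illustrate what "considering the number of required and optional args" can mean, here is a toy reader written for this thread. It is not TexSoup's implementation or API: the `SIGNATURES` table, `read_group`, and `read_command` are all hypothetical names. Given a command's signature, it consumes at most that many `[...]` and `{...}` groups after the command name and stops there, so a command with no known signature picks up no arguments.

```python
import re

COMMAND = re.compile(r"\\([A-Za-z]+)")

# Hypothetical signature table: name -> (optional [..] args, required {..} args).
SIGNATURES = {"mc": (0, 2), "mrc": (0, 3), "multirow": (1, 3)}


def read_group(text, i, open_ch, close_ch):
    """Return the index just past the balanced group that opens at text[i]."""
    depth = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip escaped characters such as \{ or \\
            continue
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced group")


def read_command(text, i, signatures):
    """Read the command at text[i] and the arguments its signature allows."""
    match = COMMAND.match(text, i)
    if match is None:
        raise ValueError("no command at this position")
    name = match.group(1)
    n_opt, n_req = signatures.get(name, (0, 0))
    end, args = match.end(), []
    while n_opt or n_req:
        j = end
        while j < len(text) and text[j].isspace():
            j += 1  # allow whitespace between arguments
        if j < len(text) and text[j] == "{" and n_req:
            stop, n_req = read_group(text, j, "{", "}"), n_req - 1
        elif j < len(text) and text[j] == "[" and n_opt:
            stop, n_opt = read_group(text, j, "[", "]"), n_opt - 1
        else:
            break
        args.append(text[j:stop])
        end = stop
    return name, args, end


print(read_command(r"\multirow{3}[4]{*}{Kernels} & x", 0, SIGNATURES))
print(read_command(r"\cmidrule(lr){2-4}", 0, SIGNATURES))
```

In this sketch the optional and required groups may appear in any order, which is a simplification chosen so that `\multirow{3}[4]{*}{Kernels}` fits in a two-number signature.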