odedp closed this issue 2 years ago
Thanks for reporting the problem, @odedp. I'd really appreciate it if someone could find the time to debug this issue; otherwise I'll try to get to it at some point.
I'm seeing this issue too with my own paper. Has anyone found a workaround?
I did a pseudo-binary search to find the problematic command and manually removed it.
I played around with just that command, trying to find a minimal example. I found that this version allows arxiv_latex_cleaner to complete but slows it down by 10x or so:
\test{Alternative: $\Psi \{\psi_{\pi_{SF}}\}$}
And this version does appear to make it hang indefinitely:
\test{Alternative: $\Psi \cup \{\psi_{\pi_{SF}}\}$}
I'm running
python3 -m arxiv_latex_cleaner <folder_name> --commands_to_delete test
with Python 3.8.2.
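One possible explanation for why a few extra characters turn a slowdown into an apparent hang is regex backtracking. The sketch below times Python's re module on the fixed-depth pattern quoted later in this thread (Line 106 of arxiv_latex_cleaner.py) against inputs it cannot match. The helper name time_failed_match and the test strings are made up for illustration, and the timings depend on the machine. If backtracking is indeed the cause, each extra character before the nested braces should roughly double the time.

```python
import re
import time

# Fixed-depth pattern quoted later in this thread, built for the command "test".
PATTERN = re.compile(r'\\' + 'test' + r'{(?:[^}{]+|{(?:[^}{]+|{[^}{]*})*})*}')


def time_failed_match(prefix_len):
    """Time one search on an input the pattern cannot match (hypothetical helper)."""
    # Three nested brace levels inside \test{...}: one more than the pattern spells out.
    text = r'\test{' + 'x' * prefix_len + '{a{b{c}}}}'
    start = time.perf_counter()
    result = PATTERN.search(text)
    return result, time.perf_counter() - start


# Prefix lengths are kept small on purpose so the script finishes quickly.
for n in (8, 10, 12, 14, 16):
    result, seconds = time_failed_match(n)
    print(n, result, f'{seconds:.4f}s')
```

Prefix lengths are capped at 16 here; the `\test{Alternative: ...}` examples above have a longer run of plain characters before the first brace, which would fit the observed jump from "10x slower" to "hangs".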
I'm able to reproduce the issue with the --commands_only_to_delete option when I have a custom command around an equation environment. Something like this, where test is the custom command:
\test{
\begin{equation}\label{eq:conversion}
\begin{split}
a &= \frac{b}{\mathrm{c_{t^2}}}\\
c &= d
\end{split}
\end{equation}.
}
If I make the equation simpler by reducing the levels of nested curly braces, then the code doesn't cause a hang. So, to me it seems like it may have something to do with deep levels of nesting within the custom command?
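To check the nesting hypothesis on a given snippet, a small brace counter is enough. This is a minimal sketch, not part of arxiv_latex_cleaner; max_brace_depth is a hypothetical helper, and it deliberately counts escaped braces as ordinary braces, on the assumption that a plain regex sees them the same way.

```python
def max_brace_depth(text):
    """Return the deepest level of '{' nesting in text (hypothetical helper)."""
    depth = 0
    deepest = 0
    for ch in text:
        if ch == '{':
            depth += 1
            deepest = max(deepest, depth)
        elif ch == '}':
            # Clamp at zero so a stray closing brace does not go negative.
            depth = max(depth - 1, 0)
    return deepest


# The two examples reported in this thread, written on one line each.
inline = r'\test{Alternative: $\Psi \cup \{\psi_{\pi_{SF}}\}$}'
equation = (r'\test{\begin{equation}\begin{split}'
            r'a &= \frac{b}{\mathrm{c_{t^2}}}\\ c &= d'
            r'\end{split}\end{equation}}')

print(max_brace_depth(inline))    # 4: the \test braces plus three nested levels
print(max_brace_depth(equation))  # 4
```

Both reported examples reach a depth of 4 counting the outer \test{...} braces, so reducing the nesting by one level would bring them down to 3.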
Hi all,
I ran into this problem as well. After some debugging I found the following:
The hang occurs in arxiv_latex_cleaner.py when using the regex pattern built on Line 106 [1].
The corrected pattern [2] is recursive, but Python's built-in re module does not support recursive patterns. However, the third-party regex module (see here) is a drop-in replacement for re and implements the recursive subroutine used in this pattern.
I tested the corrected pattern with the regex module, which seems to work correctly and does not hang.
If @jponttuset can confirm that the additional external dependency on the regex module is acceptable, I am happy to create a pull request for this (very simple) bugfix.
[1] Line 106 of arxiv_latex_cleaner.py: base_pattern = r'\\' + command + r'{(?:[^}{]+|{(?:[^}{]+|{[^}{]*})*})*}' fails (demo on regex101).
[2] Correct pattern: base_pattern = r'\\' + command + r'\{((?:[^{}]+|\{(?1)\})*)\}' works (demo on regex101), based on this stackoverflow answer.
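The two patterns can be compared using only the standard library. The sketch below is illustrative and uses made-up test strings: it shows that the pattern in [1] spells out exactly three levels of braces, and that re refuses to compile the (?1) recursion used in [2]. It does not exercise the third-party regex module.

```python
import re

command = 'test'

# Pattern from [1]: the \test braces plus two nested levels, and no more.
fixed_depth = re.compile(r'\\' + command + r'{(?:[^}{]+|{(?:[^}{]+|{[^}{]*})*})*}')

two_nested = r'\test{a{b{c}}}'       # outer braces plus two nested levels
three_nested = r'\test{a{b{c{d}}}}'  # one level deeper than the pattern covers

matched_two = fixed_depth.search(two_nested)
matched_three = fixed_depth.search(three_nested)
print(matched_two is not None)   # True: within the spelled-out depth
print(matched_three is None)     # True: too deep, so the search fails

# Pattern from [2]: (?1) recurses into group 1, which re rejects at compile time.
try:
    re.compile(r'\\' + command + r'\{((?:[^{}]+|\{(?1)\})*)\}')
    recursion_supported = True
except re.error as exc:
    recursion_supported = False
    print('re.compile failed:', exc)
```

The inputs are kept short so the failed search returns quickly. On long arguments, the same failure is where the slowdown reported in this thread would show up.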
Thanks so much @joellindegger for the investigation! Adding regex is perfectly fine; it'd be great if you could send a PR.
Here's a minimal example. If my source file includes:
When running:
It seems to hang forever on the above file. In my attempts, removing any of todo1, todo2, figure, or emph seems to make the problem go away...