Closed tovrstra closed 4 months ago
I'm quite sure this change has the side effect of eliminating any caching between runs.
If the issue is that files are being accessed concurrently, I think the more appropriate change would be to make sure all files are first written to a path of the form f"{temp_filename}_tmp_{unique_nonce}" and then atomically renamed to temp_filename.
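A minimal sketch of the write-then-rename idea described above, not the code in this pull request. The helper name atomic_write_text and the use of secrets for the nonce are assumptions made for illustration; only the "_tmp_" naming pattern comes from the comment.

```python
import os
import secrets
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Write text so that other processes never see a partially written target."""
    # Hypothetical helper: the nonce keeps concurrent writers from sharing a temp file.
    nonce = secrets.token_hex(8)
    tmp_path = target.with_name(f"{target.name}_tmp_{nonce}")
    try:
        tmp_path.write_text(text, encoding="utf-8")
        # os.replace is atomic when source and destination are on the same filesystem.
        os.replace(tmp_path, target)
    finally:
        # After a successful rename the temp path is gone; after a failure, remove it.
        tmp_path.unlink(missing_ok=True)
```

Because the temporary file lives next to the target, the rename stays on one filesystem, which is what makes the replacement atomic.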
I hadn't realized that caching was used in this way. I agree that this change breaks caching.
Is the goal to enable caching within one process, or also cache between different invocations?
Caching across multiple invocations / separate processes is the idea, yes.
Sorry for coming back late to this.
In order to realize caching between multiple runs safely, one could use separate directories: (i) one for storing cached results and (ii) one for running _write_tex2html. When _write_tex2html completes successfully, it can copy its result to the cache directory. Typically, such cached results are stored in a subdirectory of ~/.cache/ instead of under /tmp.
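The two-directory layout suggested above could look roughly like the following sketch. Everything here is an assumption for illustration: the cached_render name, the render callback (a stand-in for _write_tex2html with a made-up signature), the SHA-256 cache key, and the ~/.cache/mdkatex location.

```python
import hashlib
import os
import secrets
import shutil
import tempfile
from pathlib import Path
from typing import Callable

# Assumed location; the comment above only says "a subdirectory of ~/.cache/".
CACHE_DIR = Path.home() / ".cache" / "mdkatex"


def cached_render(
    tex: str,
    render: Callable[[str, Path], None],
    cache_dir: Path = CACHE_DIR,
) -> str:
    """Return HTML for tex, rendering in a private work directory on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(tex.encode("utf-8")).hexdigest()
    cached = cache_dir / f"{key}.html"
    try:
        return cached.read_text(encoding="utf-8")
    except FileNotFoundError:
        pass  # cache miss: fall through and render

    # (ii) Private work directory, so concurrent runs never share scratch files.
    with tempfile.TemporaryDirectory() as work_dir:
        out_path = Path(work_dir) / "result.html"
        render(tex, out_path)  # hypothetical stand-in for _write_tex2html
        html = out_path.read_text(encoding="utf-8")
        # (i) Copy into the cache under a unique name, then rename atomically.
        staged = cache_dir / f"{key}.html_tmp_{secrets.token_hex(8)}"
        shutil.copyfile(out_path, staged)
        os.replace(staged, cached)
    return html
```

If two processes miss on the same key at once, both render and the later rename wins. Since both produce the same content, that costs some duplicated work but never exposes a partial file.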
@tovrstra Is it possible for you to verify that my most recent changes fix the issue?
Sorry for the delay. I've repeated the build process, which involves about 1200 Markdown to PDF conversions (feedback with equations for my students). These conversions run in parallel on a machine with 16 cores. With the latest version of this pull request, I get the following error:
...
File "/.../venv/lib64/python3.12/site-packages/markdown/core.py", line 354, in convert
self.lines = prep.run(self.lines)
^^^^^^^^^^^^^^^^^^^^
File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 268, in run
return list(self._iter_out_lines(lines))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 257, in _iter_out_lines
marker_tag = self._make_tag_for_inline(code.inline_text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 214, in _make_tag_for_inline
math_html = md_inline2html(inline_text, self.ext.options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 119, in md_inline2html
return tex2html(inline_text, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 83, in tex2html
result = wrapper.tex2html(tex, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.../pkgs/markdown-katex/src/markdown_katex/wrapper.py", line 281, in tex2html
_cleanup_cache_dir()
File "/.../pkgs/markdown-katex/src/markdown_katex/wrapper.py", line 288, in _cleanup_cache_dir
mtime = fpath.stat().st_mtime
^^^^^^^^^^^^
File "/usr/lib64/python3.12/pathlib.py", line 840, in stat
return os.stat(self, follow_symlinks=follow_symlinks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/mdkatex/f920f6800f2e6d5be7baf1a9de0359d6c1bf6f4b5702053ec213b12481bb111e.tex_tmp_0b5556b35d01ccd492ff252035b63855cf070838'
I cannot share this specific test case due to privacy concerns. (The traceback is also edited.) To facilitate fixing the issue, a simple example would be better. I'll see if I can cook up something that can be shared.
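The traceback ends in a stat() call on a "_tmp_" file that another process apparently renamed or removed between the directory listing and the stat. One way to tolerate that race is sketched below. This is an illustration under assumptions, not the fix that was committed: the function signature, the max_age_seconds parameter, and the flat directory of regular files are all hypothetical.

```python
import time
from pathlib import Path


def cleanup_cache_dir(cache_dir: Path, max_age_seconds: float) -> int:
    """Delete cache files older than max_age_seconds and return how many were removed."""
    removed = 0
    now = time.time()
    for fpath in cache_dir.iterdir():
        try:
            if now - fpath.stat().st_mtime > max_age_seconds:
                fpath.unlink()
                removed += 1
        except FileNotFoundError:
            # Another process renamed or deleted this entry after it was listed.
            continue
    return removed
```

Catching FileNotFoundError around both stat() and unlink() treats a vanished entry as already cleaned up, so parallel workers can run the cleanup without coordinating.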
Perfect, I appreciate the help.
With my most recent commit, I'm reasonably sure that your issue is fixed. A test would of course be nice regardless, but I'll take your word for it if you say everything is working now.
Thanks for the additional commits! These do indeed fix the issue as far as my testing goes.
I also created a Python script that reproduces the bug (before your latest commits). To see the error message, you need to remove all files under /tmp/mdkatex before running the script:
#!/usr/bin/env python
import concurrent.futures
import random

import markdown

TEXT_SNIPPETS = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
""".split()

EQ_SNIPPETS = r"""
x y \sin(\theta) \hat{H} \vec{B} \int_0^\infty f(x)\,dx
\{i^2\}_{i=1}^N e^{2it\pi\omega} \tan(x)
""".split()


def generate_md() -> str:
    # Interleave random words with random inline equations.
    parts = []
    for _ in range(100):
        parts.append(random.choice(TEXT_SNIPPETS))
        parts.append(f"$`{random.choice(EQ_SNIPPETS)}`$")
    return " ".join(parts)


def convert(md: str) -> str:
    md_ctx = markdown.Markdown(
        extensions=[
            "fenced_code",
            "markdown_katex",
            "tables",
        ],
        extension_configs={
            "markdown_katex": {"insert_fonts_css": True}
        },
    )
    return md_ctx.convert(md)


def main():
    # Many worker processes converting at once trigger the race in the cache directory.
    with concurrent.futures.ProcessPoolExecutor(max_workers=30) as executor:
        mds = [generate_md() for _ in range(30)]
        for html in executor.map(convert, mds):
            print(html)
            print()


if __name__ == '__main__':
    main()
It may be possible to simplify the script further and still get the error.
@mbarkhau Would it be possible to make a release with this fix? With the current version of pip (24.1), markdown-katex cannot be installed, due to issues fixed in this PR. I get the following error when installing markdown-katex with the latest pip:
WARNING: Ignoring version 202112.1034 of markdown-katex since it has invalid metadata:
Requested markdown-katex from https://files.pythonhosted.org/packages/a2/18/f54ce298ddda160e9443fd68e47c2a677ea6320ddbe08e10cd40d54c2df4/markdown_katex-202112.1034-py2.py3-none-any.whl (from stepup-reprep==1.2.1->-r requirements.in (line 8)) has invalid metadata: Expected matching RIGHT_PARENTHESIS for LEFT_PARENTHESIS, after version specifier
Markdown (>=3.0<3.3) ; python_version < "3.6"
~~~~~~^
Please use pip<24.1 if you need to use this version.
WARNING: Ignoring version 202109.1033 of markdown-katex since it has invalid metadata:
Requested markdown-katex from https://files.pythonhosted.org/packages/1c/d8/3a38317d1ad5b3bd1428167e28a9dd353c1fffc29408500ec27f5061cf16/markdown_katex-202109.1033-py2.py3-none-any.whl (from stepup-reprep==1.2.1->-r requirements.in (line 8)) has invalid metadata: Expected matching RIGHT_PARENTHESIS for LEFT_PARENTHESIS, after version specifier
Markdown (>=3.0<3.3) ; python_version < "3.6"
~~~~~~^
Please use pip<24.1 if you need to use this version.
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [3 lines of output]
error in markdown-katex setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Expected end or semicolon (after version specifier)
Markdown>=3.0<3.3;python_version<"3.6"
~~~~~^
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
Thanks!
I just pushed v202406.1035
Thank you so much! I've tested it once more with the pip-installed version. It all works.
This fixes issue #16. (So far, I have not run into this error again with this fix.)
I also had to correct the requirements file to make pip install -e . work. See https://pip.pypa.io/en/stable/reference/requirement-specifiers/#examples