mbarkhau / markdown-katex

Adds KaTeX support for Python Markdown
MIT License

Fix No such file or directory: `/tmp/mdkatex/...` #17

Closed: tovrstra closed this pull request 4 months ago

tovrstra commented 8 months ago

This fixes issue #16. (So far, I have not run into the error again with this fix.)

I also had to correct the requirements file to make `pip install -e .` work. See https://pip.pypa.io/en/stable/reference/requirement-specifiers/#examples

mbarkhau commented 8 months ago

I'm quite sure this change has the side effect of eliminating any caching between runs.

If the issue is that files are being accessed concurrently, I think the more appropriate change would be to first write every file to a path with a unique suffix, `f"{temp_filename}_tmp_{unique_nonce}"`, and then atomically rename it to `temp_filename`.
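The write-then-rename idea can be sketched as follows. This is a minimal illustration and not the code in markdown-katex: `write_atomically` is a hypothetical helper, and `secrets.token_hex` merely stands in for whatever `unique_nonce` would be. It relies on `os.replace`, which is atomic only when the temporary file and the target are on the same filesystem.

```python
import os
import secrets
from pathlib import Path


def write_atomically(target: Path, data: bytes) -> None:
    """Write data to a uniquely named sibling file, then rename it onto target."""
    nonce = secrets.token_hex(8)  # hypothetical stand-in for unique_nonce
    tmp_path = target.with_name(f"{target.name}_tmp_{nonce}")
    try:
        with open(tmp_path, "wb") as fobj:
            fobj.write(data)
        # Readers see either the old file or the complete new one, never a partial write.
        os.replace(tmp_path, target)
    finally:
        # After a successful rename the temp path no longer exists; after a
        # failure this removes the leftover file (missing_ok needs Python 3.8+).
        tmp_path.unlink(missing_ok=True)
```

Because the nonce differs per writer, two processes rendering the same formula never write to the same temporary path; whichever rename happens last wins, and both results should be identical anyway.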

tovrstra commented 8 months ago

I hadn't realized that caching was used in this way. I agree that this change breaks caching.

Is the goal to enable caching within one process, or also cache between different invocations?

mbarkhau commented 8 months ago

Yes, the idea is to cache across multiple invocations and separate processes.

tovrstra commented 7 months ago

Sorry for coming back late to this.

To cache safely between multiple runs, one could use two separate directories: (i) one for storing cached results and (ii) one for running `_write_tex2html`. When `_write_tex2html` completes successfully, it can copy its result to the cache directory. Typically, such cached results are stored in a subdirectory of ~/.cache/ instead of under /tmp.
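A rough sketch of the two-directory layout, under stated assumptions: `cached_render`, the `results`/`work` subdirectory names, and the `~/.cache/mdkatex` location are all hypothetical, and `render` is a stand-in for `_write_tex2html`. Both directories live under one root so that the final move can be a same-filesystem rename rather than a copy.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable

# Assumed location; the thread only suggests "a subdirectory of ~/.cache/".
DEFAULT_CACHE_ROOT = Path.home() / ".cache" / "mdkatex"


def cached_render(
    key: str,
    render: Callable[[Path], None],
    cache_root: Path = DEFAULT_CACHE_ROOT,
) -> str:
    """Return cached HTML for key, rendering it in a scratch directory on a miss."""
    results_dir = cache_root / "results"  # (i) finished, cached results
    work_dir = cache_root / "work"        # (ii) scratch space for rendering
    results_dir.mkdir(parents=True, exist_ok=True)
    work_dir.mkdir(parents=True, exist_ok=True)

    cached = results_dir / f"{key}.html"
    try:
        return cached.read_text(encoding="utf-8")
    except FileNotFoundError:
        pass  # cache miss: render below

    # Each call gets a private scratch directory, so parallel processes never collide.
    with tempfile.TemporaryDirectory(dir=work_dir) as scratch:
        out_path = Path(scratch) / "out.html"
        render(out_path)  # stand-in for _write_tex2html
        html = out_path.read_text(encoding="utf-8")
        # Publish only a complete result into the cache directory.
        os.replace(out_path, cached)
    return html
```

With this layout, nothing in `results` is ever a partially written file, and cleaning up stale entries only has to look at one directory.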

mbarkhau commented 7 months ago

@tovrstra Is it possible for you to verify that my most recent changes fix the issue?

tovrstra commented 6 months ago

Sorry for the delay. I've repeated the build process, which involves about 1200 Markdown to PDF conversions (feedback with equations for my students). These conversions run in parallel on a machine with 16 cores. With the latest version of this pull request, I get the following error:

...
  File "/.../venv/lib64/python3.12/site-packages/markdown/core.py", line 354, in convert
    self.lines = prep.run(self.lines)
                 ^^^^^^^^^^^^^^^^^^^^
  File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 268, in run
    return list(self._iter_out_lines(lines))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 257, in _iter_out_lines
    marker_tag = self._make_tag_for_inline(code.inline_text)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 214, in _make_tag_for_inline
    math_html = md_inline2html(inline_text, self.ext.options)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 119, in md_inline2html
    return tex2html(inline_text, options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pkgs/markdown-katex/src/markdown_katex/extension.py", line 83, in tex2html
    result = wrapper.tex2html(tex, options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pkgs/markdown-katex/src/markdown_katex/wrapper.py", line 281, in tex2html
    _cleanup_cache_dir()
  File "/.../pkgs/markdown-katex/src/markdown_katex/wrapper.py", line 288, in _cleanup_cache_dir
    mtime = fpath.stat().st_mtime
            ^^^^^^^^^^^^
  File "/usr/lib64/python3.12/pathlib.py", line 840, in stat
    return os.stat(self, follow_symlinks=follow_symlinks)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/mdkatex/f920f6800f2e6d5be7baf1a9de0359d6c1bf6f4b5702053ec213b12481bb111e.tex_tmp_0b5556b35d01ccd492ff252035b63855cf070838'

I cannot share this specific test case due to privacy concerns. (The traceback is also edited.) To facilitate fixing the issue, a simple example would be better. I'll see if I can cook up something that can be shared.
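The traceback fails in `fpath.stat()` on a `*_tmp_*` path, which is consistent with another process renaming or removing that file between the directory listing and the `stat` call. A cleanup loop can tolerate this by treating a vanished file as already handled. The sketch below is an illustration only, not the fix that was committed: `cleanup_cache_dir` and `max_age_seconds` are hypothetical names.

```python
import time
from pathlib import Path


def cleanup_cache_dir(cache_dir: Path, max_age_seconds: float) -> int:
    """Delete regular files older than max_age_seconds; return how many were removed."""
    removed = 0
    now = time.time()
    for fpath in list(cache_dir.iterdir()):
        try:
            if not fpath.is_file():
                continue  # skip subdirectories and entries that already vanished
            mtime = fpath.stat().st_mtime
            if now - mtime > max_age_seconds:
                fpath.unlink()
                removed += 1
        except FileNotFoundError:
            # Another process renamed or deleted the file after it was listed.
            continue
    return removed
```

The key point is that the listing is only a snapshot: with many workers sharing /tmp/mdkatex, any entry may be gone by the time it is inspected.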

mbarkhau commented 6 months ago

Perfect, I appreciate the help.

mbarkhau commented 6 months ago

With my most recent commit, I'm reasonably sure that your issue is fixed. A test would of course be nice regardless, but I'll take your word for it if you say everything is working now.

tovrstra commented 6 months ago

Thanks for the additional commits! These do indeed fix the issue as far as my testing goes.

I also created a Python script that reproduces the bug (before your latest commits). To see the error message, you need to remove all files under /tmp/mdkatex before running the script:

```python
#!/usr/bin/env python

import concurrent.futures
import random

import markdown

TEXT_SNIPPETS = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
""".split()

EQ_SNIPPETS = r"""
x y \sin(\theta) \hat{H} \vec{B} \int_0^\infty f(x)\,dx
\{i^2\}_{i=1}^N e^{2it\pi\omega} \tan(x)
""".split()

def generate_md() -> str:
    parts = []
    for _ in range(100):
        parts.append(random.choice(TEXT_SNIPPETS))
        parts.append(f"$`{random.choice(EQ_SNIPPETS)}`$")
    return " ".join(parts)

def convert(md: str) -> str:
    md_ctx = markdown.Markdown(
        extensions=[
            "fenced_code",
            "markdown_katex",
            "tables",
        ],
        extension_configs={
            "markdown_katex": {"insert_fonts_css": True}
        },
    )
    return md_ctx.convert(md)

def main():
    with concurrent.futures.ProcessPoolExecutor(max_workers=30) as executor:
        mds = [generate_md() for _ in range(30)]
        for html in executor.map(convert, mds):
            print(html)
            print()

if __name__ == '__main__':
    main()
```

It may be possible to simplify the script further and still get the error.

tovrstra commented 4 months ago

@mbarkhau Would it be possible to make a release with this fix? With the current version of pip (24.1), markdown-katex cannot be installed, due to issues fixed in this PR. I get the following error when installing markdown-katex with the latest pip:

  WARNING: Ignoring version 202112.1034 of markdown-katex since it has invalid metadata:
  Requested markdown-katex from https://files.pythonhosted.org/packages/a2/18/f54ce298ddda160e9443fd68e47c2a677ea6320ddbe08e10cd40d54c2df4/markdown_katex-202112.1034-py2.py3-none-any.whl (from stepup-reprep==1.2.1->-r requirements.in (line 8)) has invalid metadata: Expected matching RIGHT_PARENTHESIS for LEFT_PARENTHESIS, after version specifier
      Markdown (>=3.0<3.3) ; python_version < "3.6"
               ~~~~~~^
  Please use pip<24.1 if you need to use this version.
  WARNING: Ignoring version 202109.1033 of markdown-katex since it has invalid metadata:
  Requested markdown-katex from https://files.pythonhosted.org/packages/1c/d8/3a38317d1ad5b3bd1428167e28a9dd353c1fffc29408500ec27f5061cf16/markdown_katex-202109.1033-py2.py3-none-any.whl (from stepup-reprep==1.2.1->-r requirements.in (line 8)) has invalid metadata: Expected matching RIGHT_PARENTHESIS for LEFT_PARENTHESIS, after version specifier
      Markdown (>=3.0<3.3) ; python_version < "3.6"
               ~~~~~~^
  Please use pip<24.1 if you need to use this version.
    error: subprocess-exited-with-error

    × python setup.py egg_info did not run successfully.
    │ exit code: 1
    ╰─> [3 lines of output]
        error in markdown-katex setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers; Expected end or semicolon (after version specifier)
            Markdown>=3.0<3.3;python_version<"3.6"
                    ~~~~~^
        [end of output]

    note: This error originates from a subprocess, and is likely not a problem with pip.

Thanks!
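Both error messages point at the same specifier: `>=3.0<3.3` lacks the comma that separates two version clauses. The snippet below is a small sketch for checking a requirement string locally with the `packaging` library, which may need to be installed separately; the fallback import of pip's bundled copy is a convenience and not a public API. The `FIXED` string is an assumption about what the corrected line could look like, not a quote from the PR.

```python
try:
    from packaging.requirements import InvalidRequirement, Requirement
except ImportError:
    # Fallback for environments without a top-level "packaging" install;
    # pip's vendored copy is not a public API and is used here only as a convenience.
    from pip._vendor.packaging.requirements import InvalidRequirement, Requirement

# Specifier as it appears in the error output above.
BROKEN = 'Markdown>=3.0<3.3; python_version < "3.6"'
# Assumed correction: a comma between the two version clauses.
FIXED = 'Markdown>=3.0,<3.3; python_version < "3.6"'


def check(spec: str) -> str:
    """Report whether the installed parser accepts the requirement string."""
    try:
        req = Requirement(spec)
    except InvalidRequirement as err:
        return f"invalid: {err}"
    return f"ok: {req.name} {req.specifier}"


if __name__ == "__main__":
    print(check(BROKEN))
    print(check(FIXED))
```

Strict parsers, such as the one used by pip 24.1 in the output above, reject the first string; older and more lenient versions may still accept it, so the result for `BROKEN` depends on the installed version.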

mbarkhau commented 4 months ago

I just pushed v202406.1035.

tovrstra commented 4 months ago

Thank you so much! I've tested it once more with the pip-installed version. It all works.