JessicaTegner / pypandoc

Thin wrapper for "pandoc" (MIT)
http://pypi.python.org/pypi/pypandoc/

testing fails on unix where python is not available (but only python3) #276

Open kloczek opened 2 years ago

kloczek commented 2 years ago

I'm trying to package your module as an rpm package, so I'm using the typical PEP 517 based build, install and test cycle used when building packages from a non-root account.

Here is pytest output:

+ PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.8-2.fc35.x86_64/usr/lib64/python3.8/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.8-2.fc35.x86_64/usr/lib/python3.8/site-packages
+ /usr/bin/pytest -ra
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.8.13, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/tkloczko/rpmbuild/BUILD/pypandoc-1.8
collected 0 items

========================================================================== no tests ran in 0.01s ===========================================================================
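
A likely explanation for `collected 0 items`: pytest's default `python_files` patterns are `test_*.py` and `*_test.py`, and this repository keeps its tests in `tests.py`, which matches neither, so the file has to be named explicitly (`pytest tests.py`). The sketch below only approximates pytest's file-name rule with `fnmatch`; the helper `is_collected_by_default` is a made-up name for illustration.

```python
from fnmatch import fnmatch

# pytest's documented default value for the `python_files` ini option.
DEFAULT_PYTHON_FILES = ("test_*.py", "*_test.py")


def is_collected_by_default(filename, patterns=DEFAULT_PYTHON_FILES):
    """Hypothetical helper: approximate pytest's default file-name matching."""
    return any(fnmatch(filename, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ("tests.py", "test_pypandoc.py", "pypandoc_test.py"):
        print(name, is_collected_by_default(name))
```

Passing the file on the command line, as done later in this thread, sidesteps the pattern check entirely.
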
JessicaTegner commented 2 years ago

Totally honest here, I don't know much about rpm packages.

How do you get the new release? Through pip, right? Do you clone the repo, or use the official release from PyPI?

kloczek commented 2 years ago

That issue has nothing to do with rpm. You can reproduce it using the procedure I've described. Just please run pytest.

JessicaTegner commented 2 years ago

Ohh. The reason is that examples, documentation files and tests have been removed from the PyPI release as of version 1.8, because of some conflicting names when installing.

kloczek commented 2 years ago

I'm not using the PyPI sdist but the tarball autogenerated from the git tag: https://github.com/NicklasTegner/pypandoc/archive/refs/tags/v1.8.tar.gz

JessicaTegner commented 2 years ago

It's because of how the package is built from the pyproject.toml file. In 1.8 we removed the tests and other files.

I would suggest downloading and extracting the tar.gz, then running the tests, and lastly creating the wheel.

kloczek commented 2 years ago

I see tests.py inside the tarball.

JessicaTegner commented 2 years ago

Yes, but they aren't included when you build. When the whl gets produced, they aren't included.

kloczek commented 2 years ago

You can check what is inside the tarball autogenerated from the git tag: https://github.com/NicklasTegner/pypandoc/tree/v1.8

JessicaTegner commented 2 years ago

I know, and they are in the tarball, but my guess is that when you run the build command they aren't included, just like when I run python setup.py sdist, because of a change in version 1.8.

My suggestion would be to run pytest before building.

kloczek commented 1 year ago

Just tested 1.9 and it looks like two new units are failing:

```console
+ PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib64/python3.8/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib/python3.8/site-packages
+ /usr/bin/pytest -ra tests.py --deselect tests.py::TestPypandoc::test_pdf_conversion
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.8.14, pytest-7.1.3, pluggy-1.0.0
rootdir: /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9
collected 40 items / 1 deselected / 39 selected

tests.py ..........................FF...........                                                                                                                     [100%]

================================================================================= FAILURES =================================================================================
_____________________________________________________________ TestPypandoc.test_conversion_with_mixed_filters ______________________________________________________________

self = <tests.TestPypandoc testMethod=test_conversion_with_mixed_filters>

    def test_conversion_with_mixed_filters(self):
        markdown_source = "-0-"

        lua = """\
        function Para(elem)
            return pandoc.Para(elem.content .. {{"{0}-"}})
        end
        """
        lua = textwrap.dedent(lua)

        python = """\
        #!/usr/bin/env python

        from pandocfilters import toJSONFilter, Para, Str

        def func(key, value, format, meta):
            if key == "Para":
                return Para(value + [Str("{0}-")])

        if __name__ == "__main__":
            toJSONFilter(func)

        """
        python = textwrap.dedent(python)

        with closed_tempfile(".lua", lua.format(1)) as temp1, closed_tempfile(".py", python.format(2)) as temp2:
            with closed_tempfile(".lua", lua.format(3)) as temp3, closed_tempfile(".py", python.format(4)) as temp4:
>               output = pypandoc.convert_text(
                    markdown_source, to="html", format="md", outputfile=None, filters=[temp1, temp2, temp3, temp4]
                ).strip()

tests.py:381:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:93: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'-0-', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None
filters = ['/tmp/tmpsu9lufkd.lua', '/tmp/tmpgux03sxh.py', '/tmp/tmp96de3ep5.lua', '/tmp/tmphc5bl3mo.py'], verify_format = True, sandbox = True, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(),
                       outputfile=None, filters=None, verify_format=True,
                       sandbox=True, cworkdir=None):

        _check_log_handler()
        _ensure_pandoc_path()

        if verify_format:
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        args = [__pandoc_path, '--from=' + format]

        args.append('--to=' + to)

        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                args.append("--sandbox")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode,
                                                                               p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpgux03sxh.py:
E           Could not find executable python

pypandoc/__init__.py:418: RuntimeError
--------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------
/home/tkloczko
_____________________________________________________________ TestPypandoc.test_conversion_with_python_filter ______________________________________________________________

self = <tests.TestPypandoc testMethod=test_conversion_with_python_filter>

    def test_conversion_with_python_filter(self):
        markdown_source = "**Here comes the content.**"
        python_source = '''\
        #!/usr/bin/env python

        """
        Pandoc filter to convert all regular text to uppercase.
        Code, link URLs, etc. are not affected.
        """

        from pandocfilters import toJSONFilter, Str

        def caps(key, value, format, meta):
            if key == 'Str':
                return Str(value.upper())

        if __name__ == "__main__":
            toJSONFilter(caps)
        '''
        python_source = textwrap.dedent(python_source)

        with closed_tempfile(".py", python_source) as tempfile:
>           output = pypandoc.convert_text(
                markdown_source, to='html', format='md', outputfile=None, filters=tempfile
            ).strip()

tests.py:332:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:93: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'**Here comes the content.**', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None, filters = ['/tmp/tmpk2dzvkz1.py']
verify_format = True, sandbox = True, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(),
                       outputfile=None, filters=None, verify_format=True,
                       sandbox=True, cworkdir=None):

        _check_log_handler()
        _ensure_pandoc_path()

        if verify_format:
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        args = [__pandoc_path, '--from=' + format]

        args.append('--to=' + to)

        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                args.append("--sandbox")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode,
                                                                               p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpk2dzvkz1.py:
E           Could not find executable python

pypandoc/__init__.py:418: RuntimeError
--------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------
/home/tkloczko
============================================================================= warnings summary =============================================================================
pypandoc/pandoc_download.py:62
  /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9/pypandoc/pandoc_download.py:62: DeprecationWarning: invalid escape sequence \.
    regex = re.compile(r"/jgm/pandoc/releases/download/.*(?:"+processor_architecture+"|x86|mac).*\.(?:msi|deb|pkg)")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================= short test summary info ==========================================================================
FAILED tests.py::TestPypandoc::test_conversion_with_mixed_filters - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpgux03sxh...
FAILED tests.py::TestPypandoc::test_conversion_with_python_filter - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpk2dzvkz1...
========================================================== 2 failed, 37 passed, 1 deselected, 1 warning in 4.90s ===========================================================
```

kloczek commented 1 year ago

> I know, and they are in the tarball, but my guess is that when you run the build command they aren't included, just like when I run python setup.py sdist, because of a change in version 1.8.
>
> My suggestion would be to run pytest before building.

In a typical rpm package build, the test suite is always executed after the build and install steps.

JessicaTegner commented 1 year ago

So for your errors, it seems that in both instances it can't find the "python" executable when trying to use a Python filter. What would you say is the best solution? Trying "python3" before "python", since we actually want Python 3, or the other way around, falling back to "python3" only if the regular "python" executable can't be found?
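
The "try one name, then the other" idea can be sketched with `shutil.which`. This is only an illustration of the lookup order being discussed, not code from pypandoc; the function name `find_python` and the fallback to `sys.executable` are assumptions of this sketch.

```python
import shutil
import sys


def find_python(candidates=("python3", "python")):
    """Return the path of the first candidate found on PATH.

    Hypothetical helper: falls back to the interpreter running this code
    when none of the candidate names can be found.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return sys.executable


if __name__ == "__main__":
    print(find_python())
```

Swapping the order of the `candidates` tuple gives the "python first, python3 as fallback" variant.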

kloczek commented 1 year ago

> So for your errors, it seems that in both instances it can't find the "python" executable when trying to use a Python filter. What would you say is the best solution?

Instead of hardcoding the "python" executable name, use sys.executable.

JessicaTegner commented 1 year ago

> Instead of hardcoding the "python" executable name, use sys.executable.

We are not hardcoding the name per se. The error comes from the shebang lines when we test with a Python filter.

kloczek commented 1 year ago

> We are not hardcoding the name per se. The error comes from the shebang lines when we test with a Python filter.

Then, instead of hardcoding the python executable in the shebang line, execute the script as a parameter to sys.executable.
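
A minimal sketch of "execute the script as a parameter to sys.executable", using a stand-in script rather than a real pandoc filter. Note the caveat: in the failing tests it is pandoc, not the test code, that launches the filter, and my understanding (an assumption, not confirmed in this thread) is that pandoc picks `python` for a non-executable `.py` filter from its extension. So this pattern applies only where the calling code starts the script itself. The function name is hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for a filter: reads stdin and writes the upper-cased text to stdout.
SCRIPT = "import sys\nsys.stdout.write(sys.stdin.read().upper())\n"


def run_script_with_current_python(source, stdin_text):
    """Run `source` under sys.executable, so no shebang or PATH lookup is involved."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(source)
        result = subprocess.run(
            [sys.executable, path],
            input=stdin_text,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout
    finally:
        # Always clean up the temporary script.
        os.remove(path)


if __name__ == "__main__":
    print(run_script_with_current_python(SCRIPT, "hello"))
```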

kloczek commented 1 year ago

I've added three commits to my build procedure:

Patch:          %{VCS}/commit/b5565358.patch#/%{name}-Updated-readme-with-correct-batches.patch
Patch:          %{VCS}/commit/3e7676dd.patch#/%{name}-Fixed-sort-files-before-processing-292-301.patch
Patch:          %{VCS}/commit/b2738b45.patch#/%{name}-Fixes-test-cases-that-uses-python-while-only-python3.patch

and it looks like the issue is still present:

+ PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib64/python3.8/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib/python3.8/site-packages
+ /usr/bin/pytest -ra tests.py --deselect tests.py::TestPypandoc::test_pdf_conversion
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.8.14, pytest-7.1.3, pluggy-1.0.0
rootdir: /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9
collected 40 items / 1 deselected / 39 selected

tests.py ..........................FF...........                                                                                                                     [100%]

================================================================================= FAILURES =================================================================================
_____________________________________________________________ TestPypandoc.test_conversion_with_mixed_filters ______________________________________________________________

self = <tests.TestPypandoc testMethod=test_conversion_with_mixed_filters>

    def test_conversion_with_mixed_filters(self):
        markdown_source = "-0-"

        lua = """\
        function Para(elem)
            return pandoc.Para(elem.content .. {{"{0}-"}})
        end
        """
        lua = textwrap.dedent(lua)

        python = """\
        #!{0}

        from pandocfilters import toJSONFilter, Para, Str

        def func(key, value, format, meta):
            if key == "Para":
                return Para(value + [Str("{0}-")])

        if __name__ == "__main__":
            toJSONFilter(func)

        """
        python = textwrap.dedent(python)
        python.format(sys.executable)

        with closed_tempfile(".lua", lua.format(1)) as temp1, closed_tempfile(".py", python.format(2)) as temp2:
            with closed_tempfile(".lua", lua.format(3)) as temp3, closed_tempfile(".py", python.format(4)) as temp4:
>               output = pypandoc.convert_text(
                    markdown_source, to="html", format="md", outputfile=None, filters=[temp1, temp2, temp3, temp4]
                ).strip()

tests.py:384:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:93: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'-0-', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None
filters = ['/tmp/tmpbv7qc_dw.lua', '/tmp/tmpi8dvhe90.py', '/tmp/tmpf8udfv4c.lua', '/tmp/tmpb_8hsrr8.py'], verify_format = True, sandbox = True, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(),
                       outputfile=None, filters=None, verify_format=True,
                       sandbox=True, cworkdir=None):

        _check_log_handler()
        _ensure_pandoc_path()

        if verify_format:
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        input_file = sorted(input_file)
        args = [__pandoc_path, '--from=' + format]

        args.append('--to=' + to)

        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                args.append("--sandbox")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode,
                                                                               p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpi8dvhe90.py:
E           Could not find executable python

pypandoc/__init__.py:420: RuntimeError
--------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------
/home/tkloczko
_____________________________________________________________ TestPypandoc.test_conversion_with_python_filter ______________________________________________________________

self = <tests.TestPypandoc testMethod=test_conversion_with_python_filter>

    def test_conversion_with_python_filter(self):
        markdown_source = "**Here comes the content.**"
        python_source = '''\
        #!{0}

        """
        Pandoc filter to convert all regular text to uppercase.
        Code, link URLs, etc. are not affected.
        """

        from pandocfilters import toJSONFilter, Str

        def caps(key, value, format, meta):
            if key == 'Str':
                return Str(value.upper())

        if __name__ == "__main__":
            toJSONFilter(caps)
        '''
        python_source = textwrap.dedent(python_source)
        python_source.format(sys.executable)

        with closed_tempfile(".py", python_source) as tempfile:
>           output = pypandoc.convert_text(
                markdown_source, to='html', format='md', outputfile=None, filters=tempfile
            ).strip()

tests.py:334:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:93: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'**Here comes the content.**', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None, filters = ['/tmp/tmp2o_r5jt_.py']
verify_format = True, sandbox = True, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(),
                       outputfile=None, filters=None, verify_format=True,
                       sandbox=True, cworkdir=None):

        _check_log_handler()
        _ensure_pandoc_path()

        if verify_format:
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        input_file = sorted(input_file)
        args = [__pandoc_path, '--from=' + format]

        args.append('--to=' + to)

        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                args.append("--sandbox")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode,
                                                                               p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmp2o_r5jt_.py:
E           Could not find executable python

pypandoc/__init__.py:420: RuntimeError
--------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------
/home/tkloczko
============================================================================= warnings summary =============================================================================
pypandoc/pandoc_download.py:62
  /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9/pypandoc/pandoc_download.py:62: DeprecationWarning: invalid escape sequence \.
    regex = re.compile(r"/jgm/pandoc/releases/download/.*(?:"+processor_architecture+"|x86|mac).*\.(?:msi|deb|pkg)")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================= short test summary info ==========================================================================
FAILED tests.py::TestPypandoc::test_conversion_with_mixed_filters - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpi8dvhe90...
FAILED tests.py::TestPypandoc::test_conversion_with_python_filter - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmp2o_r5jt_...
========================================================== 2 failed, 37 passed, 1 deselected, 1 warning in 4.93s ===========================================================
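
One thing visible in the patched test source above: `python.format(sys.executable)` is called without using its return value, and `str.format` returns a new string rather than changing `python` in place. The later `python.format(2)` then fills every `{0}` field, including the one in the shebang line. A minimal sketch of one way around this, using named fields; the names `interpreter`, `n`, `FILTER_TEMPLATE` and `render_filter` are my own and not from the pypandoc test suite.

```python
import sys
import textwrap

# Named fields keep the shebang and the payload number from colliding.
FILTER_TEMPLATE = textwrap.dedent("""\
    #!{interpreter}

    from pandocfilters import toJSONFilter, Para, Str

    def func(key, value, format, meta):
        if key == "Para":
            return Para(value + [Str("{n}-")])

    if __name__ == "__main__":
        toJSONFilter(func)
    """)


def render_filter(n, interpreter=sys.executable):
    """Return the filter source with both named fields filled in."""
    # str.format returns a new string, so the result must be returned or assigned.
    return FILTER_TEMPLATE.format(interpreter=interpreter, n=n)


if __name__ == "__main__":
    print(render_filter(2))
```

Even with a correct shebang, the filter file would presumably also need its executable bit set for the shebang to be used at all; that part is an assumption about pandoc's behaviour.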

kloczek commented 1 year ago

Additionally, after deselecting the failing units with --deselect, I still see a warning:

+ PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib64/python3.8/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib/python3.8/site-packages
+ /usr/bin/pytest -ra tests.py --deselect tests.py::TestPypandoc::test_pdf_conversion --deselect tests.py::TestPypandoc::test_conversion_with_mixed_filters --deselect tests.py::TestPypandoc::test_conversion_with_python_filter
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.8.14, pytest-7.1.3, pluggy-1.0.0
rootdir: /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9
collected 40 items / 3 deselected / 37 selected

tests.py .....................................                                                                                                                       [100%]

============================================================================= warnings summary =============================================================================
pypandoc/pandoc_download.py:62
  /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9/pypandoc/pandoc_download.py:62: DeprecationWarning: invalid escape sequence \.
    regex = re.compile(r"/jgm/pandoc/releases/download/.*(?:"+processor_architecture+"|x86|mac).*\.(?:msi|deb|pkg)")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================================== 37 passed, 3 deselected, 1 warning in 4.33s ================================================================
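
The DeprecationWarning above comes from the second string literal on that line: only the first literal carries the `r` prefix, so the `\.` in the trailing literal is parsed as an ordinary (invalid) escape. A minimal sketch of one way to silence it while keeping the pattern unchanged; the value of `processor_architecture` and the sample paths are made up for illustration:

```python
import re

# Hypothetical value; in pypandoc this is determined at runtime.
processor_architecture = "amd64"

# Both literals are raw strings now, so "\." reaches the regex engine unchanged.
regex = re.compile(
    r"/jgm/pandoc/releases/download/.*(?:"
    + processor_architecture
    + r"|x86|mac).*\.(?:msi|deb|pkg)"
)

# Made-up paths, used only to exercise the pattern.
sample = "/jgm/pandoc/releases/download/2.19.2/pandoc-2.19.2-1-amd64.deb"
other = "/jgm/pandoc/releases/download/2.19.2/pandoc-2.19.2-linux-amd64.tar.gz"

print(bool(regex.search(sample)))
print(bool(regex.search(other)))
```

The compiled pattern is the same as before; only the way the source literal is written changes.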
JessicaTegner commented 1 year ago

@kloczek can you try the latest development snapshot? That should fix the failing test cases.

kloczek commented 1 year ago

After replacing the last patch with:

Patch:          %{VCS}/commit/b5565358.patch#/%{name}-Updated-readme-with-correct-batches.patch
Patch:          %{VCS}/commit/3e7676dd.patch#/%{name}-Fixed-sort-files-before-processing-292-301.patch
Patch:          https://github.com/JessicaTegner/pypandoc/commit/b2738b45.patch#/%{name}-Fixes-test-cases-that-uses-python-while-only-python3-is-available.patch

pytest still fails:

```console
+ cd pypandoc-1.9
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ /usr/bin/cat /home/tkloczko/rpmbuild/SOURCES/python-pypandoc-Updated-readme-with-correct-batches.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/bin/cat /home/tkloczko/rpmbuild/SOURCES/python-pypandoc-Fixed-sort-files-before-processing-292-301.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
+ /usr/bin/cat /home/tkloczko/rpmbuild/SOURCES/python-pypandoc-Fixes-test-cases-that-uses-python-while-only-python3-is-available.patch
+ /usr/bin/patch -p1 -s --fuzz=0 --no-backup-if-mismatch -f
[..]
+ PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib64/python3.8/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.9-2.fc35.x86_64/usr/lib/python3.8/site-packages
+ /usr/bin/pytest -ra tests.py --deselect tests.py::TestPypandoc::test_pdf_conversion
=========================================================================== test session starts ============================================================================
platform linux -- Python 3.8.14, pytest-7.1.3, pluggy-1.0.0
rootdir: /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9
collected 40 items / 1 deselected / 39 selected

tests.py ..........................FF........... [100%]

================================================================================= FAILURES =================================================================================
_____________________________________________________________ TestPypandoc.test_conversion_with_mixed_filters ______________________________________________________________

self =

    def test_conversion_with_mixed_filters(self):
        markdown_source = "-0-"

        lua = """\
        function Para(elem)
            return pandoc.Para(elem.content .. {{"{0}-"}})
        end
        """
        lua = textwrap.dedent(lua)

        python = """\
        #!{0}
        from pandocfilters import toJSONFilter, Para, Str

        def func(key, value, format, meta):
            if key == "Para":
                return Para(value + [Str("{0}-")])

        if __name__ == "__main__":
            toJSONFilter(func)

        """
        python = textwrap.dedent(python)
        python.format(sys.executable)

        with closed_tempfile(".lua", lua.format(1)) as temp1, closed_tempfile(".py", python.format(2)) as temp2:
            with closed_tempfile(".lua", lua.format(3)) as temp3, closed_tempfile(".py", python.format(4)) as temp4:
>               output = pypandoc.convert_text(
                    markdown_source, to="html", format="md", outputfile=None, filters=[temp1, temp2, temp3, temp4]
                ).strip()

tests.py:384:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:93: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'-0-', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None
filters = ['/tmp/tmpwr8zvwco.lua', '/tmp/tmpiljf6dd6.py', '/tmp/tmpdu_fqeof.lua', '/tmp/tmpwj3gm4v6.py'], verify_format = True, sandbox = True, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(),
                       outputfile=None, filters=None,
                       verify_format=True, sandbox=True,
                       cworkdir=None):
        _check_log_handler()
        _ensure_pandoc_path()
        if verify_format:
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        input_file = sorted(input_file)
        args = [__pandoc_path, '--from=' + format]

        args.append('--to=' + to)

        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                args.append("--sandbox")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpiljf6dd6.py:
E           Could not find executable python

pypandoc/__init__.py:420: RuntimeError
--------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------
/home/tkloczko
_____________________________________________________________ TestPypandoc.test_conversion_with_python_filter ______________________________________________________________

self =

    def test_conversion_with_python_filter(self):
        markdown_source = "**Here comes the content.**"
        python_source = '''\
        #!{0}

        """
        Pandoc filter to convert all regular text to uppercase.
        Code, link URLs, etc. are not affected.
        """

        from pandocfilters import toJSONFilter, Str

        def caps(key, value, format, meta):
            if key == 'Str':
                return Str(value.upper())

        if __name__ == "__main__":
            toJSONFilter(caps)
        '''
        python_source = textwrap.dedent(python_source)
        python_source.format(sys.executable)

        with closed_tempfile(".py", python_source) as tempfile:
>           output = pypandoc.convert_text(
                markdown_source, to='html', format='md', outputfile=None, filters=tempfile
            ).strip()

tests.py:334:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:93: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'**Here comes the content.**', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None, filters = ['/tmp/tmpd450lw4k.py']
verify_format = True, sandbox = True, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(),
                       outputfile=None, filters=None,
                       verify_format=True, sandbox=True,
                       cworkdir=None):
        _check_log_handler()
        _ensure_pandoc_path()
        if verify_format:
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        input_file = sorted(input_file)
        args = [__pandoc_path, '--from=' + format]

        args.append('--to=' + to)

        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                args.append("--sandbox")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpd450lw4k.py:
E           Could not find executable python

pypandoc/__init__.py:420: RuntimeError
--------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------
/home/tkloczko
============================================================================= warnings summary =============================================================================
pypandoc/pandoc_download.py:62
  /home/tkloczko/rpmbuild/BUILD/pypandoc-1.9/pypandoc/pandoc_download.py:62: DeprecationWarning: invalid escape sequence \.
    regex = re.compile(r"/jgm/pandoc/releases/download/.*(?:"+processor_architecture+"|x86|mac).*\.(?:msi|deb|pkg)")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================= short test summary info ==========================================================================
FAILED tests.py::TestPypandoc::test_conversion_with_mixed_filters - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpiljf6dd6...
FAILED tests.py::TestPypandoc::test_conversion_with_python_filter - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpd450lw4k...
========================================================== 2 failed, 37 passed, 1 deselected, 1 warning in 4.89s ===========================================================
```
JessicaTegner commented 1 year ago

Well, we are now using sys.executable to run the Python filter tests, so I don't see how it could fail...
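
The error text ("Could not find executable python") suggests that pandoc is searching PATH for a command literally named `python`, rather than using the interpreter that runs the tests. A small sketch for comparing the two from inside the build environment; the helper name `interpreter_report` is made up for illustration:

```python
import shutil
import sys


def interpreter_report(names=("python", "python3")):
    """Map each command name to its location on PATH (None if not found)."""
    report = {name: shutil.which(name) for name in names}
    # The interpreter that is actually running this process.
    report["sys.executable"] = sys.executable
    return report


if __name__ == "__main__":
    for name, path in interpreter_report().items():
        print(f"{name}: {path}")
```

On a system that ships only python3, the `python` entry would be expected to come back as `None`.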

kloczek commented 1 year ago

Just retested 1.11 and I still see three units failing:

```console + PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.11-2.fc35.x86_64/usr/lib64/python3.8/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.11-2.fc35.x86_64/usr/lib/python3.8/site-packages + /usr/bin/pytest -ra -m 'not network' tests.py --deselect tests.py::TestPypandoc::test_pdf_conversion ==================================================================================== test session starts ==================================================================================== platform linux -- Python 3.8.16, pytest-7.2.2, pluggy-1.0.0 rootdir: /home/tkloczko/rpmbuild/BUILD/pypandoc-1.11 collected 41 items / 1 deselected / 40 selected tests.py .......................F...FF........... [100%] ========================================================================================= FAILURES ========================================================================================== _______________________________________________________________________ TestPypandoc.test_conversion_with_data_files ________________________________________________________________________ self = def test_conversion_with_data_files(self): # remove our test.docx file from our test_data dir if it already exosts test_data_dir = os.path.join(os.path.dirname(__file__), 'test_data') test_docx_file = os.path.join(test_data_dir, 'test.docx') if os.path.exists(test_docx_file): os.remove(test_docx_file) > result = pypandoc.convert_file( os.path.join(test_data_dir, 'index.html'), to='docx', format='html', outputfile=test_docx_file, sandbox=True, ) tests.py:240: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ pypandoc/__init__.py:168: in convert_file return _convert_input(discovered_source_files, format, 'path', to, extra_args=extra_args, _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ source = '/home/tkloczko/rpmbuild/BUILD/pypandoc-1.11/test_data/index.html', format = 'html', input_type = 'path', to = 'docx', extra_args = () outputfile = '/home/tkloczko/rpmbuild/BUILD/pypandoc-1.11/test_data/test.docx', filters = None, verify_format = True, sandbox = True, cworkdir = None def _convert_input(source, format, input_type, to, extra_args=(), outputfile=None, filters=None, verify_format=True, sandbox=False, cworkdir=None): _check_log_handler() logger.debug("Ensuring pandoc path...") _ensure_pandoc_path() if verify_format: logger.debug("Verifying format...") format, to = _validate_formats(format, to, outputfile) else: format = normalize_format(format) to = normalize_format(to) logger.debug("Identifying input type...") string_input = input_type == 'string' if not string_input: if isinstance(source, str): input_file = [source] else: input_file = source else: input_file = [] input_file = sorted(input_file) args = [__pandoc_path, '--from=' + format] args.append('--to=' + to) args += input_file if outputfile: args.append("--output=" + str(outputfile)) if sandbox: if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above. logger.debug("Adding sandbox argument...") args.append("--sandbox") else: logger.warning("Sandbox argument was used, but pandoc version is too low. 
Ignoring argument.") args.extend(extra_args) # adds the proper filter syntax for each item in the filters list if filters is not None: if isinstance(filters, string_types): filters = filters.split() f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters] args.extend(f) # To get access to pandoc-citeproc when we use a included copy of pandoc, # we need to add the pypandoc/files dir to the PATH new_env = os.environ.copy() files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files") new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows old_wd = os.getcwd() if cworkdir and old_wd != cworkdir: os.chdir(cworkdir) logger.debug("Running pandoc...") p = subprocess.Popen( args, stdin=subprocess.PIPE if string_input else None, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=new_env, creationflags=creation_flag) if cworkdir is not None: os.chdir(old_wd) # something else than 'None' indicates that the process already terminated if not (p.returncode is None): raise RuntimeError( 'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read()) ) if string_input: try: source = cast_bytes(source, encoding='utf-8') except (UnicodeDecodeError, UnicodeEncodeError): # assume that it is already a utf-8 encoded string pass try: stdout, stderr = p.communicate(source if string_input else None) except OSError: # this is happening only on Py2.6 when pandoc dies before reading all # the input. We treat that the same as when we exit with an error... raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode)) try: stdout = stdout.decode('utf-8') except UnicodeDecodeError: # this shouldn't happen: pandoc more or less guarantees that the output is utf-8! 
raise RuntimeError('Pandoc output was not utf-8.') try: stderr = stderr.decode('utf-8') except UnicodeDecodeError: # this shouldn't happen: pandoc more or less guarantees that the output is utf-8! raise RuntimeError('Pandoc output was not utf-8.') # check that pandoc returned successfully if p.returncode != 0: > raise RuntimeError( 'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr) ) E RuntimeError: Pandoc died with exitcode "97" during conversion: Could not find data file data/data/docx/[Content_Types].xml pypandoc/__init__.py:426: RuntimeError ----------------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------------------ /home/tkloczko ______________________________________________________________________ TestPypandoc.test_conversion_with_mixed_filters ______________________________________________________________________ self = def test_conversion_with_mixed_filters(self): markdown_source = "-0-" lua = """\ function Para(elem) return pandoc.Para(elem.content .. 
{{"{0}-"}}) end """ lua = textwrap.dedent(lua) python = """\ #!{0} from pandocfilters import toJSONFilter, Para, Str def func(key, value, format, meta): if key == "Para": return Para(value + [Str("{0}-")]) if __name__ == "__main__": toJSONFilter(func) """ python = textwrap.dedent(python) python.format(sys.executable) with closed_tempfile(".lua", lua.format(1)) as temp1, closed_tempfile(".py", python.format(2)) as temp2: with closed_tempfile(".lua", lua.format(3)) as temp3, closed_tempfile(".py", python.format(4)) as temp4: > output = pypandoc.convert_text( markdown_source, to="html", format="md", outputfile=None, filters=[temp1, temp2, temp3, temp4] ).strip() tests.py:403: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ pypandoc/__init__.py:91: in convert_text return _convert_input(source, format, 'string', to, extra_args=extra_args, _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ source = b'-0-', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None filters = ['/tmp/tmpdgw_df6w.lua', '/tmp/tmpbl813ywg.py', '/tmp/tmp85dsiv3y.lua', '/tmp/tmp7j1t2jod.py'], verify_format = True, sandbox = False, cworkdir = None def _convert_input(source, format, input_type, to, extra_args=(), outputfile=None, filters=None, verify_format=True, sandbox=False, cworkdir=None): _check_log_handler() logger.debug("Ensuring pandoc path...") _ensure_pandoc_path() if verify_format: logger.debug("Verifying format...") format, to = _validate_formats(format, to, outputfile) else: format = normalize_format(format) to = normalize_format(to) logger.debug("Identifying input type...") string_input = input_type == 'string' if not string_input: if isinstance(source, str): input_file = 
[source] else: input_file = source else: input_file = [] input_file = sorted(input_file) args = [__pandoc_path, '--from=' + format] args.append('--to=' + to) args += input_file if outputfile: args.append("--output=" + str(outputfile)) if sandbox: if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above. logger.debug("Adding sandbox argument...") args.append("--sandbox") else: logger.warning("Sandbox argument was used, but pandoc version is too low. Ignoring argument.") args.extend(extra_args) # adds the proper filter syntax for each item in the filters list if filters is not None: if isinstance(filters, string_types): filters = filters.split() f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters] args.extend(f) # To get access to pandoc-citeproc when we use a included copy of pandoc, # we need to add the pypandoc/files dir to the PATH new_env = os.environ.copy() files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files") new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows old_wd = os.getcwd() if cworkdir and old_wd != cworkdir: os.chdir(cworkdir) logger.debug("Running pandoc...") p = subprocess.Popen( args, stdin=subprocess.PIPE if string_input else None, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=new_env, creationflags=creation_flag) if cworkdir is not None: os.chdir(old_wd) # something else than 'None' indicates that the process already terminated if not (p.returncode is None): raise RuntimeError( 'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read()) ) if string_input: try: source = cast_bytes(source, encoding='utf-8') except (UnicodeDecodeError, UnicodeEncodeError): # assume that it is already a utf-8 encoded string pass try: stdout, stderr = 
p.communicate(source if string_input else None) except OSError: # this is happening only on Py2.6 when pandoc dies before reading all # the input. We treat that the same as when we exit with an error... raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode)) try: stdout = stdout.decode('utf-8') except UnicodeDecodeError: # this shouldn't happen: pandoc more or less guarantees that the output is utf-8! raise RuntimeError('Pandoc output was not utf-8.') try: stderr = stderr.decode('utf-8') except UnicodeDecodeError: # this shouldn't happen: pandoc more or less guarantees that the output is utf-8! raise RuntimeError('Pandoc output was not utf-8.') # check that pandoc returned successfully if p.returncode != 0: > raise RuntimeError( 'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr) ) E RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpbl813ywg.py: E Could not find executable python pypandoc/__init__.py:426: RuntimeError ----------------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------------------ /home/tkloczko ______________________________________________________________________ TestPypandoc.test_conversion_with_python_filter ______________________________________________________________________ self = def test_conversion_with_python_filter(self): markdown_source = "**Here comes the content.**" python_source = '''\ #!{0} """ Pandoc filter to convert all regular text to uppercase. Code, link URLs, etc. are not affected. 
""" from pandocfilters import toJSONFilter, Str def caps(key, value, format, meta): if key == 'Str': return Str(value.upper()) if __name__ == "__main__": toJSONFilter(caps) ''' python_source = textwrap.dedent(python_source) python_source.format(sys.executable) with closed_tempfile(".py", python_source) as tempfile: > output = pypandoc.convert_text( markdown_source, to='html', format='md', outputfile=None, filters=tempfile ).strip() tests.py:353: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ pypandoc/__init__.py:91: in convert_text return _convert_input(source, format, 'string', to, extra_args=extra_args, _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ source = b'**Here comes the content.**', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None, filters = ['/tmp/tmpcwer2aku.py'], verify_format = True sandbox = False, cworkdir = None def _convert_input(source, format, input_type, to, extra_args=(), outputfile=None, filters=None, verify_format=True, sandbox=False, cworkdir=None): _check_log_handler() logger.debug("Ensuring pandoc path...") _ensure_pandoc_path() if verify_format: logger.debug("Verifying format...") format, to = _validate_formats(format, to, outputfile) else: format = normalize_format(format) to = normalize_format(to) logger.debug("Identifying input type...") string_input = input_type == 'string' if not string_input: if isinstance(source, str): input_file = [source] else: input_file = source else: input_file = [] input_file = sorted(input_file) args = [__pandoc_path, '--from=' + format] args.append('--to=' + to) args += input_file if outputfile: args.append("--output=" + str(outputfile)) if sandbox: if 
ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                logger.debug("Adding sandbox argument...")
                args.append("--sandbox")
            else:
                logger.warning("Sandbox argument was used, but pandoc version is too low. Ignoring argument.")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        logger.debug("Running pandoc...")
        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpcwer2aku.py:
E           Could not find executable python

pypandoc/__init__.py:426: RuntimeError
----------------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------------------
/home/tkloczko
===================================================================================== warnings summary ======================================================================================
pypandoc/pandoc_download.py:61
  /home/tkloczko/rpmbuild/BUILD/pypandoc-1.11/pypandoc/pandoc_download.py:61: DeprecationWarning: invalid escape sequence \.
    regex = re.compile(r"/jgm/pandoc/releases/download/.*(?:"+processor_architecture+"|x86|mac).*\.(?:msi|deb|pkg)")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================== short test summary info ==================================================================================
FAILED tests.py::TestPypandoc::test_conversion_with_data_files - RuntimeError: Pandoc died with exitcode "97" during conversion: Could not find data file data/data/docx/[Content_Types].xml
FAILED tests.py::TestPypandoc::test_conversion_with_mixed_filters - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpbl813ywg.py:
FAILED tests.py::TestPypandoc::test_conversion_with_python_filter - RuntimeError: Pandoc died with exitcode "83" during conversion: Error running filter /tmp/tmpcwer2aku.py:
=================================================================== 3 failed, 37 passed, 1 deselected, 1 warning in 6.46s ===================================================================
```
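
A side note on the DeprecationWarning in that summary: only the first string literal in the `re.compile` call carries the `r` prefix, so the `\.` in the second literal is parsed as an ordinary (and invalid) string escape. A minimal sketch of a possible fix follows. The value given to `processor_architecture` and the sample path are placeholders chosen for illustration, not values taken from pypandoc.

```python
import re

# Placeholder value (assumption); the real one is defined elsewhere in pandoc_download.py.
processor_architecture = "amd64"

# Both literals are raw strings now, so "\." reaches the regex engine unchanged
# and Python no longer warns about an invalid escape sequence.
regex = re.compile(
    r"/jgm/pandoc/releases/download/.*(?:" + processor_architecture
    + r"|x86|mac).*\.(?:msi|deb|pkg)"
)

# Hypothetical download path, used only to exercise the pattern.
sample = "/jgm/pandoc/releases/download/3.1/pandoc-3.1-1-amd64.deb"
print(bool(regex.search(sample)))
```

The pattern itself is unchanged; only the way the second literal is written differs.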
JessicaTegner commented 1 year ago

@kloczek Regarding the last two failures:

        python_source = '''\
        #!{0}

        """
        Pandoc filter to convert all regular text to uppercase.
        Code, link URLs, etc. are not affected.
        """

        from pandocfilters import toJSONFilter, Str

        def caps(key, value, format, meta):
            if key == 'Str':
                return Str(value.upper())

        if __name__ == "__main__":
            toJSONFilter(caps)
        '''
        python_source = textwrap.dedent(python_source)
        # str.format returns a new string, so the result has to be assigned back
        python_source = python_source.format(sys.executable)

We set the shebang line from "sys.executable", so the only reason it can't run the Python filters would be that "sys.executable" is either incorrect or still points to plain python (somehow).

Can you try running something like the following, to check what the sys.executable is set to when running our tests?

import sys

print(sys.executable)
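
A slightly fuller check, as a sketch: since the error says "Could not find executable python", it may also help to see whether a bare `python` resolves on PATH at all. The helper name `interpreter_report` is made up for this example.

```python
import shutil
import sys


def interpreter_report():
    """Collect where the running interpreter and the bare command names resolve."""
    return {
        "sys.executable": sys.executable,
        # shutil.which returns None when the command is not on PATH.
        "python on PATH": shutil.which("python"),
        "python3 on PATH": shutil.which("python3"),
    }


if __name__ == "__main__":
    for name, value in interpreter_report().items():
        print(f"{name}: {value}")
```

If "python on PATH" comes back as None while "sys.executable" is /usr/bin/python3, that would fit a system that only ships python3.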

For the first one, about the data files: that's an error in the test case, where "sandbox" is explicitly set to True even though the default is now False. That should be easy enough to fix by omitting the sandbox parameter altogether.
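
To illustrate why dropping the argument is enough: in the traceback, `_convert_input` only appends `--sandbox` when `sandbox` is truthy. The sketch below is a simplified, hypothetical stand-in for that logic. `build_pandoc_args` and its `pandoc_version` parameter are not part of pypandoc and exist only for this example. If I read the pandoc manual right, writers that need data files (docx among them) can fail under `--sandbox` unless the pandoc binary was built with embedded data files, which would match the "Could not find data file" error.

```python
def build_pandoc_args(pandoc_path, format, to, sandbox=False, pandoc_version=(2, 15)):
    """Hypothetical, simplified version of the argument list built in _convert_input."""
    args = [pandoc_path, "--from=" + format, "--to=" + to]
    if sandbox:
        # --sandbox exists only in pandoc 2.15 and later.
        if pandoc_version >= (2, 15):
            args.append("--sandbox")
    return args


# Leaving sandbox out keeps the default (False), so the flag is never passed.
print(build_pandoc_args("pandoc", "html", "docx"))
# Passing sandbox=True is what adds the flag in the failing test.
print(build_pandoc_args("pandoc", "html", "docx", sandbox=True))
```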

kloczek commented 1 year ago
[tkloczko@pers-jacek SPECS]$ python3
Python 3.8.16 (default, Jan 30 2023, 13:00:00)
[GCC 13.0.1 20230127 (Red Hat 13.0.1-0)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print(sys.executable)
/usr/bin/python3
>>>
JessicaTegner commented 1 year ago

@kloczek can you test master now? After the work in PR #328, the Python ones should be fixed.

jayvdb commented 1 year ago

This issue should be able to be closed now. ping @kloczek

kloczek commented 4 months ago

Hmm .. just retested 1.13 and pytest still fails in 3 units 🤔

Here is pytest output:

```console
+ PYTHONPATH=/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.13-4.fc37.x86_64/usr/lib64/python3.10/site-packages:/home/tkloczko/rpmbuild/BUILDROOT/python-pypandoc-1.13-4.fc37.x86_64/usr/lib/python3.10/site-packages
+ /usr/bin/pytest -ra -m 'not network' tests.py --deselect tests.py::TestPypandoc::test_pdf_conversion
==================================================================================== test session starts ====================================================================================
platform linux -- Python 3.10.14, pytest-8.1.1, pluggy-1.4.0
rootdir: /home/tkloczko/rpmbuild/BUILD/pypandoc-1.13
configfile: pyproject.toml
collected 41 items / 1 deselected / 40 selected

tests.py .......................F...FF...........                                                                                                                                     [100%]

========================================================================================= FAILURES ==========================================================================================
_______________________________________________________________________ TestPypandoc.test_conversion_with_data_files ________________________________________________________________________

self =

    def test_conversion_with_data_files(self):
        # remove our test.docx file from our test_data dir if it already exosts
        test_data_dir = os.path.join(os.path.dirname(__file__), 'test_data')
        test_docx_file = os.path.join(test_data_dir, 'test.docx')
        if os.path.exists(test_docx_file):
            os.remove(test_docx_file)
>       result = pypandoc.convert_file(
            os.path.join(test_data_dir, 'index.html'),
            to='docx',
            format='html',
            outputfile=test_docx_file,
            sandbox=True,
        )

tests.py:240:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:200: in convert_file
    return _convert_input(discovered_source_files, format, 'path', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = '/home/tkloczko/rpmbuild/BUILD/pypandoc-1.13/test_data/index.html', format = 'html', input_type = 'path', to = 'docx', extra_args = ()
outputfile = '/home/tkloczko/rpmbuild/BUILD/pypandoc-1.13/test_data/test.docx', filters = None, verify_format = True, sandbox = True
cworkdir = '/home/tkloczko/rpmbuild/BUILD/pypandoc-1.13'

    def _convert_input(source, format, input_type, to, extra_args=(), outputfile=None, filters=None, verify_format=True, sandbox=False, cworkdir=None):
        _check_log_handler()
        logger.debug("Ensuring pandoc path...")
        _ensure_pandoc_path()

        if verify_format:
            logger.debug("Verifying format...")
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        logger.debug("Identifying input type...")
        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        input_file = sorted(input_file)
        args = [__pandoc_path, '--from=' + format]
        args.append('--to=' + to)
        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                logger.debug("Adding sandbox argument...")
                args.append("--sandbox")
            else:
                logger.warning("Sandbox argument was used, but pandoc version is too low. Ignoring argument.")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        logger.debug("Running pandoc...")
        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            if not (to in ["odt", "docx", "epub", "epub3", "pdf"] and outputfile == "-"):
                stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "97" during conversion: Could not find data file data/data/docx/[Content_Types].xml

pypandoc/__init__.py:467: RuntimeError
----------------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------------------
/home/tkloczko
______________________________________________________________________ TestPypandoc.test_conversion_with_mixed_filters ______________________________________________________________________

self =

    def test_conversion_with_mixed_filters(self):
        markdown_source = "-0-"

        lua = """\
        function Para(elem)
            return pandoc.Para(elem.content .. {{"{0}-"}})
        end
        """
        lua = textwrap.dedent(lua)

        python = """\
        #!{0}

        from pandocfilters import toJSONFilter, Para, Str

        def func(key, value, format, meta):
            if key == "Para":
                return Para(value + [Str("{{0}}-")])

        if __name__ == "__main__":
            toJSONFilter(func)

        """
        python = textwrap.dedent(python)
        python = python.format(sys.executable)

        with closed_tempfile(".lua", lua.format(1)) as temp1, closed_tempfile(".py", python.format(2)) as temp2:
            os.chmod(temp2, 0o755)
            with closed_tempfile(".lua", lua.format(3)) as temp3, closed_tempfile(".py", python.format(4)) as temp4:
                os.chmod(temp4, 0o755)
>               output = pypandoc.convert_text(
                    markdown_source, to="html", format="md", outputfile=None, filters=[temp1, temp2, temp3, temp4]
                ).strip()

tests.py:408:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:92: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'-0-', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None
filters = ['/tmp/tmpjpqbkha4.lua', '/tmp/tmpslzo_5p7.py', '/tmp/tmpv3iwkln3.lua', '/tmp/tmppf4ooinv.py'], verify_format = True, sandbox = False, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(), outputfile=None, filters=None, verify_format=True, sandbox=False, cworkdir=None):
        _check_log_handler()
        logger.debug("Ensuring pandoc path...")
        _ensure_pandoc_path()

        if verify_format:
            logger.debug("Verifying format...")
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        logger.debug("Identifying input type...")
        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        input_file = sorted(input_file)
        args = [__pandoc_path, '--from=' + format]
        args.append('--to=' + to)
        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                logger.debug("Adding sandbox argument...")
                args.append("--sandbox")
            else:
                logger.warning("Sandbox argument was used, but pandoc version is too low. Ignoring argument.")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        logger.debug("Running pandoc...")
        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            if not (to in ["odt", "docx", "epub", "epub3", "pdf"] and outputfile == "-"):
                stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Traceback (most recent call last):
E             File "/tmp/tmpslzo_5p7.py", line 3, in <module>
E               from pandocfilters import toJSONFilter, Para, Str
E           ModuleNotFoundError: No module named 'pandocfilters'
E           Error running filter /tmp/tmpslzo_5p7.py:
E           Filter returned error status 1

pypandoc/__init__.py:467: RuntimeError
----------------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------------------
/home/tkloczko
______________________________________________________________________ TestPypandoc.test_conversion_with_python_filter ______________________________________________________________________

self =

    def test_conversion_with_python_filter(self):
        markdown_source = "**Here comes the content.**"
        python_source = '''\
        #!{0}

        """
        Pandoc filter to convert all regular text to uppercase.
        Code, link URLs, etc. are not affected.
        """

        from pandocfilters import toJSONFilter, Str

        def caps(key, value, format, meta):
            if key == 'Str':
                return Str(value.upper())

        if __name__ == "__main__":
            toJSONFilter(caps)
        '''
        python_source = textwrap.dedent(python_source)
        python_source = python_source.format(sys.executable)

        with closed_tempfile(".py", python_source) as tempfile:
            os.chmod(tempfile, 0o755)
>           output = pypandoc.convert_text(
                markdown_source, to='html', format='md', outputfile=None, filters=tempfile
            ).strip()

tests.py:354:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pypandoc/__init__.py:92: in convert_text
    return _convert_input(source, format, 'string', to, extra_args=extra_args,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = b'**Here comes the content.**', format = 'markdown', input_type = 'string', to = 'html', extra_args = (), outputfile = None, filters = ['/tmp/tmp3f9k2vwi.py'], verify_format = True
sandbox = False, cworkdir = None

    def _convert_input(source, format, input_type, to, extra_args=(), outputfile=None, filters=None, verify_format=True, sandbox=False, cworkdir=None):
        _check_log_handler()
        logger.debug("Ensuring pandoc path...")
        _ensure_pandoc_path()

        if verify_format:
            logger.debug("Verifying format...")
            format, to = _validate_formats(format, to, outputfile)
        else:
            format = normalize_format(format)
            to = normalize_format(to)

        logger.debug("Identifying input type...")
        string_input = input_type == 'string'
        if not string_input:
            if isinstance(source, str):
                input_file = [source]
            else:
                input_file = source
        else:
            input_file = []

        input_file = sorted(input_file)
        args = [__pandoc_path, '--from=' + format]
        args.append('--to=' + to)
        args += input_file

        if outputfile:
            args.append("--output=" + str(outputfile))

        if sandbox:
            if ensure_pandoc_minimal_version(2,15): # sandbox was introduced in pandoc 2.15, so only add if we are using 2.15 or above.
                logger.debug("Adding sandbox argument...")
                args.append("--sandbox")
            else:
                logger.warning("Sandbox argument was used, but pandoc version is too low. Ignoring argument.")

        args.extend(extra_args)

        # adds the proper filter syntax for each item in the filters list
        if filters is not None:
            if isinstance(filters, string_types):
                filters = filters.split()
            f = ['--lua-filter=' + x if x.endswith(".lua") else '--filter=' + x for x in filters]
            args.extend(f)

        # To get access to pandoc-citeproc when we use a included copy of pandoc,
        # we need to add the pypandoc/files dir to the PATH
        new_env = os.environ.copy()
        files_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), "files")
        new_env["PATH"] = new_env.get("PATH", "") + os.pathsep + files_path
        creation_flag = 0x08000000 if sys.platform == "win32" else 0 # set creation flag to not open pandoc in new console on windows

        old_wd = os.getcwd()
        if cworkdir and old_wd != cworkdir:
            os.chdir(cworkdir)

        logger.debug("Running pandoc...")
        p = subprocess.Popen(
            args,
            stdin=subprocess.PIPE if string_input else None,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            env=new_env,
            creationflags=creation_flag)

        if cworkdir is not None:
            os.chdir(old_wd)

        # something else than 'None' indicates that the process already terminated
        if not (p.returncode is None):
            raise RuntimeError(
                'Pandoc died with exitcode "%s" before receiving input: %s' % (p.returncode, p.stderr.read())
            )

        if string_input:
            try:
                source = cast_bytes(source, encoding='utf-8')
            except (UnicodeDecodeError, UnicodeEncodeError):
                # assume that it is already a utf-8 encoded string
                pass
        try:
            stdout, stderr = p.communicate(source if string_input else None)
        except OSError:
            # this is happening only on Py2.6 when pandoc dies before reading all
            # the input. We treat that the same as when we exit with an error...
            raise RuntimeError('Pandoc died with exitcode "%s" during conversion.' % (p.returncode))

        try:
            if not (to in ["odt", "docx", "epub", "epub3", "pdf"] and outputfile == "-"):
                stdout = stdout.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        try:
            stderr = stderr.decode('utf-8')
        except UnicodeDecodeError:
            # this shouldn't happen: pandoc more or less guarantees that the output is utf-8!
            raise RuntimeError('Pandoc output was not utf-8.')

        # check that pandoc returned successfully
        if p.returncode != 0:
>           raise RuntimeError(
                'Pandoc died with exitcode "%s" during conversion: %s' % (p.returncode, stderr)
            )
E           RuntimeError: Pandoc died with exitcode "83" during conversion: Traceback (most recent call last):
E             File "/tmp/tmp3f9k2vwi.py", line 8, in <module>
E               from pandocfilters import toJSONFilter, Str
E           ModuleNotFoundError: No module named 'pandocfilters'
E           Error running filter /tmp/tmp3f9k2vwi.py:
E           Filter returned error status 1

pypandoc/__init__.py:467: RuntimeError
----------------------------------------------------------------------------------- Captured stdout call ------------------------------------------------------------------------------------
/home/tkloczko
===================================================================================== warnings summary ======================================================================================
pypandoc/pandoc_download.py:61
  /home/tkloczko/rpmbuild/BUILD/pypandoc-1.13/pypandoc/pandoc_download.py:61: DeprecationWarning: invalid escape sequence '\.'
    regex = re.compile(r"/jgm/pandoc/releases/download/.*(?:"+processor_architecture+"|x86|mac).*\.(?:msi|deb|pkg)")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================== short test summary info ==================================================================================
FAILED tests.py::TestPypandoc::test_conversion_with_data_files - RuntimeError: Pandoc died with exitcode "97" during conversion: Could not find data file data/data/docx/[Content_Types].xml
FAILED tests.py::TestPypandoc::test_conversion_with_mixed_filters - RuntimeError: Pandoc died with exitcode "83" during conversion: Traceback (most recent call last):
FAILED tests.py::TestPypandoc::test_conversion_with_python_filter - RuntimeError: Pandoc died with exitcode "83" during conversion: Traceback (most recent call last):
=================================================================== 3 failed, 37 passed, 1 deselected, 1 warning in 4.19s ===================================================================
```
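
Unlike the 1.11 run, the two filter failures here are a `ModuleNotFoundError` for `pandocfilters` rather than a missing `python` executable: the filters now start, but the build environment lacks that module. A minimal sketch, assuming the suite stays unittest-based, of how such tests could be skipped when the module is absent. `FilterTests` is a hypothetical stand-in for `TestPypandoc`, and the test body is only a placeholder.

```python
import importlib.util
import unittest

# True only when the optional pandocfilters package is importable by the
# interpreter running the tests (the same one written into the filter shebang).
HAS_PANDOCFILTERS = importlib.util.find_spec("pandocfilters") is not None


class FilterTests(unittest.TestCase):  # hypothetical stand-in for TestPypandoc
    @unittest.skipUnless(HAS_PANDOCFILTERS, "pandocfilters is not installed")
    def test_conversion_with_python_filter(self):
        # Placeholder body: the real test would run pypandoc with a Python filter.
        import pandocfilters

        self.assertTrue(hasattr(pandocfilters, "toJSONFilter"))
```

With this guard the run would report the filter tests as skipped, with the reason shown by `pytest -ra`, instead of failing. Declaring `pandocfilters` as a test dependency would be the alternative.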
JessicaTegner commented 4 months ago

@kloczek Can you try the following: