DCsunset / pandoc-include

A pandoc filter to allow file and header inclusion
MIT License
67 stars 14 forks source link

`rewrite-path` and `include-entry` are not working together properly #54

Closed FynnFreyer closed 3 months ago

FynnFreyer commented 4 months ago

Expected

When generating a PDF from a file in a subdirectory, that has include-entry set, and is including a file with an image, that is referenced relative to the included file, the produced PDF includes the image.

At least that's expected, if we have rewrite-path set to true, which is the default.

Actual

The image is not included.

Reproduce

Install pandoc-include into an otherwise empty venv in version 1.3.2.

Create a directory structure like this:

.
├── dir
│   ├── img.md
│   ├── img.svg
│   └── test_a.md
└── test_b.md

Contents of img.md

# Img

File with image.

![img](img.svg)

Path to img.svg is relative to the directory this file resides in, which should be okay with rewrite-path: true.

Contents of test_a.md

---
include-entry: dir
rewrite-path: true
---

# test

!include img.md

The include-entry is set relative to the working directory, rewrite-path is set (already the default, but we make it explicit).

Contents of test_b.md

---
rewrite-path: true
---

# test

!include dir/img.md

This file resides in the working directory.

Results

We run

pandoc --filter=pandoc-include dir/test_a.md -o test_a.pdf
pandoc --filter=pandoc-include test_b.md -o test_b.pdf

The test_b.pdf file contains the picture, while test_a.pdf does not.

FynnFreyer commented 4 months ago

I sent PR #55 with a minor change.

Instead of rewriting the image URL relative to options["current-path"], it just puts in the absolute one. Since we're changing directories to the included file, this should work. It's kind of blunt, but shouldn't mess with anything else.

It solves the situation described in this issue, and I also tried with the following structure and that works as well.

├── dir
│   ├── img.svg
│   └── test_a.md
└── img.md

And, to be quite honest, I can't be bothered to add proper test cases etc., so sry about that.

DCsunset commented 4 months ago

That's a good catch! Your solution makes sense when the output is in a format that could embed the image.

However, in some cases, users may want to keep the relative path to make the output file more portable. For example, a user may want to compile a bunch of markdown files into one but keep the relative resource link so they can publish to another place.

I think we can rewrite the path to the absolute path by default, but at the same time we should provide an option to allow relative path rewrite as well.

@FynnFreyer Could you add an option like rewrite-path-relative: boolean which allows relative path rewrite? This option should default to false.

DCsunset commented 4 months ago

Oh I just realized that we don't need another option. We can allow rewrite-path to accept both str and bool.

FynnFreyer commented 4 months ago

in some cases, users may want to keep the relative path to make the output file more portable

That makes sense, but in that case, I'd suggest to just do it with relative paths again. One could compare current-path and process-path^[1] and based on the outcome do something with os.path.relpath for translating. That was my initial idea, but the quick hack required much less thinking, so I went with that instead :sweat_smile:

Unfortunately, I don't have the time to do this properly right now, since I'm in the middle of writing my thesis^[2], so I'll try to get back to this in about three weeks, but no promises that I won't forget.

[1]: or something similar, not sure what the exact key was [2]: which is why I'm using pandoc-include in the first place, so thanks a lot for building this! :wink:

FynnFreyer commented 4 months ago

Also, the following is pretty off-topic, but maybe I can pick your brain on this.

There's often lively discussion on how to do transclusion syntax in markdown,^[1] with a few popular ideas coming from iA Writer^[2], Obsidian^[3] and MultiMarkdown^[4].

Pandoc-include is doing !include`<args>` path/to/file, which is nice, because it allows for passing arguments to the directive, like you would in reST. But it is a bit verbose and doesn't offer graceful degradation in other contexts.

I'm using Obsidian, so personally I'd like to see the image syntax supported. Without having checked this too carefully, it seems like you'd only need to make changes to is_include_line and where it's called. Maybe one could implement support for multiple syntaxes by adding an is_include(elem) function, that calls is_include_line and similar functions for other supported syntaxes.

Do you see any glaring problems with this idea, and if not, would you in theory be open to a PR regarding this? If so, I might open an issue for that and work on it in the future.^[5]

[1]: This old discussion on the commonmark forum gives a decent overview, I think.
[2]: Just paste the URL path/to/file.md
[3]: Use the image syntax ![description](path/to/file.md)
[4]: Use f-string style double braces {{path/to/file.md}}
[5]: Don't hold your breath though.

DCsunset commented 4 months ago

It would be great to support other popular syntaxes to make it compatible with more applications.

I think your idea should work without problem. We just need to add an is_include_line function for each syntax that has the same return type.

However, Instead of hard-coding such functions directly in main.py, I think we should make is_include_line more pluggable using dynamic import in Python. We can support a few popular syntaxes in this repo by default and, at same time, allow users to implement other syntaxes in external packages. To choose what syntax to include, we can provide an option like include-syntaxes. I'm open to PR on this. (This will also solve https://github.com/DCsunset/pandoc-include/issues/34)

DCsunset commented 4 months ago

I just implemented the fix in the latest commit which fixes the relative path rewrite. Feel free to test it when you have time.

FynnFreyer commented 3 months ago

Tested with 1195edc2, and it works in my project, thanks for the quick action

FynnFreyer commented 3 months ago

Damn, my bad, I didn't update my venv properly so tested with the wrong version.

This seems to persist with 1.3.3 unfortunately

DCsunset commented 3 months ago

I've added another fix in the master branch. I've tested the case locally and it works fine. Could you verify it on your side as well?

FynnFreyer commented 3 months ago

sure, but I'll have to hand in my thesis on monday, will check after that

FynnFreyer commented 3 months ago

Solved with 4c4c0b7edc6a0470f4b30f48b515e131eac380f3