typst / svg2pdf

Converts SVG files to PDF.
Apache License 2.0

Fixes #9 #19

Closed LaurenzV closed 1 year ago

LaurenzV commented 1 year ago

Hey, so I did some digging into #9, and I believe I found a fix for it. (I'm relatively new to Rust and everything related to PDFs, though, so I can't guarantee that the changes are free of side effects, but I hope so. 😅)

Essentially, I fixed three different problems. The first one is not directly related to the example mentioned in the issue, but I noticed it while experimenting with it.

Transformation for clip paths

First, I noticed that transformations on clip path objects are sometimes completely ignored. Consider this SVG file, for example:

<svg height="500" width="500" xmlns="http://www.w3.org/2000/svg">
    <rect width="200" height="200" />
    <clipPath id="a" transform="translate(100, 100)">
        <path d="m 0 0 h 100 v 100 h -100 z"/>
    </clipPath>
    <rect width="200" height="200" clip-path="url(#a)" fill="green" />
</svg>

Which is supposed to look like this:

image

But it currently is rendered like this: image

The green rectangle should be shown in the bottom right corner of the black one because the clip path is translated by (100, 100).

To understand why this is the case, I created a small function that prints out all of the nodes of the tree including their level. For this example, the output looks like this (I cut out the irrelevant parts):

...
Level 2
ClipPath(
    ClipPath {
        id: "a",
        units: UserSpaceOnUse,
        transform: Transform {
            a: 1.0,
            b: 0.0,
            c: 0.0,
            d: 1.0,
            e: 100.0,
            f: 100.0,
        },
        clip_path: None,
    },
)

Level 3
Path(
    Path {
        id: "",
        transform: Transform {
            a: 1.0,
            b: 0.0,
            c: 0.0,
            d: 1.0,
            e: 0.0,
            f: 0.0,
        },
        ...
    },
)
...

So as you can see, the transform is stored in the ClipPath object instead of the Path itself (which makes sense), but this is a "problem" because in the code that creates the clip paths:

fn apply_clip_path(path_id: Option<&String>, content: &mut Content, ctx: &mut Context) {
    if let Some(clip_path) = path_id.and_then(|id| ctx.tree.defs_by_id(id)) {
        if let NodeKind::ClipPath(ref path) = *clip_path.borrow() {
            apply_clip_path(path.clip_path.as_ref(), content, ctx);

            for child in clip_path.children() {
                match *child.borrow() {
                    NodeKind::Path(ref path) => {
                        draw_path(&path.data.0, path.transform, content, &ctx.c);
                        content.clip_nonzero();
                        content.end_path();
                    }
                    NodeKind::ClipPath(_) => {}
                    _ => unreachable!(),
                }
            }
        } else {
            unreachable!();
        }
    }
}

The path variable in line 3 contains the data for the clip path (including its transformation), but its transform is never used! Instead, we just iterate through all of the children and draw them with their own transformations, ignoring the transformation of the clip path itself. I copied the approach from how Group handles this:

fn apply_clip_path(path_id: Option<&String>, content: &mut Content, ctx: &mut Context) {
    if let Some(clip_path) = path_id.and_then(|id| ctx.tree.defs_by_id(id)) {
        if let NodeKind::ClipPath(ref path) = *clip_path.borrow() {
            apply_clip_path(path.clip_path.as_ref(), content, ctx);

            let old = ctx.c.transform([
                path.transform.a,
                path.transform.b,
                path.transform.c,
                path.transform.d,
                path.transform.e,
                path.transform.f,
            ]);

            for child in clip_path.children() {
                match *child.borrow() {
                    NodeKind::Path(ref path) => {
                        draw_path(&path.data.0, path.transform, content, &ctx.c);
                        content.clip_nonzero();
                        content.end_path();
                    }
                    NodeKind::ClipPath(_) => {}
                    _ => unreachable!(),
                }
            }

            ctx.c.transform(old);
        } else {
            unreachable!();
        }
    }
}

This seems to fix the problem for the green square. I also checked the other test SVGs and they look the same, so this change shouldn't break anything else. However, it doesn't actually solve the problem with group transformations outlined in the issue.

Group transformation in the wrong order

Consider this example next (adapted from the clip path example above):

<svg height="400" viewBox="0 0 400 400" width="400" xmlns="http://www.w3.org/2000/svg">
    <clipPath id="a">
        <path d="m 0 0 h 200 v 200 h -200 z"/>
    </clipPath>
    <g clip-path="url(#a)" transform="translate(40 40)">
        <path d="m 0 0 200 200" stroke="#000" stroke-width="3"/>
    </g>
    <path d="m 40 40 h 200 v 200 h -200 z" fill="none" stroke="#f00"/>
</svg>

How it is supposed to look:

image

How it actually looks: image

As you can see, the transform is simply not applied to the line. And the reason for this is in the render method for Group objects:

...
ctx.push();

let group_ref = ctx.alloc_ref();
let child_content = content_stream(&node, writer, ctx);

let bbox = node
    .calculate_bbox()
    .and_then(|b| b.to_rect())
    .unwrap_or_else(|| usvg::Rect::new(0.0, 0.0, 1.0, 1.0).unwrap());

let pdf_bbox = ctx.c.pdf_rect(bbox);
let old = ctx.c.transform([
    self.transform.a,
    self.transform.b,
    self.transform.c,
    self.transform.d,
    self.transform.e,
    self.transform.f,
]);
...

As you can see, we first render the child contents and only then apply the transformation to the context, meaning that the children are no longer affected by it. I moved child_content after old, and this seemed to fix it. However, this only fixes the case where the x and y translations are equal, so if we try this:

<svg height="400" viewBox="0 0 400 400" width="400" xmlns="http://www.w3.org/2000/svg">
    <clipPath id="a">
        <path d="m 0 0 h 200 v 200 h -200 z"/>
    </clipPath>
    <g clip-path="url(#a)" transform="translate(0 40)">
        <path d="m 0 0 200 200" stroke="#000" stroke-width="3"/>
    </g>
    <path d="m 0 40 h 200 v 200 h -200 z" fill="none" stroke="#f00"/>
</svg>

We still get a wrong result:

image

And the reason this is the case becomes clear when we look at the tree again:

Level 1
Group(
    Group {
        id: "",
        transform: Transform {
            a: 1.0,
            b: 0.0,
            c: 0.0,
            d: 1.0,
            e: 0.0,
            f: 40.0,
        },
        opacity: NormalizedValue(
            1.0,
        ),
        clip_path: Some(
            "a",
        ),
        mask: None,
        filter: [],
        filter_fill: None,
        filter_stroke: None,
        enable_background: None,
    },
)

As you can see, the transformation is stored correctly in the Group object. The problem is that, because of the fix for the first problem, apply_clip_path now overwrites the transformation from the group, since the transform() method currently replaces the stored transformation. I might be wrong about this, but I think it should chain the transformations together instead of replacing them. This is why I rewrote the CoordToPdf implementation to use a Transform object internally instead of a matrix. I'm not sure whether there was a particular reason for storing it as an array of length 6 rather than a Transform object, so let me know if this should be changed. In any case, this change fixes the output for me, and the other test files seem to render just like before (unfortunately, it still doesn't fix #17, but I haven't had the chance to look into that one yet).

As I mentioned, I'm relatively new to all of this and this is my first PR, so I'd appreciate it if someone could look over it to make sure the changes are fine and don't have side effects that break anything else. 😅 Also, I didn't quite understand what the best way to test all of this is. Do you currently just run the tests and then look over the generated PDFs manually to make sure everything looks okay?
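To illustrate the difference between replacing and chaining, here is a minimal, self-contained sketch. The `Affine` and `Ctx` types and the `replace`/`chain` methods are hypothetical stand-ins made up for this example; they are not the actual CoordToPdf or usvg API. The sketch only assumes the usual `[a b c d e f]` layout for affine transforms.

```rust
/// A 2D affine transform in the `[a b c d e f]` layout (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Affine {
    a: f64,
    b: f64,
    c: f64,
    d: f64,
    e: f64,
    f: f64,
}

impl Affine {
    const IDENTITY: Affine = Affine { a: 1.0, b: 0.0, c: 0.0, d: 1.0, e: 0.0, f: 0.0 };

    fn translate(tx: f64, ty: f64) -> Affine {
        Affine { e: tx, f: ty, ..Affine::IDENTITY }
    }

    /// Returns the transform that applies `inner` first and `self` afterwards.
    fn pre_concat(self, inner: Affine) -> Affine {
        Affine {
            a: self.a * inner.a + self.c * inner.b,
            b: self.b * inner.a + self.d * inner.b,
            c: self.a * inner.c + self.c * inner.d,
            d: self.b * inner.c + self.d * inner.d,
            e: self.a * inner.e + self.c * inner.f + self.e,
            f: self.b * inner.e + self.d * inner.f + self.f,
        }
    }

    /// Maps the point `(x, y)` through this transform.
    fn apply(self, x: f64, y: f64) -> (f64, f64) {
        (self.a * x + self.c * y + self.e, self.b * x + self.d * y + self.f)
    }
}

/// Hypothetical rendering context that tracks the current transform.
struct Ctx {
    current: Affine,
}

impl Ctx {
    /// Replaces the current transform and returns the old one.
    fn replace(&mut self, t: Affine) -> Affine {
        std::mem::replace(&mut self.current, t)
    }

    /// Chains `t` onto the current transform and returns the old one.
    fn chain(&mut self, t: Affine) -> Affine {
        let old = self.current;
        self.current = old.pre_concat(t);
        old
    }
}

/// Applies a group transform of translate(0, 40) followed by an identity
/// clip path transform, then maps the origin through the result.
fn origin_after_group_and_clip(chain: bool) -> (f64, f64) {
    let group = Affine::translate(0.0, 40.0);
    let clip = Affine::IDENTITY;
    let mut ctx = Ctx { current: Affine::IDENTITY };
    if chain {
        ctx.chain(group);
        ctx.chain(clip);
    } else {
        ctx.replace(group);
        ctx.replace(clip);
    }
    ctx.current.apply(0.0, 0.0)
}
```

In this sketch, `origin_after_group_and_clip(false)` leaves the origin at (0, 0) because the identity clip transform overwrites the group's translation, while `origin_after_group_and_clip(true)` maps it to (0, 40).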

reknih commented 1 year ago

Hey!

Thank you for your PR, it looks spectacular, especially your write-up! Bug number 1 was an oversight, you went for the simplest fix, great! This should not have any unintended side effects. Similarly, the ordering of transform application and child rendering was one of my fumbles and looks a lot nicer now.

We did not use usvg's transform because we did not want to depend on their Transform type remaining stable across crate updates. However, their Transform type had the necessary impls and ours didn't, so it was a good choice to just use something that works. I have looked at the type, and in memory it's also really just 6 f64s, just in a nicer package.

Overall, I think that svg2pdf's way of handling transforms is too impromptu and we are missing the right abstraction. It is too easy to forget them or to apply them in the wrong order. I wrote this crate in a bit of a rush 🤕. Similarly, we should probably have reference images to compare the rendered PDFs against, but at the moment, changes have to be tested by manually inspecting the PDFs.
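As a rough idea of what a reference-image check could look like, here is a minimal sketch. It assumes the rendered PDF and the reference have already been rasterized into RGBA byte buffers of the same size; how that rasterization happens is left open, and `count_mismatched_pixels` is a hypothetical helper, not something that exists in svg2pdf.

```rust
/// Counts the RGBA pixels in which any channel differs by more than
/// `tolerance`. Returns `None` if the buffers have different lengths
/// or are not a whole number of 4-byte pixels.
fn count_mismatched_pixels(rendered: &[u8], reference: &[u8], tolerance: u8) -> Option<usize> {
    if rendered.len() != reference.len() || rendered.len() % 4 != 0 {
        return None;
    }
    let mismatched = rendered
        .chunks_exact(4)
        .zip(reference.chunks_exact(4))
        .filter(|(a, b)| a.iter().zip(b.iter()).any(|(x, y)| x.abs_diff(*y) > tolerance))
        .count();
    Some(mismatched)
}
```

A test could then fail whenever the returned count is above some small threshold, with the tolerance absorbing minor anti-aliasing differences.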

Your code to print out the usvg tree looks really useful! If you want to, you could add that as a utility function for debugging in another PR!
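The helper itself is not shown in this thread, so the following is only a sketch of what such a level-by-level dump could look like. `Kind`, `Node`, and `dump_tree` are hypothetical stand-ins for illustration and deliberately do not use usvg's node types.

```rust
/// Hypothetical node kinds, standing in for the real tree's node kinds.
#[derive(Debug)]
enum Kind {
    Group,
    ClipPath { id: String },
    Path,
}

/// Hypothetical tree node: a kind plus its children.
struct Node {
    kind: Kind,
    children: Vec<Node>,
}

/// Appends a "Level N" header and the pretty-printed kind of `node`
/// to `out`, then recurses into the children with `level + 1`.
fn dump_tree(node: &Node, level: usize, out: &mut String) {
    use std::fmt::Write;
    writeln!(out, "Level {level}").unwrap();
    writeln!(out, "{:#?}\n", node.kind).unwrap();
    for child in &node.children {
        dump_tree(child, level + 1, out);
    }
}
```

Calling `dump_tree(&root, 0, &mut text)` fills `text` with one block per node in depth-first order, which can then be printed or written to a log.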

As it stands, this PR LGTM, thank you!