weisJ / jsvg

Java SVG renderer
MIT License
129 stars 7 forks source link

Feature: Resolve `<use href` from external file #82

Closed stanio closed 2 months ago

stanio commented 3 months ago

I have a collection of SVG images (mouse pointers) with groups sharing common shapes – see for example:

Pretty much all of these share the same base pointer shape from left_ptr.svg. I would like to assign an id to a shape in one of the files, and refer to it in others:

file1.svg:

  <path id="common-shape" ... />

file2.svg:

  <use href="file1.svg#common-shape" />

in order to reduce the maintenance cost of updating the shape in all relevant files.

I've seen it possible with Batik, and Inkscape happily opens such files for editing. It doesn't appear currently possible with JSVG, at least I don't know how to achieve it if possible. I've seen there is a ResourceLoader interface:

https://github.com/weisJ/jsvg/blob/6b303887c8e71b638ffbf998abe0d3727be5864d/jsvg/src/main/java/com/github/weisj/jsvg/parser/SVGLoader.java#L68-L70

https://github.com/weisJ/jsvg/blob/6b303887c8e71b638ffbf998abe0d3727be5864d/jsvg/src/main/java/com/github/weisj/jsvg/parser/ResourceLoader.java#L32-L37

but it looks like for a different purpose.

Does it sound like a useful feature to enable resolving external resources via an explicit configuration?


Sample files:

weisJ commented 3 months ago

If this is added it should be definitely made opt-in through explicit configuration. Loading arbitrary files is a rather large security risk otherwise. Some thoughts on this:

If the files contain common elements then they should also be deduplicated during loading. This would require some sort of cache. However I do not want to assume that one wants this behaviour as it implicitly keeps memory around. The implementation will most likely involve a separate handler class to resolve such external resources. The API could then be made such that it is easy to wrap such a handler to make it work with a cache.

Resolving referenced files if the loaded file itself is on the filesystem is straight forward. Not sure how it would work if the files are embedded resources in a jar file. Might need some help from the user side in this case.

weisJ commented 3 months ago

I have implemented an experimental mechanism to enable loading fragments from external files:

LoaderContext loaderContext = LoaderContext.createDefault();
loaderContext.elementLoader().enableLoadingExternalElements(true);
// use loaderContext when loading the document with SVGLoader
weisJ commented 3 months ago

Note that this only works if the svg is loaded using methods of SVGLoader that accept an URL parameter. (Or alternatively use the version that allows passing in a root URI for resolving external documents. The LoaderContext cashes resolved documents and can be shared across multiple loads to avoid parsing documents multiple times.

stanio commented 3 months ago

I have taken a quick look at it and tried it. enableLoadingExternalElements is too coarse, in my opinion. For my use, I would need more control over the behavior like implementing same-origin policy or selectively allowing well-known public resources by serving them from pre-packaged local copies. I don't want to allow loading a local resource to trigger access to the network, for sure.

For my more specific use case, I need a way to feed JSVG some kind of "preloaded" DOM for these external resources as I need to modify them dynamically. I'm currently doing this for the main SVG:

  1. I load and parse an SVG into a org.w3c.Document once;
  2. I perform several XSL transformations and direct DOM manipulations; then
  3. I feed JSVG with the result DOM "directly" via StaxSVGLoader configured with a custom XMLInputFactory. At least I'm not serializing and parsing XML text back and forth.

I perform steps (2) and (3) multiple times for the same source loaded during (1). I need to be able to transform requested external resources similarly, so I was thinking if the LoaderContext could be configured with a user implementation of something like:

public interface DocumentResolver {
    SVGDocument resolve(String href, String base);
}

It would allow enough user control, and as far as I can tell it doesn't need to expose more than what's already public.

You may ignore that whole last part as I don't have better-defined thoughts, at this time. 😄 I just think to be minimally usable it shouldn't allow or should be able to configure a cross-origin access policy.

weisJ commented 3 months ago

The current implementation is actually only capable of loading documents relative to the URI of the current document.

I have pushed changes which allow for more fine grained user control of the resulting URI of external documents:

The current (pre-current commit) behaviour is equivalent to

LoaderContext loaderContext = LoaderContext.builder()
                            .elementLoader(ElementLoader.create(ExternalDocumentPolicy.ALLOW_RELATIVE))
                            .build();

You can specify custom resolution behaviour with a custom ExternalDocumentPolicy:

interface ExternalDocumentPolicy {
    @Nullable
    URI resolveDocumentURI(@NotNull URI baseDocumentUri, @NotNull String documentPath);
}
stanio commented 3 months ago
LoaderContext loaderContext = LoaderContext.builder()
                            .elementLoader(ElementLoader.create(ExternalDocumentPolicy.ALLOW_RELATIVE))
                            .build();

You can specify custom resolution behaviour with a custom ExternalDocumentPolicy

I think this will be just enough to support this feature. My use case for transforming the referenced document is a separate one, and it would need some pre-processing hook as I've seen possible for the main document (usage examples).

weisJ commented 3 months ago

The referenced document could use the same processor as the main document (the one specified in the LoaderContext).

For this ParserProvider#createPreProcessor should probably take the current document URI as an extra argument to distinguish what processor to use (i.e. main document and referenced document).

stanio commented 3 months ago

The referenced document could use the same processor as the main document...

Yup, I have mentioned it as I have tried it and it hasn't been the case.

weisJ commented 3 months ago

Latest changes add pre-processing of external documents.

stanio commented 3 months ago

I've tried out the registered ParserProvider.createPreProcessor() is now invoked for external <use href> documents, using the latest 1.5.1-SNAPSHOT. I'm not using JSVG DOM manipulation for the main/existing use case, or this new one in a real-world application, yet, but it is good it is available. Others may come up with more specific suggestions to possibly improve it further.

weisJ commented 3 months ago

Remaining thing to implement: Check what circular dependencies are handled and detect them. There should probably also be a limit to the number of indirections for external documents (i.e. the depth of external elements referencing external elements).

weisJ commented 2 months ago

Now available in 1.6.0