meerk40t / svgelements

SVG Parsing for Elements, Paths, and other SVG Objects.
MIT License

Presence of clone of a reference element creates a strange ghost element when parsing SVG file #156

Closed baender closed 1 year ago

baender commented 2 years ago

I am interested in the bounding boxes of various clones of an element. While using svgelements with SVGs created in Inkscape, I ran into an issue.

How to reproduce

Save the XML below as an .svg file and run the Python code.

What happens

When creating a clone of an element, the .get_element_by_id() method applies the clone transform to the original element, depending on the order of the two elements. If one manually moves the rect element after the use element, the method works correctly and the clone transform is not applied to the original element. However, when creating a clone, the use element is always added after the rect element and therefore forces this issue to occur.

What one expects to happen

When using the method .get_element_by_id() it should not matter whether the element has a clone or not. The clone transform should not be applied at all, no matter the relative order of the two elements.

What might be happening

The xlink:href attribute of the use element holds the original element's ID and might be parsed at some point. The transform might be mistakenly stored with the original element. Changing the order of the two elements might overwrite the transform string of the original element.

System: Python 3.6.8, svgelements 1.6.5

import svgelements

svgFileName = "path/to/svg"  # replace with the path to the saved .svg file

layout = svgelements.SVG.parse(
    source = svgFileName, 
    reify = True,
    ppi = svgelements.svgelements.DEFAULT_PPI,
    width = 1,
    height = 1,
    color = "black",
    transform = None,
    context = None
)

template = layout.get_element_by_id("rect2728")
template.bbox()

# Result is incorrect; happens when `rect` comes BEFORE `use`
# (40.00001203849226, 119.99999958004639, 1879.9999728433, 199.99999426070744)

# Result is correct; happens when `rect` comes AFTER `use`
# (40.00001203849226, -159.9999893613221, 1879.9999728433, -79.99999468066105)

SVG File

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->

<svg
   width="1920"
   height="1080"
   viewBox="0 0 507.99999 285.75001"
   version="1.1"
   id="svg10274"
   inkscape:version="1.1.1 (3bf5ae0d25, 2021-09-20)"
   sodipodi:docname="test.svg"
   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
   xmlns:xlink="http://www.w3.org/1999/xlink"
   xmlns="http://www.w3.org/2000/svg"
   xmlns:svg="http://www.w3.org/2000/svg">
  <sodipodi:namedview
     id="namedview10276"
     pagecolor="#ffffff"
     bordercolor="#666666"
     borderopacity="1.0"
     inkscape:pageshadow="2"
     inkscape:pageopacity="0.0"
     inkscape:pagecheckerboard="0"
     inkscape:document-units="mm"
     showgrid="true"
     inkscape:zoom="0.36168631"
     inkscape:cx="836.3601"
     inkscape:cy="504.58089"
     inkscape:window-width="1920"
     inkscape:window-height="1001"
     inkscape:window-x="-9"
     inkscape:window-y="-9"
     inkscape:window-maximized="1"
     inkscape:current-layer="layer1"
     units="px">
    <inkscape:grid
       type="xygrid"
       id="grid1868" />
  </sodipodi:namedview>
  <defs
     id="defs10271" />
  <g
     inkscape:label="Layer"
     inkscape:groupmode="layer"
     id="layer1">
    <rect
       style="fill:#ff0000;fill-rule:evenodd;stroke-width:0.2;stroke-linecap:round"
       id="rect2728"
       width="486.83334"
       height="21.166666"
       x="10.583323"
       y="-42.333332" />
    <use
       x="0"
       y="0"
       xlink:href="#rect2728"
       id="use2810"
       transform="translate(0,74.083333)"
       width="100%"
       height="100%" />
  </g>
</svg>
baender commented 2 years ago

To get the bounding box of a clone, I figured that applying the inverse transformation of the template element and then applying the clone's transformation is the way to go.

Doing so gives me the right result for the clone's bounding box, no matter what the order of the elements within the XML file is. However, the calculated bounding box of the original element is still influenced by the existence of a clone below the element.

So even though my results for the clone seem to be correct, the issue with the incorrect bounding box of the original persists.

import svgelements

svgFileName = "path/to/svg"  # replace with the path to the saved .svg file

layout = svgelements.SVG.parse(
    source = svgFileName, 
    reify = True,
    ppi = svgelements.svgelements.DEFAULT_PPI,
    width = 1,
    height = 1,
    color = "black",
    transform = None,
    context = None
)

template = layout.get_element_by_id("rect2728")
cloneElement = layout.get_element_by_id("use2810")

transformMatrixTemplate = svgelements.Matrix(template.values["transform"])
transformMatrixClone = svgelements.Matrix(cloneElement.values["transform"])

clone = template * transformMatrixTemplate.inverse() * transformMatrixClone
[round(value, 3) for value in template.bbox()]
[round(value, 3) for value in clone.bbox()]

# Result is incorrect; happens when `rect` comes BEFORE `use`
# [40.0, 120.0, 1880.0, 200.0]
# [40.0, 120.0, 1880.0, 200.0]

# Result is correct; happens when `rect` comes AFTER `use`
# [40.0, -160.0, 1880.0, -80.0]
# [40.0, 120.0, 1880.0, 200.0]
baender commented 2 years ago

Finally, I decided to do some systematic testing to see what actually happens. The Inkscape SVG file contains four layers (path, rect, group, text), each with an element of that type together with a clone.

import svgelements

svgFileName = "path/to/svg"  # replace with the path to the saved .svg file

layout = svgelements.SVG.parse(
    source = svgFileName, 
    reify = True,
    ppi = svgelements.svgelements.DEFAULT_PPI,
    width = 1,
    height = 1,
    color = "black",
    transform = None,
    context = None
)

[element.id for element in layout.elements()]

# IDs for each element
# ['svg20', 'namedview22', 'grid857', 'layer1', 'path', 'path_clone', 'path', 'layer2', 'rect', 'rect_clone', 'rect', 'layer3', 
# 'group', 'path6', 'path8', 'group_clone', 'group', 'path6', 'path8', 'layer4', None, None, 'text_clone', None, None, 
# 'metadata1091', None, None]

It seems that the xlink:href="#..." attribute copies the reference element into the clone element, where the id is then found again. The repeating patterns are a sign of that: path, path_clone, path; rect, rect_clone, rect; group, path6, path8, group_clone, group, path6, path8; and None, None, text_clone, None, None. (There also seems to be a problem when parsing Inkscape text elements; I would open another issue for that.)
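The repetition can be made explicit by counting the IDs. This is a minimal sketch that uses only the ID list reported above, copied as a literal so it runs without svgelements; the variable names are illustrative.

```python
from collections import Counter

# IDs exactly as reported by layout.elements() in the listing above.
ids = [
    "svg20", "namedview22", "grid857", "layer1", "path", "path_clone", "path",
    "layer2", "rect", "rect_clone", "rect", "layer3", "group", "path6", "path8",
    "group_clone", "group", "path6", "path8", "layer4", None, None, "text_clone",
    None, None, "metadata1091", None, None,
]

# Count only real IDs; elements without an id show up as None.
counts = Counter(i for i in ids if i is not None)
duplicates = {i: n for i, n in counts.items() if n > 1}
print(duplicates)
```

Every referenced element (and, for the group, each of its children) appears exactly twice, while the clone IDs themselves appear once.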

To look into this in a bit more detail, I picked out the path elements as follows:

pathElements = [element.values for element in layout.elements() if element.id in ["path", "path_clone"]]
len(pathElements)
# 3

reference = pathElements[0]
cloneSVGElement = pathElements[1]
ghost = pathElements[2]

The reference looks as it should (from what I can tell):

{
    "": "http://www.w3.org/2000/svg",
    "{http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd}docname": "test.svg",
    "{http://www.inkscape.org/namespaces/inkscape}groupmode": "layer",
    "{http://www.inkscape.org/namespaces/inkscape}label": "Path_Layer",
    "{http://www.inkscape.org/namespaces/inkscape}version": "1.1.1 (3bf5ae0d25, 2021-09-20)",
    "attributes": {
        "d": "m26.458 26.458v84.667h79.375v-84.667z",
        "fill": "#f00",
        "fill-rule": "evenodd",
        "id": "path",
        "stroke-linecap": "round",
        "stroke-width": ".18257",
        "tag": "path"
    },
    "cc": "http://creativecommons.org/ns#",
    "color": "black",
    "d": "m26.458 26.458v84.667h79.375v-84.667z",
    "fill": "#f00",
    "fill-rule": "evenodd",
    "height": "1080",
    "id": "path",
    "inkscape": "http://www.inkscape.org/namespaces/inkscape",
    "pathd_loaded": true,
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sodipodi": "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd",
    "stroke": "none",
    "stroke-linecap": "round",
    "stroke-width": ".18257",
    "svg": "http://www.w3.org/2000/svg",
    "tag": "path",
    "transform": "scale(3.779527559055, 3.779527559055)",
    "version": "1.1",
    "width": "1920",
    "xlink": "http://www.w3.org/1999/xlink"
}

The clone element has an additional {http://www.w3.org/1999/xlink}href reference. Its translate transform gets the viewBox scale transform added on the left (transforms are applied right to left) so that the distances come out correctly. The inverse viewBox scale transform must be applied to the reference before applying this transformation in order to get the correct bounding box; otherwise the viewBox scale transform would be applied twice (once to the reference, once to the clone). This is also true for all the other objects (path, rect, group).

{
    "": "http://www.w3.org/2000/svg",
    "{http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd}docname": "test.svg",
    "{http://www.inkscape.org/namespaces/inkscape}groupmode": "layer",
    "{http://www.inkscape.org/namespaces/inkscape}label": "Path_Layer",
    "{http://www.inkscape.org/namespaces/inkscape}version": "1.1.1 (3bf5ae0d25, 2021-09-20)",
    "{http://www.w3.org/1999/xlink}href": "#path",
    "attributes": {
        "{http://www.w3.org/1999/xlink}href": "#path",
        "height": "100%",
        "id": "path_clone",
        "tag": "use",
        "transform": "scale(3.779527559055, 3.779527559055) translate(26.458,132.29) translate(0, 0)",
        "width": "100%"
    },
    "cc": "http://creativecommons.org/ns#",
    "color": "black",
    "fill": "black",
    "height": "100%",
    "id": "path_clone",
    "inkscape": "http://www.inkscape.org/namespaces/inkscape",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sodipodi": "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd",
    "stroke": "none",
    "svg": "http://www.w3.org/2000/svg",
    "tag": "use",
    "transform": "scale(3.779527559055, 3.779527559055) translate(26.458,132.29) translate(0, 0)",
    "version": "1.1",
    "width": "100%",
    "xlink": "http://www.w3.org/1999/xlink"
}
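The double application of the viewBox scale can be checked with plain arithmetic. This is a sketch under stated assumptions: it does not call svgelements, it takes the scale as 1920 / 508 (the width and viewBox width of the test file), and the helper names `clone_transform` and `unscale` are made up for illustration.

```python
# viewBox scale implied by width="1920" and a viewBox width of 508.
SCALE = 1920 / 508  # about 3.7795, matching the scale(...) in the dump

def clone_transform(point, translate=(26.458, 132.29)):
    """Apply 'scale(SCALE) translate(tx, ty)': translate first, then scale."""
    x, y = point
    tx, ty = translate
    return (SCALE * (x + tx), SCALE * (y + ty))

def unscale(point):
    """Undo the viewBox scale that reification baked into the reference."""
    x, y = point
    return (x / SCALE, y / SCALE)

raw = (26.458, 26.458)                      # first path point in user units
reified = (SCALE * raw[0], SCALE * raw[1])  # what the parsed reference holds

expected = clone_transform(raw)                # clone position computed from raw data
double_scaled = clone_transform(reified)       # scale applied twice: too large
corrected = clone_transform(unscale(reified))  # inverse scale first, then clone transform

print(expected, double_scaled, corrected)
```

Only the corrected result agrees with the expected one, which is why the inverse of the reference transform has to be applied before the clone transform.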

The third (ghost) element is the path again, but mangled. It has an href reference (though not in attributes), and its transform is the same as the clone's, along with some other differences such as width and height. Applying this transformation would give the correct bounding box, but only by coincidence, since the original path is part of the element. When doing this with a group element, the values are wrong.

{
    "": "http://www.w3.org/2000/svg",
    "{http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd}docname": "test.svg",
    "{http://www.inkscape.org/namespaces/inkscape}groupmode": "layer",
    "{http://www.inkscape.org/namespaces/inkscape}label": "Path_Layer",
    "{http://www.inkscape.org/namespaces/inkscape}version": "1.1.1 (3bf5ae0d25, 2021-09-20)",
    "{http://www.w3.org/1999/xlink}href": "#path",
    "attributes": {
        "d": "m26.458 26.458v84.667h79.375v-84.667z",
        "fill": "#f00",
        "fill-rule": "evenodd",
        "id": "path",
        "stroke-linecap": "round",
        "stroke-width": ".18257",
        "tag": "path"
    },
    "cc": "http://creativecommons.org/ns#",
    "color": "black",
    "d": "m26.458 26.458v84.667h79.375v-84.667z",
    "fill": "#f00",
    "fill-rule": "evenodd",
    "height": "100%",
    "id": "path",
    "inkscape": "http://www.inkscape.org/namespaces/inkscape",
    "pathd_loaded": true,
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sodipodi": "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd",
    "stroke": "none",
    "stroke-linecap": "round",
    "stroke-width": ".18257",
    "svg": "http://www.w3.org/2000/svg",
    "tag": "path",
    "transform": "scale(3.779527559055, 3.779527559055) translate(26.458,132.29) translate(0, 0)",
    "version": "1.1",
    "width": "100%",
    "xlink": "http://www.w3.org/1999/xlink"
}
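To see exactly where the ghost departs from the reference, the two dumps can be compared key by key. The sketch below uses abbreviated copies of the two dictionaries (only a few of the keys shown above); `diff_values` is a hypothetical helper, not part of svgelements.

```python
HREF = "{http://www.w3.org/1999/xlink}href"

def diff_values(a, b):
    """Return {key: (a_value, b_value)} for every key whose value differs.

    A key missing from one side is reported with None on that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }

# Abbreviated from the dumps above; the omitted keys are identical in both.
reference = {
    "id": "path",
    "tag": "path",
    "height": "1080",
    "width": "1920",
    "transform": "scale(3.779527559055, 3.779527559055)",
}
ghost = {
    "id": "path",
    "tag": "path",
    "height": "100%",
    "width": "100%",
    "transform": "scale(3.779527559055, 3.779527559055) translate(26.458,132.29) translate(0, 0)",
    HREF: "#path",
}

differences = diff_values(reference, ghost)
for key, (ref_value, ghost_value) in differences.items():
    print(f"{key}: {ref_value!r} -> {ghost_value!r}")
```

With the real objects, `reference` and `ghost` would be the `.values` dictionaries taken from `pathElements[0]` and `pathElements[2]`.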

SVG File

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->

<svg
   width="1920"
   height="1080"
   version="1.1"
   viewBox="0 0 508 285.75"
   id="svg20"
   sodipodi:docname="test.svg"
   inkscape:version="1.1.1 (3bf5ae0d25, 2021-09-20)"
   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
   xmlns:xlink="http://www.w3.org/1999/xlink"
   xmlns="http://www.w3.org/2000/svg"
   xmlns:svg="http://www.w3.org/2000/svg"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:cc="http://creativecommons.org/ns#">
  <defs
     id="defs24" />
  <sodipodi:namedview
     id="namedview22"
     pagecolor="#ffffff"
     bordercolor="#666666"
     borderopacity="1.0"
     inkscape:pageshadow="2"
     inkscape:pageopacity="0.0"
     inkscape:pagecheckerboard="0"
     showgrid="true"
     inkscape:zoom="0.46109255"
     inkscape:cx="901.12062"
     inkscape:cy="567.13127"
     inkscape:window-width="1920"
     inkscape:window-height="1001"
     inkscape:window-x="-9"
     inkscape:window-y="-9"
     inkscape:window-maximized="1"
     inkscape:current-layer="layer4">
    <inkscape:grid
       type="xygrid"
       id="grid857" />
  </sodipodi:namedview>
  <g
     inkscape:groupmode="layer"
     id="layer1"
     inkscape:label="Path_Layer">
    <path
       id="path"
       d="m26.458 26.458v84.667h79.375v-84.667z"
       fill="#f00"
       fill-rule="evenodd"
       stroke-linecap="round"
       stroke-width=".18257" />
    <use
       id="path_clone"
       transform="translate(26.458,132.29)"
       width="100%"
       height="100%"
       xlink:href="#path"
       x="0"
       y="0" />
  </g>
  <g
     inkscape:groupmode="layer"
     id="layer2"
     inkscape:label="Rect_Layer">
    <rect
       id="rect"
       x="132.29"
       y="52.917"
       width="121.71"
       height="63.5"
       fill="#ff0"
       fill-rule="evenodd"
       stroke-linecap="round"
       stroke-width=".2" />
    <use
       id="rect_clone"
       transform="translate(26.458,158.75)"
       width="100%"
       height="100%"
       xlink:href="#rect"
       x="0"
       y="0" />
  </g>
  <g
     inkscape:groupmode="layer"
     id="layer3"
     inkscape:label="Group_Layer">
    <g
       id="group"
       transform="translate(317.5,26.458)">
      <path
         d="m0-7.5694e-7v31.75h132.29v-31.75z"
         fill="#0f0"
         fill-rule="evenodd"
         stroke-linecap="round"
         stroke-width=".2"
         id="path6" />
      <path
         d="m-1e-5 42.333v31.75h132.29v-31.75z"
         fill="#0f0"
         fill-rule="evenodd"
         stroke-linecap="round"
         stroke-width=".2"
         id="path8" />
    </g>
    <use
       id="group_clone"
       transform="translate(26.458,158.75)"
       width="100%"
       height="100%"
       xlink:href="#group"
       x="0"
       y="0" />
  </g>
  <g
     inkscape:groupmode="layer"
     id="layer4"
     inkscape:label="Text_Layer">
    <text
       xml:space="preserve"
       style="font-size:18px;line-height:1.25;font-family:sans-serif;stroke-width:0.264583"
       x="183.44173"
       y="26.904947"
       id="text"><tspan
         sodipodi:role="line"
         id="tspan2565"
         style="stroke-width:0.264583"
         x="183.44173"
         y="26.904947">Lorem ipsum</tspan></text>
    <use
       x="0"
       y="0"
       xlink:href="#text"
       id="text_clone"
       transform="translate(79.375005,119.0625)"
       width="100%"
       height="100%" />
  </g>
  <metadata
     id="metadata1091">
    <rdf:RDF>
      <cc:Work
         rdf:about="" />
    </rdf:RDF>
  </metadata>
</svg>
tatarize commented 2 years ago

I loaded the elements up in meerk40t, which uses svgelements as its loading library, with rect_use.svg and use_rect.svg.

Rect then Use (screenshot: rect_use)

Use then Rect (screenshot: use_rect)

As you can see, the problem isn't the bounding box but a combination of two factors. First, use before rect doesn't work in svgelements. If the target isn't set in the defs beforehand or hasn't already been used, it doesn't exist in the shadow DOM. This is a consequence of rendering and parsing the data at the same time: the parser has no understanding of future values. The same is true for CSS being applied to the SVG. If the CSS occurs after the elements, it's not applied retroactively.
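The single-pass limitation can be illustrated with the standard library alone. This is a simplified sketch, not the svgelements parser: `resolve_uses_single_pass` is a made-up function that walks the tree once and lets a use element resolve only against ids it has already seen.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

def resolve_uses_single_pass(svg_text):
    """Walk the tree once; a <use> resolves only against ids seen so far."""
    seen = set()
    resolved = []
    for node in ET.fromstring(svg_text).iter():  # document order
        tag = node.tag.rsplit("}", 1)[-1]        # strip the XML namespace
        if tag == "use":
            target = node.get(XLINK_HREF, "").lstrip("#")
            resolved.append((node.get("id"), target in seen))
        node_id = node.get("id")
        if node_id is not None:
            seen.add(node_id)
    return resolved

SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">{}</svg>'
)
RECT = '<rect id="r" width="1" height="1"/>'
USE = '<use id="u" xlink:href="#r"/>'

rect_first = resolve_uses_single_pass(SVG.format(RECT + USE))
use_first = resolve_uses_single_pass(SVG.format(USE + RECT))
print(rect_first)  # use comes after rect: target already known
print(use_first)   # use comes before rect: target not yet known
```

A two-pass approach (collect all ids first, then resolve) would handle the second ordering, at the cost of needing protection against reference cycles.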

Second, the use object copies all the attributes, including the id, and the copy is stored in the lookup. This overwrites the rect element in the lookup, resulting in the two different values for the rectangle's bounding box. In the case that rect occurs before use, both exist and you get the use element. In the case that use occurs before rect, the use does not come to exist at all and we only have the rect. In the first case you get the bounds of the use. In the second case you get the bounds of the rect. These bounds do not match.
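The overwrite can be modelled with an ordinary dictionary. This is only an illustration of the behaviour described here, not the library's code: the element dictionaries, the `source` field, and `build_lookup_last_wins` are hypothetical, and the y values are the rounded bounding-box minimums reported earlier in the thread.

```python
def build_lookup_last_wins(elements):
    """Store every element under its id; a later duplicate replaces an earlier one."""
    lookup = {}
    for element in elements:
        lookup[element["id"]] = element
    return lookup

rect = {"id": "rect2728", "source": "rect", "y_min": -160.0}
# The copy made for <use> inherits the id of the element it references.
use_copy = {"id": "rect2728", "source": "use", "y_min": 120.0}

# rect before use: both are produced, and the copy overwrites the original.
rect_then_use = build_lookup_last_wins([rect, use_copy])
# use before rect: the use cannot be resolved, so only the rect is produced.
use_then_rect = build_lookup_last_wins([rect])

print(rect_then_use["rect2728"]["source"], rect_then_use["rect2728"]["y_min"])
print(use_then_rect["rect2728"]["source"], use_then_rect["rect2728"]["y_min"])
```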

I need to double check the spec to see what it says about this and see if I need to do something to permit id to inherit/not-inherit in this case.

tatarize commented 2 years ago

Okay, I checked. _use_structure_parse builds a sort of use-parsing tree that gives me the tree as if the use values were directly present, so that the regular parse-tree work can ignore them. The error is that the defs defined there are the only values that should go in the lookup.

Since this also ends up rendering the objects, the defs are just shadow node values and not rendered objects. The renders only exist in the regular parse tree, so the links to the objects for the lookup were made in the regular parse tree. This means that I can, and do, end up rendering the same rectangle twice, fed by shadow nodes. But these can have identical ids.

The error here is that the lookup should only ever store the id link of the first value and should never update it afterwards. Anything with the same id is going to be caused by a use value occurring after a real object, since I don't pre-parse the entire tree for shadow references. That is, unless I should do this. It wouldn't be that hard to parse the entire tree first and then have access to future node values, though I seem to recall there were some ways to get infinite loops that way, and other special-case methods to avoid them.

The easiest fix is to not permit future use and to not update the lookup if we see the same id a second time. It seems that future-use cases would be better processed per #87, as part of a fuller DOM-based project.
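The "first id wins" rule can be sketched with `dict.setdefault`. This shows the idea only and is not the change that went into the PR; the element dictionaries and `build_lookup_first_wins` are hypothetical.

```python
def build_lookup_first_wins(elements):
    """Store an element under its id only the first time that id is seen."""
    lookup = {}
    for element in elements:
        element_id = element.get("id")
        if element_id is not None:
            # setdefault leaves an existing entry untouched.
            lookup.setdefault(element_id, element)
    return lookup

rect = {"id": "rect2728", "source": "rect"}
use_copy = {"id": "rect2728", "source": "use"}  # copy that inherited the id
clone = {"id": "use2810", "source": "use"}

lookup = build_lookup_first_wins([rect, clone, use_copy])
print(lookup["rect2728"]["source"])
print(lookup["use2810"]["source"])
```

With this rule, looking up the original id returns the real element regardless of whether a clone follows it.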

tatarize commented 2 years ago

I expanded the use parsing a bit in the PR. It's still failing the W3C test for use, but it's at least loading all the elements and permits use of use objects. I might have to double-check for some infinite loop issues.

tatarize commented 1 year ago

Fixed a bit ago.