Process Dom and CSS correctly as part of a document-structure.

tatarize commented 3 years ago

This is not for svgelements 1.x.x; it would be a rather massive shift and may require an entirely new project, depending on a variety of factors. The idea would be to parse the SVG DOM and CSS independently of the SVG itself. This would require applying the DOM and CSS rules as part of a proper document class, which could be built programmatically as well as parsed from a document. The DOM nodes would be independent of, and different from, the individual SVG elements. This would process the SVG much more like svgwrite handles its SVG code: the CSS rules could be altered, and any change would have a direct effect on the document, which would be parsed in a lazy fashion.

This would generalize documents into nodes and CSS and facilitate building documents the way svgwrite does, while also parsing documents in the first place. It would work equally well on any XML-like structure or even HTML data. The svgelements-like code would then simply create first-order elements out of this data, with direct links back into the Document structures.

Determining a particular path's transformation would then be done through rendering, which is currently more or less integrated into the parse() functionality. You could round-trip an svg document, loading all the data, and modifying some CSS or applying a change to the document and saving it back out. Or you could extract the actual paths themselves, and modify the data applied to the document.
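To make the round-trip idea concrete, here is a minimal sketch using only the standard library's xml.dom.minidom. It is not svgelements code; the helper name round_trip and the sample SVG are made up for illustration, and it only edits a style attribute rather than a real style sheet.

```python
from xml.dom import minidom

SVG_SOURCE = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">'
    '<g transform="scale(2)">'
    '<rect id="r" width="4" height="4" style="fill:red"/>'
    '</g></svg>'
)


def round_trip(svg_text, element_id, new_style):
    """Parse the document, change one element's style, and serialize it again.

    Hypothetical helper: nothing is rendered, so the group transform is
    written back out untouched.
    """
    dom = minidom.parseString(svg_text)
    for node in dom.getElementsByTagName("*"):
        if node.getAttribute("id") == element_id:
            node.setAttribute("style", new_style)
    return dom.documentElement.toxml()


result = round_trip(SVG_SOURCE, "r", "fill:blue")
```

Because nothing here interprets the geometry, the saved document keeps its structure exactly; extracting the transformed path would be the separate rendering step.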

There is some chance this could be properly executed within the svgelements 1.x.x scheme without massive breaking changes. We would introduce a Document class which would contain SVGs as needed along with the various DOM nodes. The document would register the nodes and still allow them to be modified. It would record the correct style sheet information (style sheet information can be applied after elements are parsed and should still apply correctly), and apply the full style sheet rules or let you add your own CSS rules. The second phase of parsing would turn these nodes into fully fledged SVG objects. In effect the document would be parsed in memory.

Document.load() would load the document fully without requiring any outside information or SVG-specific data. We would have the DOM nodes as well as the CSS data. We could then modify this data through the standard APIs available to JavaScript in a normal DOM environment, and could even build HTML web pages with this kind of structure. No effort would be made to interpret this data.
We would have capabilities to build these nodes as svgwrite does. These would generally be decorations on nodes like those svgwrite currently supports.
Document.render() would apply the information that cannot be determined from the document data itself, such as the desired PPI and the language. This would be akin to the current functionality of SVG.parse() and would create actual fully rendered objects. We would then also have a method similar to .elements() which returns fully understood SVG elements.
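A rough sketch of the two-phase split might look like the following. Everything in it is hypothetical: the Document class, the css_rules dictionary keyed by tag name, and the to_pixels helper are invented for illustration, and only rect elements with "in", "mm", or unitless lengths are handled.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


def to_pixels(value, ppi):
    """Convert a length in 'in', 'mm', or user units to pixels at the given PPI."""
    value = value.strip()
    if value.endswith("in"):
        return float(value[:-2]) * ppi
    if value.endswith("mm"):
        return float(value[:-2]) * ppi / 25.4
    return float(value)


@dataclass
class Document:
    """Hypothetical document: load() keeps raw nodes, render() interprets them."""
    root: ET.Element
    css_rules: dict = field(default_factory=dict)  # tag name -> {property: value}

    @classmethod
    def load(cls, text):
        # Phase 1: build the tree only; nothing SVG-specific is interpreted.
        return cls(root=ET.fromstring(text))

    def render(self, ppi=96.0):
        # Phase 2: apply outside information (here only the PPI) and the
        # current CSS rules, producing plain dicts as stand-in rendered objects.
        rendered = []
        for node in self.root.iter():
            tag = node.tag.rsplit("}", 1)[-1]  # strip the "{namespace}" prefix
            if tag != "rect":
                continue
            rendered.append({
                "tag": tag,
                "width": to_pixels(node.get("width", "0"), ppi),
                "height": to_pixels(node.get("height", "0"), ppi),
                "style": dict(self.css_rules.get(tag, {})),
            })
        return rendered


doc = Document.load(
    '<svg xmlns="http://www.w3.org/2000/svg"><rect width="1in" height="2"/></svg>'
)
doc.css_rules["rect"] = {"fill": "blue"}  # rules may change after loading
shapes = doc.render(ppi=96.0)
```

The point of the split is that render() can be called again with a different PPI or altered rules without reloading the document.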

While this might be out of scope for what svgelements currently does, it would noticeably simplify the project as a whole, and it would largely encompass the powers of svgwrite together with parsing, processing, and rendering.

tatarize commented 3 years ago

Most of the attributes are implementing CSS like lookups and locations. This should be fairly easily implemented within a Node/Tree structure. The NS() suffixed commands can just take an optional namespace argument.
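As a sketch of the optional namespace argument, one possible shape (an assumption, not an existing svgelements API) is to key attributes on a (namespace, name) pair so that the plain and NS variants collapse into one method each. The snake_case method names are hypothetical.

```python
XLINK = "http://www.w3.org/1999/xlink"


class Element:
    """Hypothetical element whose attribute methods take an optional namespace."""

    def __init__(self, tag):
        self.tag = tag
        self._attrs = {}  # (namespace or None, local name) -> value

    def set_attribute(self, name, value, namespace=None):
        self._attrs[(namespace, name)] = str(value)

    def get_attribute(self, name, namespace=None):
        # Returns None when the attribute does not exist.
        return self._attrs.get((namespace, name))

    def has_attribute(self, name, namespace=None):
        return (namespace, name) in self._attrs

    def remove_attribute(self, name, namespace=None):
        self._attrs.pop((namespace, name), None)


use = Element("use")
use.set_attribute("href", "#shape", namespace=XLINK)
use.set_attribute("href", "#other")
```

With this layout getAttribute() and getAttributeNS() differ only in whether the namespace argument is supplied.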

Node Attributes

baseURI: read-only property returns the absolute base URL of a Node.
childNodes: read-only property returns a live NodeList of child nodes of the given element where the first child node is assigned index 0. Child nodes include elements, text and comments.
firstChild: read-only property returns the node's first child in the tree, or null if the node has no children. If the node is a Document, it returns the first node in the list of its direct children.
isConnected: read-only property of the Node interface returns a boolean indicating whether the node is connected (directly or indirectly) to the context object, for example the Document object in the case of the normal DOM, or the ShadowRoot in the case of a shadow DOM.
lastChild: read-only property returns the last child of the node. If its parent is an element, then the child is generally an element node, a text node, or a comment node. It returns null if there are no child elements.
localName
namespaceURI
nextSibling: read-only property returns the node immediately following the specified one in their parent's childNodes, or returns null if the specified node is the last child in the parent element.
nodeName: read-only property returns the name of the current Node as a string.
nodeType: property is an integer that identifies what the node is. It distinguishes different kinds of nodes from each other, such as elements, text and comments.
nodeValue: property of the Node interface returns or sets the value of the current node.
ownerDocument: read-only property of the Node interface returns the top-level document object of the node.
parentElement: read-only property returns the DOM node's parent Element, or null if the node either has no parent, or its parent isn't a DOM Element.
parentNode: read-only property returns the parent of the specified node in the DOM tree.
previousSibling read-only property returns the node immediately preceding the specified one in its parent's childNodes list, or null if the specified node is the first in that list.
textContent: property of the Node interface represents the text content of the node and its descendants.

Node Methods

appendChild() method adds a node to the end of the list of children of a specified parent node. If the given child is a reference to an existing node in the document, appendChild() moves it from its current position to the new position (there is no requirement to remove the node from its parent node before appending it to some other node).
cloneNode(): method returns a duplicate of the node on which this method was called.
contains(): method returns a Boolean value indicating whether a node is a descendant of a given node, i.e. the node itself, one of its direct children (childNodes), one of the children's direct children, and so on.
getRootNode(): method of the Node interface returns the context object's root, which optionally includes the shadow root if it is available.
hasChildNodes(): method returns a Boolean value indicating whether the given Node has child nodes or not.
insertBefore(): method inserts a node before a reference node as a child of a specified parent node.
isDefaultNamespace(): method accepts a namespace URI as an argument and returns a Boolean with a value of true if the namespace is the default namespace on the given node or false if not.
isEqualNode(): method tests whether two nodes are equal. Two nodes are equal when they have the same type, defining characteristics (for elements, this would be their ID, number of children, and so forth), its attributes match, and so on. The specific set of data points that must match varies depending on the types of the nodes.
isSameNode(): method for Node objects tests whether two nodes are the same (that is, whether they reference the same object).
lookupNamespaceURI(): method accepts a prefix and returns the namespace URI associated with it on the given node if found (and null if not).
lookupPrefix(): method returns a DOMString containing the prefix for a given namespace URI, if present, and null if not. When multiple prefixes are possible, the result is implementation-dependent.
normalize(): method puts the specified node and all of its sub-tree into a "normalized" form. In a normalized sub-tree, no text nodes in the sub-tree are empty and there are no adjacent text nodes.
removeChild(): method removes a child node from the DOM and returns the removed node.
replaceChild(): method replaces a child node within the given (parent) node.
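A handful of the Node attributes and methods above can be sketched in a few lines of Python. This is only an illustration of the tree bookkeeping: the class and its snake_case names are hypothetical, and child_nodes is a plain list rather than a live NodeList.

```python
class Node:
    """Minimal tree node covering a few of the DOM Node members listed above."""

    def __init__(self, node_name):
        self.node_name = node_name
        self.parent_node = None
        self.child_nodes = []

    @property
    def first_child(self):
        return self.child_nodes[0] if self.child_nodes else None

    @property
    def last_child(self):
        return self.child_nodes[-1] if self.child_nodes else None

    @property
    def next_sibling(self):
        if self.parent_node is None:
            return None
        siblings = self.parent_node.child_nodes
        index = siblings.index(self) + 1
        return siblings[index] if index < len(siblings) else None

    def has_child_nodes(self):
        return bool(self.child_nodes)

    def contains(self, other):
        # True for the node itself and for any descendant.
        while other is not None:
            if other is self:
                return True
            other = other.parent_node
        return False

    def remove_child(self, child):
        self.child_nodes.remove(child)  # raises ValueError if not a child
        child.parent_node = None
        return child

    def append_child(self, child):
        if child.contains(self):
            raise ValueError("cannot append a node to itself or its descendant")
        # An already attached node is moved, not copied.
        if child.parent_node is not None:
            child.parent_node.remove_child(child)
        child.parent_node = self
        self.child_nodes.append(child)
        return child
```

The move-on-append behaviour is the part most easily forgotten; the rest is ordinary parent and child bookkeeping.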

Element Methods

after() : Insert a new node after this in the sibling tree.
animate(): Shortcut to return an animation routine. (This is technically possible).
append(): Append a new node to this node or dom string.
attachShadow(): method attaches a shadow DOM tree to the specified element and returns a reference to its ShadowRoot.
before(): method inserts a set of Node or DOMString objects in the children list of this ChildNode's parent, just before this ChildNode. DOMString objects are inserted as equivalent Text nodes.
closest(): method traverses the Element and its parents (heading toward the document root) until it finds a node that matches the provided selector string. Will return itself or the matching ancestor. If no such element exists, it returns null.
getAttribute(): method of the Element interface returns the value of a specified attribute on the element. If the given attribute does not exist, the value returned will either be null or "" (the empty string); see Non-existing attributes for details.
getAttributeNames(): method of the Element interface returns the attribute names of the element as an Array of strings. If the element has no attributes it returns an empty array.
getAttributeNS(): method of the Element interface returns the string value of the attribute with the specified namespace and name. If the named attribute does not exist, the value returned will either be null or "" (the empty string); see Notes for details.
getBoundingClientRect(): method returns a DOMRect object providing information about the size of an element and its position relative to the viewport.
getClientRects(): method of the Element interface returns a collection of DOMRect objects that indicate the bounding rectangles for each CSS border box in a client.
getElementsByClassName(): method getElementsByClassName() returns a live HTMLCollection which contains every descendant element which has the specified class name or names.
getElementsByTagName(): method returns a live HTMLCollection of elements with the given tag name. All descendants of the specified element are searched, but not the element itself. The returned list is live, which means it updates itself with the DOM tree automatically. Therefore, there is no need to call Element.getElementsByTagName() with the same element and arguments repeatedly if the DOM changes in between calls.
getElementsByTagNameNS(): method returns a live HTMLCollection of elements with the given tag name belonging to the given namespace. It is similar to Document.getElementsByTagNameNS, except that its search is restricted to descendants of the specified element.
hasAttribute(): method returns a Boolean value indicating whether the specified element has the specified attribute or not.
hasAttributeNS(): returns a boolean value indicating whether the current element has the specified attribute.
hasAttributes(): method of the Element interface returns a Boolean indicating whether the current element has any attributes or not.
insertAdjacentElement(): method of the Element interface inserts a given element node at a given position relative to the element it is invoked upon.
insertAdjacentHTML(): method of the Element interface parses the specified text as HTML or XML and inserts the resulting nodes into the DOM tree at a specified position. It does not reparse the element it is being used on, and thus it does not corrupt the existing elements inside that element. This avoids the extra step of serialization, making it much faster than direct innerHTML manipulation.
insertAdjacentText(): method of the Element interface inserts a given text node at a given position relative to the element it is invoked upon.
matches(): method checks to see if the Element would be selected by the provided selectorString -- in other words -- checks if the element "is" the selector.
prepend(): method inserts a set of Node objects or DOMString objects before the first child of the ParentNode. DOMString objects are inserted as equivalent Text nodes.
querySelector(): method of the Element interface returns the first element that is a descendant of the element on which it is invoked that matches the specified group of selectors.
querySelectorAll(): method querySelectorAll() returns a static (not live) NodeList representing a list of elements matching the specified group of selectors which are descendants of the element on which the method was called.
removeAttribute(): method removeAttribute() removes the attribute with the specified name from the element.
removeAttributeNS(): method of the Element interface removes the specified attribute from an element.
replaceChildren(): method replaces the existing children of a Node with a specified new set of children. These can be DOMString or Node objects.
replaceWith(): method replaces this ChildNode in the children list of its parent with a set of Node or DOMString objects. DOMString objects are inserted as equivalent Text nodes.
setAttribute(): Sets the value of an attribute on the specified element. If the attribute already exists, the value is updated; otherwise a new attribute is added with the specified name and value.
setAttributeNodeNS(): setAttributeNodeNS adds a new namespaced attribute node to an element.
setAttributeNS(): adds a new attribute or changes the value of an attribute with the given namespace and name.
toggleAttribute(): method of the Element interface toggles a Boolean attribute (removing it if it is present and adding it if it is not present) on the given element.
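The selector-based methods (matches, closest, querySelector, querySelectorAll) all reduce to one matching predicate plus a tree walk. The sketch below assumes a hypothetical Element class and supports only the simple selectors 'tag', '#id' and '.class'; a real implementation would need a proper selector parser.

```python
class Element:
    """Hypothetical element supporting 'tag', '#id' and '.class' selectors only."""

    def __init__(self, tag, parent=None, **attributes):
        self.tag = tag
        self.attributes = dict(attributes)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def matches(self, selector):
        if selector.startswith("#"):
            return self.attributes.get("id") == selector[1:]
        if selector.startswith("."):
            return selector[1:] in self.attributes.get("class", "").split()
        return self.tag == selector

    def closest(self, selector):
        # Walk from this element toward the root; may return the element itself.
        node = self
        while node is not None:
            if node.matches(selector):
                return node
            node = node.parent
        return None

    def query_selector_all(self, selector):
        # Descendants only, in document (depth-first) order.
        found = []
        for child in self.children:
            if child.matches(selector):
                found.append(child)
            found.extend(child.query_selector_all(selector))
        return found

    def query_selector(self, selector):
        found = self.query_selector_all(selector)
        return found[0] if found else None


svg = Element("svg")
layer = Element("g", svg, id="layer")
rect = Element("rect", layer, **{"class": "cut outline"})
```

Note that query_selector_all returns an ordinary static list here, which matches the static NodeList behaviour described for querySelectorAll().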

Element Attributes

attributes: property returns a live collection of all attribute nodes registered to the specified node. It is a NamedNodeMap, not an Array, so it has no Array methods and the Attr nodes' indexes may differ among browsers. To be more specific, attributes is a key/value pair of strings that represents any information regarding that attribute.
childElementCount: read-only property returns an unsigned long representing the number of child elements of the given element.
children: read-only property that returns a live HTMLCollection which contains all of the child elements of the node upon which it was called.
classList: read-only property that returns a live DOMTokenList collection of the class attributes of the element. This can then be used to manipulate the class list.
className: property of the Element interface gets and sets the value of the class attribute of the specified element.
clientHeight read-only property is zero for elements with no CSS or inline layout boxes; otherwise, it's the inner height of an element in pixels. It includes padding but excludes borders, margins, and horizontal scrollbars (if present).
clientLeft The width of the left border of an element in pixels. It includes the width of the vertical scrollbar if the text direction of the element is right–to–left and if there is an overflow causing a left vertical scrollbar to be rendered. clientLeft does not include the left margin or the left padding. clientLeft is read-only.
clientTop The width of the top border of an element in pixels. It is a read-only, integer property of element.
clientWidth property is zero for inline elements and elements with no CSS; otherwise, it's the inner width of an element in pixels. It includes padding but excludes borders, margins, and vertical scrollbars (if present).
currentStyle: non-standard read-only property (originally from Internet Explorer) that returns the element's computed style; window.getComputedStyle() is the standard equivalent.
firstElementChild read-only property returns the object's first child Element, or null if there are no child elements.
id The id property of the Element interface represents the element's identifier, reflecting the id global attribute.
innerHTML: property gets or sets the HTML or XML markup contained within the element.
lastElementChild read-only property returns the object's last child Element or null if there are no child elements.
localName read-only property returns the local part of the qualified name of an element.
namespaceURI: read-only property returns the namespace URI of the element, or null if the element is not in a namespace.
nextElementSibling: read-only property returns the element immediately following the specified one in its parent's children list, or null if the specified element is the last one in the list.
outerHTML attribute of the Element DOM interface gets the serialized HTML fragment describing the element including its descendants. It can also be set to replace the element with nodes parsed from the given string.
part: property of the Element interface represents the part identifier(s) of the element (i.e. set using the part attribute), returned as a DOMTokenList. These can be used to style parts of a shadow DOM, via the ::part pseudo-element.
prefix: read-only property returns the namespace prefix of the specified element, or null if no prefix is specified.
previousElementSibling - read-only property returns the Element immediately prior to the specified one in its parent's children list, or null if the specified element is the first one in the list.
shadowRoot - read-only property represents the shadow root hosted by the element. Use Element.attachShadow() to add a shadow root to an existing element.
tagName - read-only property of the Element interface returns the tag name of the element on which it's called. For example, if the element is an <img>, its tagName property is "IMG" (for HTML documents; it may be cased differently for XML/XHTML documents).

CSS: https://www.w3schools.com/cssref/css_selectors.asp

tatarize commented 3 years ago

See: https://tinycss.readthedocs.io/en/latest/index.html https://svgwrite.readthedocs.io/en/latest/

tatarize commented 3 years ago

https://www.w3.org/TR/CSS22/syndata.html Has the syntax data. Lexicographical parsers aren't too hard. The scheme for the svg path ( https://github.com/meerk40t/svgelements/blob/d5615435b41f7a3a4f57ec490036c9321c9435cd/svgelements/svgelements.py#L210 ) would work to parse most bnf like grammars. Doing this would largely duplicate the work of https://github.com/Kozea/tinycss/blob/master/tinycss/css21.py
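As a rough idea of what such a lexer could look like, here is a small regex-driven tokenizer. It is a sketch covering only a subset of the CSS token grammar (no escapes, URLs, or at-keywords), the token names are my own, and it is not how tinycss or svgelements actually do it.

```python
import re

# One named alternative per token type; order matters (NUMBER before IDENT so
# that "-2px" is a number, IDENT still catches "-webkit-foo").
TOKEN_RE = re.compile(
    r"""
    (?P<COMMENT>/\*.*?\*/)
  | (?P<WS>\s+)
  | (?P<STRING>"[^"]*"|'[^']*')
  | (?P<HASH>\#[-\w]+)
  | (?P<NUMBER>[-+]?(?:\d+\.?\d*|\.\d+)(?:%|[a-zA-Z]+)?)
  | (?P<IDENT>-?[a-zA-Z_][-\w]*)
  | (?P<PUNCT>[{}:;,.>*()\[\]=])
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(css):
    """Yield (kind, text) pairs, skipping whitespace and comments."""
    pos = 0
    while pos < len(css):
        match = TOKEN_RE.match(css, pos)
        if match is None:
            raise ValueError(f"unexpected character {css[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup not in ("WS", "COMMENT"):
            yield match.lastgroup, match.group()


tokens = list(tokenize("rect.cut { stroke-width: 2px; fill: #ff0000 } /* note */"))
```

Dropping whitespace is a simplification: in selectors it is the descendant combinator, so a real selector parser would have to keep it.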

The general sequence of events to get to a more usable codeset would likely fit with very similar functionality to modern browsers: https://www.html5rocks.com/en/tutorials/internals/howbrowserswork/

The methods for replacing a node with the subclass node type defined is within the DOM spec: https://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#DOMImplementationSource

tatarize commented 3 years ago

Reading through how modern browsers work and checking with some JavaScript, the DOM content tree shouldn't be too hard to implement. Node objects, their relationships, CSS, and styles are attached to the object in question and are not cascaded or precomputed except during rendering. The goal then would be to create a correct content tree without rendering, and then permit rendering.

class RenderObject {
  virtual void layout();
  virtual void paint(PaintInfo);
  virtual Rect repaintRect();
  Node* node;                    // the DOM node
  RenderStyle* style;            // the computed style
  RenderLayer* containingLayer;  // the containing z-index layer
};

Or at least doing it in a way that calculates the computed style and makes that data able to access the internal geometry of the various shapes as needed after rendering, either attached to or directly as part of the DOM nodes. For example, if a node is set to display: none, the corresponding rendered node is None; it does not render. But it would still be appropriate to have a DOM node exist that could re-enable rendering, forcing a re-render of its children. Also, if you directly appended text to a DOM node that contained a rectangle, that would likely be bootstrapped to find the Rect object and place it correctly in the DOM.

Implementing DOM and CSS as part of a document structure would basically allow for loading svgwrite-like classes, where each of the relevant classes would be loaded in place for the DOM. A separate render routine would then apply the CSS, cascade the styles, and calculate the geometry. The render tree is not the same as the DOM tree. The rendered objects could reference their DOM counterparts, but the DOM nodes cannot carry the full rendered styles, since the relationship is not necessarily 1:1. DOM nodes marked display: none, either on themselves or on their parents, do not exist in the render tree. If a node is part of a use object, it could result in many copies existing in the render tree, and these copies would have different rendered styles. svgwrite nodes represent the DOM nodes, whereas svgelements objects are explicitly the rendered nodes.
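The non 1:1 relationship can be illustrated with a toy render pass. All names here (DomNode, RenderNode, build_render_tree) are hypothetical, only fill and stroke are inherited, hiding is keyed on display="none" (my assumption about the intended property), and circular use references are not guarded against.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


@dataclass
class RenderNode:
    dom: DomNode  # back-reference to the DOM node this was rendered from
    style: dict   # computed style: inherited values overridden by local ones


def build_render_tree(node, ids, inherited=None):
    """Flatten a DOM subtree into a list of RenderNode objects in render order."""
    if node.attrs.get("display") == "none" or node.tag == "defs":
        return []  # hidden subtrees and definitions produce no render nodes
    style = dict(inherited or {})
    style.update({k: v for k, v in node.attrs.items() if k in ("fill", "stroke")})
    if node.tag == "use":
        # Each use renders its own copy of the target with its own style.
        target = ids.get(node.attrs.get("href", "").lstrip("#"))
        return build_render_tree(target, ids, style) if target else []
    rendered = []
    if node.tag not in ("svg", "g"):
        rendered.append(RenderNode(node, style))
    for child in node.children:
        rendered.extend(build_render_tree(child, ids, style))
    return rendered


rect = DomNode("rect", {"id": "r"})
svg = DomNode("svg", children=[
    DomNode("defs", children=[rect]),
    DomNode("use", {"href": "#r", "fill": "red"}),
    DomNode("use", {"href": "#r", "fill": "blue"}),
    DomNode("circle", {"display": "none"}),
])
render_tree = build_render_tree(svg, {"r": rect})
```

Here one DOM rect yields two render nodes with different fills while the hidden circle yields none, which is why the computed style cannot live on the DOM node itself.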

It would seem as though the best first step might be implementing the DOM and CSS at least a bit, then bootstrapping svgwrite's DOM nodes into those places. This would ostensibly add loading to the svgwrite classes. Then some magic happens, and we can also generate relevant geometric information from this data in a way that could inform changes we'd like to make to the DOM values.

The path we get from rendering could in part be the result of a transformation on a group parent, which would mean that simply writing the altered node back into the DOM path would apply that transformation twice. In theory, though, you could push the points you changed back through the inverse matrix of the applied transformation and end up with a changed original path that is correct. Likewise, changing any colors could simply be applied to the local style of the DOM node, which would force those colors irrespective of the original rendered style. For most other things you'd want the rendered styles to exist but not be editable, since there are cases, like children of use, whose local style object couldn't actually be changed because doing so would change every single copied object. Rather, the use object would need to be replaced in the DOM with the object itself. Reflecting these modifications back into the DOM tree is a broad subject, and they could be deferred as various features later on. You'd have access to the DOM tree and the render tree, and you could directly save the DOM tree again or perhaps even just save the flattened render tree if that serves your needs, which would fit the bulk of the project.
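The inverse-matrix step is small enough to show directly. This sketch uses the SVG matrix(a, b, c, d, e, f) convention with plain tuples; the function names are illustrative and it does not use the svgelements Matrix class.

```python
def apply(matrix, point):
    """Apply an SVG-style affine matrix (a, b, c, d, e, f) to an (x, y) point."""
    a, b, c, d, e, f = matrix
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


def invert(matrix):
    """Return the inverse affine matrix, or raise if it is singular."""
    a, b, c, d, e, f = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return (
        d / det,
        -b / det,
        -c / det,
        a / det,
        (c * f - d * e) / det,
        (b * e - a * f) / det,
    )


# Group transform: scale by 2, then translate by (10, 5).
group_matrix = (2.0, 0.0, 0.0, 2.0, 10.0, 5.0)
dom_point = (1.0, 1.0)
rendered_point = apply(group_matrix, dom_point)

# Suppose the rendered point is edited to (14, 7). Writing that straight into
# the DOM would apply the group transform twice; map it back first.
new_dom_point = apply(invert(group_matrix), (14.0, 7.0))
```

With these values the rendered point is (12.0, 7.0) and the edited point maps back to (2.0, 1.0) in the DOM's own coordinates.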

tatarize commented 3 years ago

There is considerable overlap with this and xml dom objects. With bootstrapping and some rendering code these could be made more closely related. For example implementing a core fusion of svgwrite and xml.dom such that it would perform correct bootstrapping of tagged objects.

https://docs.python.org/3/library/xml.dom.html#module-xml.dom

We could then read and write documents with svgwrite-like correctness. With implementations of the CSS Length, Angle, Matrix etc. classes in place, an implementation of stylesheets, rendering, and geometry might be reasonable from there.

tatarize commented 3 years ago

While xml.dom is a faithful DOM implementation, the lack of bootstrapping or other common DOM-related CSS support makes it a needlessly cumbersome method of accessing XML. So most examples use minidom or even ElementTree, since without the advantages of bootstrapping the full DOM API is a set of rather strange functions that provide little utility to the end user.

Correctly done, loading a DOM should bootstrap and load svgwrite-like objects within the DOM and permit the required interplay between them, such that the tagged rect object is actually the same as a rect within svgwrite-like DOM formatting. Loading and saving could then be achieved correctly, perhaps along with rendering and rendered objects, both cached and on the fly, probably flattened and in the correct render order.
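One way bootstrapping could work is a registry from tag name to node class, consulted while loading. The sketch below is an assumption about the mechanism, not svgwrite's or xml.dom's API: SvgNode, Rect, Group and bootstrap are hypothetical names, and ElementTree is used only as the raw parser.

```python
import xml.etree.ElementTree as ET


class SvgNode:
    """Generic node; subclasses that set `tag` register themselves automatically."""
    registry = {}
    tag = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.tag:
            SvgNode.registry[cls.tag] = cls

    def __init__(self, attrs, children=()):
        self.attrs = dict(attrs)
        self.children = list(children)


class Group(SvgNode):
    tag = "g"


class Rect(SvgNode):
    tag = "rect"

    @property
    def area(self):
        return float(self.attrs.get("width", 0)) * float(self.attrs.get("height", 0))


def bootstrap(element):
    """Recursively replace parsed elements with their registered node classes."""
    local = element.tag.rsplit("}", 1)[-1]  # strip the "{namespace}" prefix
    cls = SvgNode.registry.get(local, SvgNode)
    node = cls(element.attrib, [bootstrap(child) for child in element])
    if cls is SvgNode:
        node.tag = local  # unknown tags stay generic but keep their name
    return node


root = bootstrap(ET.fromstring(
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g><rect width="3" height="4"/></g><foo/></svg>'
))
```

Unknown tags fall back to the generic node, so a document can still be loaded and saved even when only some elements have dedicated classes.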

This rendered information would then be able to be utilized to update the dom based on the rendered result. However, without bootstrapping it might be best to simply emulate xml.dom or xml.minidom without actually implementing them. Start with adding reading ability to svgwrite and then rendering ability to svgwrite.

meerk40t / svgelements

Process Dom and CSS correctly as part of a document-structure. #87