Open Lukas-Stuehrk opened 3 years ago
In its current state you need to call document.remove(child: heading)
in order to detach heading
from the document, turning it into an independent managed node. Or let children = node.removeChildren()
to detach all children of a node at once, effectively unwrapping it.
I'd say that this is a very common use case of this library. E.g. when you try to upgrade SwiftMarkup to use CommonMark version 0.5.0, then you will run into this crash.
Thanks for pointing this out, @Lukas-Stuehrk. @regexident is correct about the correct resolution, which I followed for this PR updating SwiftMarkup to the latest release of CommonMark.
I'd happy to provide a fix for this, but I think we should discuss the approach first. If I'm not mistaken, this is somehow a design flaw in the parent to child relation and a simple
managed
flag is not enough to represent the relationship.
The managed
flag is an admittedly primitive tool for reference management. Did you have another approach in mind?
I'm sorry, I got confused by the changes between 0.4.0 and 0.5.0 and wrongfully assumed it's a bug. I can see that the current behavior is well documented. My bad.
Nevertheless, I think the current behavior can be rather surprising. And it does not feel very swifty. As a user of the library, you always need to keep a mental model of the lifetime of the nodes, otherwise you get a dangling pointer. Basically you can get a dangling pointer by accessing a property. This does not feel right.
In my point of view, a library should make it very easy to use it correctly and very hard to use incorrectly. I think the current state of things was a good step to solve the memory leak problems, but it makes it also very easy to mistakenly use the API.
I can see a couple of different approaches:
Node
instances of children are always created ad-hoc when accessing children. We could create all instances of child nodes upfront and then keep them on the parent. Then, when accessing properties, you will always get the same node instance. Then we would move the lifetime problem to the lifetime of the Swift objects and always free the cmark_node
when the Swift Node
is deinited.
Then you could work with it like you would with DOM. When you keep a reference to a node, this also keeps the rest of the hierarchy in memory and traversing the tree works like you expect. If you don't want to keep a node in it's hierarchy, you would need to unlink it. But you could never create a dangling pointer and you don't have leaks. The memory model is easy to understand, because it's Swift's memory model.
Depending on the use case, this could even be a performance improvement because the nodes are not constantly recreated when walking the node tree. But if you walk the tree only once, then the performance could be worse.OpaquePointer
as identifier (it's Hashable
) and increase a counter when a Node
for it is created and decrease when a Node
for it is deallocated. But it would require some global state. And a lot more thought, because we would also need to consider the hierarchy. I don't know how feasible this is. And it needs some thoughts when nodes get detached..children
, but make it more clear that you should not access and store the children without detaching. Or even make it more complicated by design to work with them, e.g. you need to pass-in a closure that then gets the children and we document it clearly that the children should not escape the closure.However, all of this also sounds like a lot of work. So maybe we should simply add better user guides and keep the current state 😄.
@Lukas-Stuehrk I 100% agree with you. The fix we went for (for now) was something we could do "today". It's far from good, but it at least allows you to use the API safely. Before you couldn't really.
I'm not entirely sure if it's feasible to safely tack new improved semantics onto the underlying memory-model of cmark's heap-allocated C pointers. I might be mistaken though and there is a clean and efficient way. In fact I very much hope so.
We fixed a memory leak that existed for a long time, when we merged #22. Unfortunately, this introduced a new issue: When you create a document, we now mark it as
managed
. When you then reference a child node of the document and no longer keep a reference to the document itself, we will free the node for the document: https://github.com/SwiftDocOrg/CommonMark/blob/main/Sources/CommonMark/Nodes/Node.swift#L35 But this also frees all the child nodes and we suddenly have a use after free when we still have a reference to any child.Minimal test case to reproduce:
Before, this was not happening, because our document node was always leaking, therefore "preventing" this crash 😄.
I'd say that this is a very common use case of this library. E.g. when you try to upgrade SwiftMarkup to use
CommonMark
version0.5.0
, then you will run into this crash.I'd happy to provide a fix for this, but I think we should discuss the approach first. If I'm not mistaken, this is somehow a design flaw in the parent to child relation and a simple
managed
flag is not enough to represent the relationship.