KWARC / rust-libxml

Rust wrapper for libxml2
https://crates.io/crates/libxml
MIT License
76 stars 38 forks source link

Allow for higher Rc count mutability treshold #38

Closed dginev closed 6 years ago

dginev commented 6 years ago

As it so happens, one of my private projects using the grate hit an error with an Rc strong count of 5 (five)! Which was shocking.

One of the techniques I use there is legitimate - I am similarly bookkeeping nodes as a "current insertion point" during a document construction traversal that keeps absorbing content and moving up-and-down the tree as appropriate.

So I would expect an Rc strong count of at least 3 to be acceptable. And possibly even 5 with the specific function in question. The code is pretty directly ported from Perl's use of Nodes in XML::LibXML, and naturally Perl allows the mutability conflicts without any concern. I could try to completely rework that bit of code, and that may be a good idea for new projects, but for now I would prefer to have the simple option of raising the threshold.

Time for a quick 0.2.1 ?

dginev commented 6 years ago

Won't release a 0.2.1 yet, since it's unclear this upgrade is urgent to anyone else.

Also, the level of Rc tolerance I needed was a strong count of 80 !! Which indicates I am leaking Rc clones left and right in that third-party crate. So I will be reworking the Node-related code in it soon, probably best to do in any case since those are a lot of wasted references.

dginev commented 6 years ago

Fascinating, there was indeed a buggy leak of Rc's in my code, that I still don't fully understand, but have at least localized to this snippet of code:

      let mut results = Vec::new();
      let newly_created: Vec<Node> = self.constructed_nodes.drain(0..).collect(); 
      for node in &newly_created {
        self.record_constructed_node(Some(&node));
      }
      results.extend(newly_created);

where the call to self.record_constructed_node performs a node.clone() internally, and the constructed_nodes vector contains those nodes.

The interesting bit is that even if I don't return the results vector and it gets dropped at the end of the function call, the 80+ Rc counts still leak out. So I am wondering if this type of setup doesn't produce a circular reference of some kind, which is known to make it impossible to deallocate the Rc.

All in all, the 0.2.0 upgrade found a real leaky bug in my code, so definitely happy about things!!