KWARC / rust-libxml

Rust wrapper for libxml2
https://crates.io/crates/libxml
MIT License

Cargo test locking up #107

Open o087D opened 1 year ago

o087D commented 1 year ago

I have come across an issue where cargo test locks up, which I suspect is due to a race condition. The following code exhibits the behavior on my local development machine and on our CI servers:

use libxml::parser::Parser;
use libxml::schemas::{SchemaParserContext, SchemaValidationContext};

fn main() {
    println!("Hello, world!");
}

pub fn do_xml_things(schema: &str, xml: &str) {
    let mut schema_parser = SchemaParserContext::from_buffer(schema);
    let mut xsd = SchemaValidationContext::from_parser(&mut schema_parser).unwrap();

    let doc = Parser::default().parse_string(xml).unwrap();
    xsd.validate_document(&doc).unwrap();
}

#[cfg(test)]
mod test {

    use super::*;

    const SCHEMA: &str = r#"<?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="note">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="to" type="xs:string"/>
          <xs:element name="from" type="xs:string"/>
          <xs:element name="heading" type="xs:string"/>
          <xs:element name="body" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    </xs:schema>"#;

    #[test]
    fn test_a() {
        do_xml_things(
            SCHEMA,
            "<note><to>Anyone</to><from>Me</from><heading /><body /></note>",
        );
    }

    #[test]
    fn test_b() {
        do_xml_things(
            SCHEMA,
            "<note><to>Anyone</to><from>Me</from><heading /><body /></note>",
        );
    }

    #[test]
    fn test_c() {
        do_xml_things(
            SCHEMA,
            "<note><to>Anyone</to><from>Me</from><heading /><body /></note>",
        );
    }

    #[test]
    fn test_d() {
        do_xml_things(
            SCHEMA,
            "<note><to>Anyone</to><from>Me</from><heading /><body /></note>",
        );
    }

    #[test]
    fn test_e() {
        do_xml_things(
            SCHEMA,
            "<note><to>Anyone</to><from>Me</from><heading /><body /></note>",
        );
    }
}

Running cargo test sometimes (about 1 in 3 runs for me) results in all the tests locking up. Running RUST_TEST_THREADS=1 cargo test removes the issue altogether.

I will run the tests single-threaded as a workaround, but I am not sure whether this is an underlying libxml2 issue or something specific to the crate.

Thoughts?
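
For anyone who cannot change how the test runner is invoked, the effect of RUST_TEST_THREADS=1 can be approximated in code by serializing only the XML-touching tests behind a process-wide lock. The following is a minimal sketch, assuming the lock-up comes from concurrent use of libxml2's global state. `XML_LOCK`, `with_xml_lock`, and `fake_xml_work` are hypothetical names and not part of the libxml crate; `fake_xml_work` stands in for `do_xml_things` so the sketch compiles with the standard library alone.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

// Hypothetical process-wide lock; not part of the libxml crate.
static XML_LOCK: Mutex<()> = Mutex::new(());

// Counters used only to observe the serialization in this sketch.
static ACTIVE: AtomicUsize = AtomicUsize::new(0);
static MAX_ACTIVE: AtomicUsize = AtomicUsize::new(0);

/// Runs `f` while holding the global lock, so only one caller at a time
/// executes the guarded section.
fn with_xml_lock<T>(f: impl FnOnce() -> T) -> T {
    // Recover the guard even if another test panicked while holding it.
    let _guard = XML_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    f()
}

/// Stand-in for `do_xml_things`; it only records how many callers overlap.
fn fake_xml_work() {
    let now = ACTIVE.fetch_add(1, Ordering::SeqCst) + 1;
    MAX_ACTIVE.fetch_max(now, Ordering::SeqCst);
    thread::yield_now();
    ACTIVE.fetch_sub(1, Ordering::SeqCst);
}

/// Spawns `n` threads that all run the guarded work and returns the
/// highest number of callers seen inside the guarded section at once.
fn max_overlap(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| with_xml_lock(fake_xml_work)))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    MAX_ACTIVE.load(Ordering::SeqCst)
}

fn main() {
    println!("max overlap: {}", max_overlap(8));
}
```

In a real test module, each test body would wrap its call as `with_xml_lock(|| do_xml_things(SCHEMA, "..."))`. Tests that do not touch libxml2 would still run in parallel, which is the main difference from forcing a single test thread.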

dginev commented 1 year ago

To start with the obvious "no warranty" remark, the top of the wrapper's README currently states:

No thread safety - libxml2's global memory management is a challenge to adapt in a thread-safe way with minimal intervention

If you have already tried RUST_TEST_THREADS=1 and the error disappears, then this is likely one of the (possibly many) ways in which parallel execution can currently bite when using this wrapper.

If it becomes valuable to use in a thread-safe way, we would need to put in more work to make that safe and reasonable...

o087D commented 1 year ago

In my mind I had thought cargo was running each test as a separate process - obviously not the case given the naming of the environment variables to change the behavior.

There is also another workaround. If you call xmlInitParser() in each thread (here, at the start of each test), it works too:

    #[test]
    fn test_a() {
        unsafe {
            libxml::bindings::xmlInitParser();
        }
        do_xml_things(
            SCHEMA,
            "<note><to>Anyone</to><from>Me</from><heading /><body /></note>",
        );
    }

In my use case I will consider making this call in the module where the XML is processed, as that avoids having to change the CI system.

dginev commented 1 year ago

I recently encountered the deadlock and confirmed the resolution by @o087D works for me as well.

I suspect we can use a small PR that adds to parser.rs:

use std::sync::Once;
static INIT_LIBXML_PARSER: Once = Once::new();

// then in Parser::default() and Parser::default_html()
// (parser.rs lives inside the crate, so the bindings are reached via `crate::`)
INIT_LIBXML_PARSER.call_once(|| unsafe {
  crate::bindings::xmlInitParser();
});

which could successfully guard every cargo test for downstream libraries depending on the wrapper. Worth trying out?
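
To illustrate why the `Once` guard proposed above is enough, here is a small standalone sketch showing that `call_once` runs its initializer exactly once, no matter how many threads reach it. It is not the crate's code: `fake_init_parser` is a hypothetical stand-in for `xmlInitParser()` that only counts its calls, so the example builds without libxml2, and `ensure_initialized` and `init_calls_after` are illustrative helpers.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::thread;

// Guard mirroring the proposed INIT_LIBXML_PARSER static.
static INIT: Once = Once::new();

// Counts how often the stand-in initializer actually ran.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for `xmlInitParser()`; it only bumps a counter.
fn fake_init_parser() {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// What a constructor such as `Parser::default()` would call first.
/// Threads that arrive while initialization is running block until it
/// has finished, then continue without running it again.
fn ensure_initialized() {
    INIT.call_once(fake_init_parser);
}

/// Calls `ensure_initialized` from `n` threads and returns the total
/// number of times the initializer has run so far.
fn init_calls_after(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(ensure_initialized))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    INIT_CALLS.load(Ordering::SeqCst)
}

fn main() {
    println!("initializer ran {} time(s)", init_calls_after(8));
}
```

Because `call_once` blocks concurrent callers until the first initialization completes, no thread would proceed into parsing with a half-initialized library. Whether that fully removes the deadlock for every entry point into the wrapper is something the PR would still need to confirm.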