nix-community / rnix-parser

A Nix parser written in Rust [maintainer=@oberblastmeister]
MIT License

`StrPart` parser unescapes double-single-quote strings containing double quotes incorrectly #69

Closed by fogti 2 years ago

fogti commented 2 years ago

Describe the bug

rnix fails to parse a segment of nixpkgs/lib/systems/parse.nix containing a double-single-quote string that contains double quotes.

Code Snippet to reproduce

https://github.com/NixOS/nixpkgs/blob/698d8759c90f2e811b1044b999b91a0e963772bb/lib/systems/parse.nix#L343-L345

''
  The "android" ABI is not for 32-bit ARM. Use "androideabi" instead.
''

Test case for value.rs

diff --git a/src/value.rs b/src/value.rs
index 05d5674..4b093f2 100644
--- a/src/value.rs
+++ b/src/value.rs
@@ -330,6 +330,18 @@ mod tests {
                 ))
             ]
         );
+
+        assert_eq!(
+            string_parts(&string_node(
+                // found in https://github.com/NixOS/nixpkgs/blob/698d8759c90f2e811b1044b999b91a0e963772bb/lib/systems/parse.nix#L343-L345
+                r#"The "android" ABI is not for 32-bit ARM. Use "androideabi" instead."#
+            )),
+            vec![
+                StrPart::Literal(String::from(
+                    "The \"android\" ABI is not for 32-bit ARM. Use \"androideabi\" instead.\n"
+                ))
+            ]
+        );
     }
     #[test]
     fn values() {

Current behavior

Parsing of the string stops after "The ". See also: https://gist.github.com/zseri/bb8ed4f47123960f4231d6fa928599ac#file-parse-js-L1093

Expected behavior

The string should parse into a single StrPart::Literal that contains the full text, including the double quotes.

Additional context

rnix version: 2e948656f3f5a3981729c21a5ff7cceaff3f8d9a

Ma27 commented 2 years ago

I'd expect the test case you posted to fail, since there are no quotes for Nix to parse after the `r#"`, which seems to be the Rust-level quotation.
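For readers unfamiliar with the `r#"..."#` syntax, here is a minimal sketch of how a Rust raw string behaves. The function name `raw_and_escaped` is made up for illustration and is not part of rnix. The `r#"` and `"#` delimiters belong to Rust only, so any `"` in between ends up in the string value unchanged.

```rust
/// Hypothetical illustration, not part of rnix: returns the same text
/// written once as a raw string literal and once with backslash escapes.
fn raw_and_escaped() -> (&'static str, &'static str) {
    // Raw string: `r#"` opens, `"#` closes, inner `"` need no escaping.
    let raw = r#"The "android" ABI"#;
    // Ordinary string literal: inner `"` must be escaped as `\"`.
    let escaped = "The \"android\" ABI";
    (raw, escaped)
}
```

Both literals produce the same `&str`, so whatever Nix-level quoting the parser sees has to be added separately, for example by a test helper.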

Also, an expression such as the one you posted:

''
  The "android" ABI is not for 32-bit ARM. Use "androideabi" instead.
''

seems to be parsed perfectly fine:

$ cargo run --example dump-ast foo
    Finished dev [unoptimized + debuginfo] target(s) in 0.03s
     Running `target/debug/examples/dump-ast foo`
NODE_ROOT 0..76 {
  NODE_STRING 0..75 {
    TOKEN_STRING_START("\'\'") 0..2
    TOKEN_STRING_CONTENT("\n  The \"android\" ABI is not for 32-bit ARM. Use \"androideabi\" instead.\n") 2..73
    TOKEN_STRING_END("\'\'") 73..75
  }
  TOKEN_WHITESPACE("\n") 75..76
}

Am I missing something?

fogti commented 2 years ago

The StrPart API in value.rs doesn't correctly reconstruct the single contained StrPart::Literal string. Keep in mind that the test case I posted uses the string_node function, which adds the double-single-quotes.

Probably at fault:
https://github.com/nix-community/rnix-parser/blob/2e948656f3f5a3981729c21a5ff7cceaff3f8d9a/src/value.rs#L267-L275
https://github.com/nix-community/rnix-parser/blob/2e948656f3f5a3981729c21a5ff7cceaff3f8d9a/src/value.rs#L47
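To make the expected result concrete, here is a minimal sketch of how the content of a double-single-quote string could be turned into its literal value. It is not rnix's actual code and makes no claim about what the linked lines do. The function names `strip_indent`, `unescape_indented` and `indented_string_value` are hypothetical, and interpolation (`${...}`) is deliberately left out. It assumes the usual Nix rules for these strings: a leading blank line is dropped, the common indentation is removed, and only `''` can start an escape (`'''`, `''$`, `''\x`), so a `"` is an ordinary character.

```rust
/// Hypothetical helper, not part of rnix: drops a leading blank line and
/// removes the smallest indentation shared by all non-blank lines.
fn strip_indent(content: &str) -> String {
    let mut lines: Vec<&str> = content.split('\n').collect();
    // The first line is dropped when it holds nothing but spaces.
    if lines.first().map_or(false, |l| l.chars().all(|c| c == ' ')) {
        lines.remove(0);
    }
    // Smallest count of leading spaces over all non-blank lines.
    let indent = lines
        .iter()
        .filter(|l| !l.trim().is_empty())
        .map(|l| l.len() - l.trim_start_matches(' ').len())
        .min()
        .unwrap_or(0);
    lines
        .iter()
        // Lines shorter than the indentation become empty.
        .map(|l| l.get(indent..).unwrap_or(""))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Hypothetical helper, not part of rnix: resolves the escape sequences
/// of a double-single-quote string in already extracted content.
fn unescape_indented(content: &str) -> String {
    let mut out = String::with_capacity(content.len());
    let mut chars = content.chars().peekable();
    while let Some(c) = chars.next() {
        // Only `''` can start an escape; everything else, including `"`,
        // is copied through unchanged.
        if c != '\'' || chars.peek() != Some(&'\'') {
            out.push(c);
            continue;
        }
        chars.next(); // consume the second `'`
        match chars.next() {
            Some('\'') => out.push_str("''"), // `'''` yields `''`
            Some('$') => out.push('$'),       // `''$` yields `$`
            Some('\\') => match chars.next() {
                Some('n') => out.push('\n'),
                Some('r') => out.push('\r'),
                Some('t') => out.push('\t'),
                Some(other) => out.push(other),
                None => {}
            },
            // A bare `''` would terminate the string in real Nix source,
            // so it should not occur here; keep it verbatim if it does.
            Some(other) => {
                out.push_str("''");
                out.push(other);
            }
            None => out.push_str("''"),
        }
    }
    out
}

/// Strips indentation first, then resolves escapes, so that a newline
/// produced by `''\n` cannot influence the indentation computation.
fn indented_string_value(content: &str) -> String {
    unescape_indented(&strip_indent(content))
}
```

Under these assumptions, passing the TOKEN_STRING_CONTENT text from the dump-ast output above to `indented_string_value` gives the literal expected by the test case, with both pairs of double quotes intact and a single trailing newline.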

Ma27 commented 2 years ago

Ohh I see. I'll try to take a look soonish.