Closed dimfeld closed 3 years ago
I will take a look at it. However, I am without a computer until February, so I might take a while. Maybe @otaviopace can be quicker.
Otavio, what do you think about adding itertiools
?
No hurry, this is just for a personal project and I'm using my fork for now. Thanks!
But thanks for the issue
Thanks for the feedback. Made all the above requested changes and got rid of itertools completely!
Benchmark results:
Running target/release/deps/parse-fe183261c12e6e1f
Gnuplot not found, using plotters backend
parse time: [4.9507 us 4.9619 us 4.9758 us]
change: [+0.0064% +0.2933% +0.5524%] (p = 0.04 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
3 (3.00%) high mild
8 (8.00%) high severe
Running target/release/deps/serialize-704ae7e803b9c554
Gnuplot not found, using plotters backend
serde time: [679.15 ns 681.93 ns 684.63 ns]
change: [+0.7656% +1.2731% +1.7743%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) low mild
edn time: [2.6354 us 2.6375 us 2.6397 us]
change: [+1.3466% +1.6104% +1.8172%] (p = 0.00 < 0.05)
Performance has regressed.
Running target/release/deps/tagged_parse-7435533901f55659
Gnuplot not found, using plotters backend
tagged_parse time: [1.3659 us 1.3705 us 1.3759 us]
change: [+12.912% +13.537% +14.131%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
5 (5.00%) high mild
9 (9.00%) high severe
So two issues.
I asked my friend to run this test on the branch and the assert fails: [Update: he misread the test output, but it is a good test to have]
#[test]
fn parse_map_with_special_char_str1() {
let mut edn = "{ :a \"hello\n \r \t \\\"world\\\" with escaped \\ characters\" }".chars().enumerate();
assert_eq!(
parse(edn.next(), &mut edn).unwrap(),
Edn::Map(Map::new(
map! {":a".to_string() => Edn::Str("hello\n \r \t \"world\" with escaped \\ characters".to_string())}
))
);
}
Itertools solution seems to be way more performatic for parsing, from 33% up to 63%. The previous solution if you want to use/test it:
.fold_while((false, String::new()), |(last_was_escape, mut s), (_, c)| {
if last_was_escape {
// Supported escape characters, per https://github.com/edn-format/edn#strings
match c {
't' => s.push('\t'),
'r' => s.push('\r'),
'n' => s.push('\n'),
'\\' => s.push('\\'),
'\"' => s.push('\"'),
_ => {
// On an unsupported escape character, output both the backslash and the character
s.push('\\');
s.push(c);
}
};
FoldWhile::Continue((false, s))
} else if c == '\"' {
// Unescaped quote means we're done
FoldWhile::Done((false, s))
} else if c == '\\' {
FoldWhile::Continue((true, s))
} else {
s.push(c);
FoldWhile::Continue((false, s))
}
})
.into_inner()
.1;
With this I am DONE
merged via https://github.com/naomijub/edn-rs/pull/77
Hmm, interesting. I actually see slower performance with the itertools solution, especially on tagged_parse
which is almost completely string parsing. Anyway, thanks for your help in getting this merged!
I will take a deeper look at it. tagged_parse
didn't show much difference here
Crate published if you want to use it from crates io
Parsing failed with any string containing a
\"
because it ended the string early.