Open aramallo opened 9 months ago
While doing more tests I found that adding a newline after the hash character avoids the failure. Does this suggest the parser is confusing the hash with a LINE_COMMENT?
So, this fails:
?[data] <- [[ ___"#"___]]
But this doesn't fail, yet it returns an empty string as a result, which supports my assumption:
?[data] <- [[ ___"#\n"___]]
Using https://pest.rs I tried validating my assumption, but according to the latest pest file even adding a newline should fail.
The case of an empty raw string
Now when adding #
Adding the newline does not change the tool result
Hope this helps. Unfortunately I am not very good with Rust yet, and not familiar enough with pest to find a solution to contribute.
So adding SOI to the LINE_COMMENT rule solves the problem (but breaks line comments), which means we are on the right track.
LINE_COMMENT = _{ SOI ~ "#" ~ (!"\n" ~ ANY)* }
I've extracted the related rules into a fiddle that shows how this fails.
So I managed to fix the issue at the PEG level. The change consists in making the raw_string_inner pest rule atomic so that we can avoid LINE_COMMENT having precedence over raw_string when # is present.
A fiddle here showing that it works.
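To make the reasoning above concrete, here is a minimal toy model in plain Rust. It does not use pest, and the real grammar may differ. It assumes that a non-atomic rule implicitly skips whitespace and `#` line comments between its tokens, while an atomic rule skips nothing. The names `skip_implicit` and `parse_raw_string` are hypothetical and exist only for this sketch.

```rust
/// Toy stand-in for the implicit skipping assumed for non-atomic rules:
/// whitespace, plus `#` line comments running up to (not including) a newline.
fn skip_implicit(input: &[char], mut pos: usize) -> usize {
    loop {
        let start = pos;
        while pos < input.len() && input[pos].is_whitespace() {
            pos += 1;
        }
        if pos < input.len() && input[pos] == '#' {
            while pos < input.len() && input[pos] != '\n' {
                pos += 1;
            }
        }
        if pos == start {
            return pos;
        }
    }
}

/// Parse a raw string such as `___"text"___` (hypothetical helper).
/// With `atomic == false`, implicit skipping happens between tokens.
fn parse_raw_string(src: &str, atomic: bool) -> Option<String> {
    let input: Vec<char> = src.chars().collect();
    let skip = |pos: usize| if atomic { pos } else { skip_implicit(&input, pos) };

    // Count the opening underscores (the grammar pushes them on a stack).
    let mut pos = 0;
    while pos < input.len() && input[pos] == '_' {
        pos += 1;
    }
    let n = pos;

    // Opening quotation mark.
    pos = skip(pos);
    if input.get(pos) != Some(&'"') {
        return None;
    }
    pos += 1;

    // True when `p` points at a quote followed by `n` underscores.
    let closes_at = |p: usize| {
        input.get(p) == Some(&'"')
            && p + 1 + n <= input.len()
            && input[p + 1..p + 1 + n].iter().all(|&c| c == '_')
    };

    // Inner part: consume characters until the closing delimiter.
    pos = skip(pos);
    let mut inner = String::new();
    while pos < input.len() && !closes_at(pos) {
        inner.push(input[pos]);
        pos += 1;
        pos = skip(pos);
    }

    // Closing quotation mark plus the matching underscores.
    if closes_at(pos) {
        Some(inner)
    } else {
        None
    }
}
```

In this model, `parse_raw_string("___\"#\"___", false)` returns `None` because the "comment" swallows the closing delimiter, `parse_raw_string("___\"#\n\"___", false)` returns an empty string, and with `atomic` set to `true` the first input yields `"#"`. That matches the symptoms described in this thread, but it is only an illustration of the suspected mechanism, not the actual parser.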
I made the change in my fork. However, when I pull it from another project (my cozo binding for Erlang), I still get the same error when running ?[data] <- [[___"#"___]]
I checked that Rust is compiling my fork at the latest commit, as shown below:
Updating git repository `https://github.com/aramallo/cozo.git`
Updating git submodule `https://github.com/facebook/rocksdb.git`
Compiling cozorocks v0.1.7 (https://github.com/aramallo/cozo.git?branch=main#5d252699) <<<<<<<<
Compiling cozo v0.7.6 (https://github.com/aramallo/cozo.git?branch=main#5d252699) <<<<<<<<
Could it be that the change to the pest file has not produced any change in the parser? I am new to Rust and pest, so I am not sure whether I need to run something to generate the Rust parser and then commit that file, or whether pest does this automatically when compiling.
@zh217 Any ideas here?
Sorry, I'm not entirely sure, but that's included here: https://github.com/cozodb/cozo/blob/8b1b60cbf64f2b0ed2a14078cbd0c7838727df2a/cozo-core/src/parse/mod.rs#L39
#[derive(pest_derive::Parser)]
#[grammar = "cozoscript.pest"]
pub(crate) struct CozoScriptParser;
It's a derive macro, which gets run automatically during normal compilation. It looks like pest_derive also accounts for external files changing (per https://github.com/pest-parser/pest/issues/789). So basically there should be no extra work required aside from changing that file.
And that log looks pretty clear, but you might be able to use cargo tree -i to confirm which version of cozo is being pulled in by the dependent project.
(And thanks to this issue for teaching me that cozo supports comments! It didn't appear to be documented when I looked.)
The only solution in the meantime is to pass the values separately and not interpolate anything. But still, this needs fixing.
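A minimal sketch of that workaround, assuming the value is referenced through a `$data` parameter in the script. The helper `build_query` is hypothetical and uses plain `String` values so that it stays self-contained. With the cozo crate the map would hold the library's own value type and be handed to the database together with the script, which is not shown here.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: keep the script text constant and carry the
/// user-supplied value in a separate parameter map, so that a `#` in
/// the value never reaches the query parser as part of a raw string.
fn build_query(value: &str) -> (String, BTreeMap<String, String>) {
    // The script only names the parameter; nothing is interpolated.
    let script = "?[data] <- [[$data]]".to_string();

    let mut params = BTreeMap::new();
    params.insert("data".to_string(), value.to_string());

    (script, params)
}
```

Calling `build_query("#")` leaves the script free of any `#` character, with the value stored under the `data` key.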
The presence of a single # character in any raw string will cause the query parser to fail. The following example fails with reason:
The query parser has encountered unexpected input / end of input at 17..17
Just removing the # char makes it work. I am using Cozo Rust library version 0.7.5.