paurkedal / ppx_regexp

Matching Regular Expressions with OCaml Patterns
GNU Lesser General Public License v3.0

Not_found exception #8

Closed konstantin-korovin closed 5 years ago

konstantin-korovin commented 5 years ago

Thank you for a nice library.

I have an issue with an unexpected Not_found exception.

The following works as expected:

let parse_line i = 
 match%pcre i with 

   | {|(?<v1>(abc))[[:space:]]*(?<v2>(xyz))|} -> 
     Printf.printf "string: v1: %s v2: %s\n" v1 v2;

   | {|(?<f>[-+]?[[:digit:]]+.[[:digit:]]*)|} -> 
     Printf.printf "float: %f \n" (float_of_string f);

   | _ -> failwith "parse_line"

let () =  parse_line "abc xyz"

test:

string: v1: abc v2: xyz

If I swap the first two match cases:

let parse_line i = 
 match%pcre i with 

   | {|(?<f>[-+]?[[:digit:]]+.[[:digit:]]*)|} -> 
     Printf.printf "float: %f \n" (float_of_string f);

   | {|(?<v1>(abc))[[:space:]]*(?<v2>(xyz))|} -> 
     Printf.printf "string: v1: %s v2: %s\n" v1 v2;

   | _ -> failwith "parse_line"

let () =  parse_line "abc xyz"

test:

Fatal error: exception Not_found


PS: As a side question, is it possible to define a regular expression (or a string representing a regular expression) as an OCaml variable and use it inside {| |}, to avoid duplicating definitions?

Thanks, Konstantin

Drup commented 5 years ago

@paurkedal It seems you have a bug in your handling of offsets! The exception comes from a misaligned Re.get. The bug doesn't happen in the tyre version.

@konstantin-korovin For your side question: ppx_tyre solves precisely that problem. You can look at the documentation. Also, you should use the code block syntax when you post code on GitHub, like so: ```ocaml <the code> ```. I fixed your first message.

konstantin-korovin commented 5 years ago

@Drup many thanks for your quick reply and fixing my message. I'll try tyre.

paurkedal commented 5 years ago

The bug was caused by top-level group elimination being applied when extracting bindings but not when extracting the regular expression. I integrated your test and fixed it. Thanks!
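To illustrate how such a mismatch can surface as Not_found, here is a simplified, stdlib-only model. It is not ppx_regexp's actual code. It assumes that all cases are combined into one regular expression whose capture groups are numbered consecutively, and that looking up a group which did not participate in the match raises Not_found (as Re.Group.get does). The group counts and the `captures` array are hand-written for the input "abc xyz".

```ocaml
(* Hypothetical model of the offset mismatch; all names are illustrative. *)

(* Capture groups each case really contributes to the combined regex:
   float case: (?<f>...)                       -> 1 group
   string case: v1, (abc), v2, (xyz)           -> 4 groups *)
let groups_in_regex = [ 1; 4 ]

(* Counts seen by a binding extractor that eliminates the redundant
   top-level group of the float case. *)
let groups_for_bindings = [ 0; 4 ]

(* Index of the first group of each case; index 0 is the whole match. *)
let first_group_indices counts =
  let _, acc =
    List.fold_left (fun (next, acc) n -> (next + n, next :: acc)) (1, []) counts
  in
  List.rev acc

(* Captures for "abc xyz": group 1 (the float group) did not participate. *)
let captures =
  [| Some "abc xyz"; None; Some "abc"; Some "abc"; Some "xyz"; Some "xyz" |]

(* Raises Not_found for a group that did not match. *)
let get i = match captures.(i) with Some s -> s | None -> raise Not_found

(* Index used for v1 under consistent and inconsistent group counting. *)
let v1_ok = List.nth (first_group_indices groups_in_regex) 1
let v1_bad = List.nth (first_group_indices groups_for_bindings) 1

let () =
  Printf.printf "consistent offsets: v1 = %s\n" (get v1_ok);
  match get v1_bad with
  | s -> Printf.printf "misaligned offsets: v1 = %s\n" s
  | exception Not_found -> print_endline "misaligned offsets: Not_found"
```

Under this model the original case order works because the case with the eliminated top-level group comes last, so no later case has its offsets shifted.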

paurkedal commented 5 years ago

I considered your suggestion, but I decided against implementing (?&...) in %pcre, at least for now. The main reason has to do with scoping. The current PPX assumes all regular expressions are global constants, which makes it easy to compile them at program initialization time. This assumption can be dropped, but doing so would require more complex PPX code to detect the optimal initialization point with respect to scoping.

So I can also recommend looking into %tyre, which may be better suited to the more complex use cases anyway. It leaves the initialization point up to the user, and thus has no issue with scope dependency.
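The scoping trade-off can be sketched without any regex library. In the minimal example below, `compile` is a hypothetical stand-in for an expensive regex compilation step (it only counts calls and compares strings). The other names are illustrative as well and are not part of ppx_regexp or ppx_tyre.

```ocaml
(* Hypothetical stand-in for regex compilation: counts how often it runs
   and returns an exact-match predicate. *)
let compile_count = ref 0

let compile (pattern : string) : string -> bool =
  incr compile_count;
  fun s -> String.equal s pattern

(* A global constant pattern has no free local variables, so it can be
   compiled once, at program initialization. *)
let abc_matcher = compile "abc"
let is_abc s = abc_matcher s

(* A pattern that refers to a local variable cannot be hoisted to the top
   level; here it is compiled inside the scope of [prefix], on every call. *)
let matches_with_prefix prefix s = (compile (prefix ^ "c")) s

let () =
  assert (is_abc "abc");
  assert (not (is_abc "abd"));
  Printf.printf "compilations after global use: %d\n" !compile_count;
  assert (matches_with_prefix "ab" "abc");
  assert (matches_with_prefix "xy" "xyc");
  Printf.printf "compilations after local use: %d\n" !compile_count
```

A PPX that allowed references to local definitions would have to find, for each pattern, the outermost point where all referenced names are in scope, which is the extra complexity described above.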

I'll prepare a bugfix release tomorrow.

paurkedal commented 5 years ago

This was fixed in v0.4.2.