mirage / ocaml-rpc

Light library to deal with RPCs in OCaml
ISC License

Alternative to Internals.encode #133

Open lindig opened 5 years ago

lindig commented 5 years ago

This is just an idea: the escape rules for XML (or other transports) could also be delegated to an efficient regexp-based scanner specified with OCamlLex:

{
    (* short names for important modules *)
    module L = Lexing 
    module B = Buffer

let get      = L.lexeme

exception Error of string
let error fmt = Printf.kprintf (fun msg -> raise (Error msg)) fmt

}

rule escape b = parse
| '&'       { B.add_string b "&amp;";  escape b lexbuf }
| '"'       { B.add_string b "&quot;"; escape b lexbuf }
| '\''      { B.add_string b "&apos;"; escape b lexbuf }
| '>'       { B.add_string b "&gt;";   escape b lexbuf }
| '<'       { B.add_string b "&lt;";   escape b lexbuf }
| [^'&' '"' '\'' '>' '<']+ 
            { B.add_string b @@ get lexbuf
            ; escape b lexbuf
            }
| eof       { let x = B.contents b in B.clear b; x }
| _         { error "don't know how to quote: %s" (get lexbuf) }

{
    let escape str = escape (B.create 100) (L.from_string str)
}

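
For comparison, the behaviour the scanner above is aiming for can be sketched in plain OCaml, without running ocamllex. This is only a reference under the assumption that the targets are XML's five predefined entities; `escape_ref` is a hypothetical name and not part of ocaml-rpc.

```ocaml
(* Plain-OCaml reference for the escaping the scanner is meant to perform.
   Assumption: the replacements are XML's five predefined entities. *)
let escape_ref (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '&' -> Buffer.add_string b "&amp;"
      | '"' -> Buffer.add_string b "&quot;"
      | '\'' -> Buffer.add_string b "&apos;"
      | '>' -> Buffer.add_string b "&gt;"
      | '<' -> Buffer.add_string b "&lt;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  (* Special characters are replaced, everything else is copied as-is. *)
  assert (escape_ref "a < b && \"c\"" = "a &lt; b &amp;&amp; &quot;c&quot;");
  assert (escape_ref "plain" = "plain")
```

A function like this could serve as a test oracle for the generated scanner: both should agree on every input string.
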
mseri commented 4 years ago

This could also be useful in other places, such as the JSON parser; see e.g. https://github.com/nojb/tinyjson/blob/master/lib/json.mll

On a separate note, I think markup.ml has the most comprehensive dictionary of entities for encode and decode: https://github.com/aantron/markup.ml/blob/master/src/entities.ml
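
To illustrate what a shared dictionary for encode and decode could look like, here is a minimal sketch in plain OCaml. It assumes only XML's five predefined entities; the names `entities`, `encode` and `decode` are hypothetical, and this is neither markup.ml's API nor its table, which is far larger.

```ocaml
(* Hypothetical mini-dictionary: entity name paired with the character
   it stands for. Used in both directions below. *)
let entities =
  [ ("amp", '&'); ("quot", '"'); ("apos", '\''); ("gt", '>'); ("lt", '<') ]

(* Replace every character that has an entry in the table. *)
let encode (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match List.find_opt (fun (_, c') -> c' = c) entities with
      | Some (name, _) -> Buffer.add_string b ("&" ^ name ^ ";")
      | None -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Replace every "&name;" whose name is in the table; anything else,
   including unknown or unterminated references, is copied unchanged. *)
let decode (s : string) : string =
  let n = String.length s in
  let b = Buffer.create n in
  let rec go i =
    if i < n then
      match s.[i] with
      | '&' -> (
          match String.index_from_opt s i ';' with
          | Some j -> (
              let name = String.sub s (i + 1) (j - i - 1) in
              match List.assoc_opt name entities with
              | Some c -> Buffer.add_char b c; go (j + 1)
              | None -> Buffer.add_char b '&'; go (i + 1))
          | None -> Buffer.add_char b '&'; go (i + 1))
      | c -> Buffer.add_char b c; go (i + 1)
  in
  go 0;
  Buffer.contents b

let () =
  let s = "a < b & 'c'" in
  assert (encode s = "a &lt; b &amp; &apos;c&apos;");
  assert (decode (encode s) = s)
```

Keeping one table for both directions means encode and decode cannot drift apart; a larger dictionary would likely want a hash table rather than a list.
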

lindig commented 4 years ago

This is a nice example of the power of OCamlLex and of how to use its different states (or sub-scanners) effectively.