ocaml-community / sedlex

An OCaml lexer generator for Unicode
MIT License
239 stars, 43 forks

Add bytes API. #146

Closed: toots closed 1 year ago

toots commented 1 year ago

This PR adds an API for tracking positions in bytes. It should be opt-in and backward-compatible.

Fixes: #139

toots commented 1 year ago

The PR is confirmed working with liquidsoap. It also confirmed that we were reporting bogus positions before, because we assumed the returned positions were in bytes, not code points.
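To illustrate why a code-point position gives a bogus result when it is treated as a byte offset, here is a minimal sketch using only the OCaml standard library (it assumes OCaml 4.14 or later for `String.get_utf_8_uchar`). It does not use sedlex itself, and the helper names `code_point_length` and `byte_offset_of_code_point` are hypothetical, chosen for this example only.

```ocaml
(* Count the Unicode code points in a UTF-8 encoded string. *)
let code_point_length (s : string) : int =
  let rec go byte_pos count =
    if byte_pos >= String.length s then count
    else
      let d = String.get_utf_8_uchar s byte_pos in
      (* Advance by the number of bytes this code point occupies. *)
      go (byte_pos + Uchar.utf_decode_length d) (count + 1)
  in
  go 0 0

(* Byte offset at which the code point with index [n] starts. *)
let byte_offset_of_code_point (s : string) (n : int) : int =
  let rec go byte_pos i =
    if i >= n || byte_pos >= String.length s then byte_pos
    else
      let d = String.get_utf_8_uchar s byte_pos in
      go (byte_pos + Uchar.utf_decode_length d) (i + 1)
  in
  go 0 0

let () =
  (* "café = 1" in UTF-8: 'é' takes two bytes. *)
  let s = "caf\xc3\xa9 = 1" in
  Printf.printf "bytes: %d, code points: %d\n"
    (String.length s) (code_point_length s);
  (* '=' is the code point at index 5 but starts at byte 6. *)
  Printf.printf "'=' starts at byte %d\n" (byte_offset_of_code_point s 5)
```

For pure ASCII input the two measures agree, which is probably why this kind of mismatch can go unnoticed until non-ASCII characters appear in the source.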

toots commented 1 year ago

@hhugo I think that this one is ready for merge!

hhugo commented 1 year ago

> @hhugo I think that this one is ready for merge!

I still think comments are weird. See my comments https://github.com/ocaml-community/sedlex/pull/146/files#r1244060086

hhugo commented 1 year ago

Also see my reply in https://github.com/ocaml-community/sedlex/pull/146#discussion_r1244047393

toots commented 1 year ago

> Also see my reply in #146 (comment)

Addressed!

toots commented 1 year ago

> > @hhugo I think that this one is ready for merge!
>
> I still think comments are weird. See my comments https://github.com/ocaml-community/sedlex/pull/146/files#r1244060086

Addressed!

hhugo commented 1 year ago

LGTM

hhugo commented 1 year ago

Also fixes https://github.com/ocaml-community/sedlex/issues/96

hackwaly commented 1 year ago

So, can we use UTF-16 positions as well? The VS Code LSP currently only supports UTF-16 positions.

toots commented 1 year ago

@hackwaly: The lexing_position API returns positions in code points, so UTF-16 positions for UTF-16 input.
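As a side note for LSP users: code-point offsets and UTF-16 code-unit offsets coincide for characters in the Basic Multilingual Plane, but a character above U+FFFF takes two UTF-16 code units while still being a single code point. The sketch below shows the difference using only the OCaml standard library (assuming OCaml 4.14 or later for `Uchar.utf_16_byte_length`). The helper `utf16_units` is a hypothetical name and is not part of sedlex.

```ocaml
(* Number of UTF-16 code units needed to encode a list of code points.
   Each code unit is two bytes, hence the division by 2. *)
let utf16_units (cps : Uchar.t list) : int =
  List.fold_left (fun acc u -> acc + Uchar.utf_16_byte_length u / 2) 0 cps

let () =
  (* U+0061 'a', U+00E9 'é', U+4E2D: all within the BMP. *)
  let bmp = List.map Uchar.of_int [ 0x61; 0xE9; 0x4E2D ] in
  (* U+0061 'a' followed by U+1F600, which needs a surrogate pair. *)
  let astral = List.map Uchar.of_int [ 0x61; 0x1F600 ] in
  Printf.printf "BMP: %d code points, %d UTF-16 units\n"
    (List.length bmp) (utf16_units bmp);
  Printf.printf "astral: %d code points, %d UTF-16 units\n"
    (List.length astral) (utf16_units astral)
```

A tool that needs UTF-16 columns could apply a conversion along these lines to each line's code points before reporting a position.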