ocaml-community / sedlex

An OCaml lexer generator for Unicode
MIT License

Opam package not exposing all of Sedlexing module #85

Closed Arch199 closed 1 year ago

Arch199 commented 5 years ago

I installed sedlex with opam and was trying to get integration with Menhir working when I attempted to use the Sedlexing.with_tokenizer function. However, it came back as unbound and I noticed that the interface being exposed was just this:

module Sedlexing :
  sig
    type lexbuf
    exception InvalidCodepoint of int
    exception MalFormed
    val create : (int array -> int -> int -> int) -> lexbuf
    val from_gen : int Gen.t -> lexbuf
    val from_stream : int Base.Stream.t -> lexbuf
    val from_int_array : int array -> lexbuf
    val lexeme_start : lexbuf -> int
    val lexeme_end : lexbuf -> int
    val loc : lexbuf -> int * int
    val lexeme_length : lexbuf -> int
    val lexeme : lexbuf -> int array
    val lexeme_char : lexbuf -> int -> int
    val sub_lexeme : lexbuf -> int -> int -> int array
    val rollback : lexbuf -> unit
    val start : lexbuf -> unit
    val next : lexbuf -> int
    val mark : lexbuf -> int -> unit
    val backtrack : lexbuf -> int
    module Latin1 : sig ... end
    module Utf8 : sig ... end
    module Utf16 : sig ... end
  end

which is contrary to the sedlexing.mli file and doesn't seem to be documented as such. (Though I did read something about some of the functions being considered part of the "internal interface".) However, when I cloned the repo and built from source, I had access to the full signature (including with_tokenizer, etc.), so this seems to be an issue with the opam package install.

pmetzger commented 5 years ago

Where are you getting that module signature for what you believe to be the "exposed interface"? I don't seem to be able to reproduce your problem. What version of the opam package are you installing, and how are you getting this "exposed interface" module signature?

Arch199 commented 5 years ago

I am running Ubuntu 18.04 on WSL with opam version 1.2.2, which installed sedlex 1.99.4. I got the output above by running #require "sedlex";; and then #show Sedlexing;; in utop.

toots commented 5 years ago

Are you sure of the version used by topfind? I have an API that looks correct here:

# #show Sedlexing;;
module Sedlexing :
  sig
    type lexbuf
    exception InvalidCodepoint of int
    exception MalFormed
    val create : (Uchar.t array -> int -> int -> int) -> lexbuf
    val set_position : lexbuf -> Lexing.position -> unit
    val set_filename : lexbuf -> string -> unit
    val from_gen : Uchar.t Gen.t -> lexbuf
    val from_stream : Uchar.t Stream.t -> lexbuf
    val from_int_array : int array -> lexbuf
    val from_uchar_array : Uchar.t array -> lexbuf
    val lexeme_start : lexbuf -> int
    val lexeme_end : lexbuf -> int
    val loc : lexbuf -> int * int
    val lexeme_length : lexbuf -> int
    val lexing_positions : lexbuf -> Lexing.position * Lexing.position
    val new_line : lexbuf -> unit
    val lexeme : lexbuf -> Uchar.t array
    val lexeme_char : lexbuf -> int -> Uchar.t
    val sub_lexeme : lexbuf -> int -> int -> Uchar.t array
    val rollback : lexbuf -> unit
    val start : lexbuf -> unit
    val next : lexbuf -> Uchar.t option
    val mark : lexbuf -> int -> unit
    val backtrack : lexbuf -> int
    val with_tokenizer :
      (lexbuf -> 'token) ->
      lexbuf -> unit -> 'token * Lexing.position * Lexing.position
    module Latin1 : sig ... end
    module Utf8 : sig ... end
    module Utf16 : sig ... end
  end
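
For readers unfamiliar with the shape of with_tokenizer in the signature above, here is a minimal sketch of a function with the same type. It uses only the standard library: toy_lexbuf, position_of_offset and next_char are hypothetical stand-ins, not sedlex's real lexbuf or lexer, and the positions assume single-line input.

```ocaml
(* Toy stand-in for a lexer buffer. This is NOT sedlex's lexbuf type. *)
type toy_lexbuf = {
  input : string;
  mutable start_pos : int;  (* offset where the last token began *)
  mutable cur_pos : int;    (* offset just past the last token *)
}

(* Build a Lexing.position from a plain offset (single-line input assumed). *)
let position_of_offset offset =
  { Lexing.pos_fname = ""; pos_lnum = 1; pos_bol = 0; pos_cnum = offset }

(* Hypothetical lexer: each call yields the next character, or None at end. *)
let next_char buf =
  buf.start_pos <- buf.cur_pos;
  if buf.cur_pos >= String.length buf.input then None
  else begin
    let c = buf.input.[buf.cur_pos] in
    buf.cur_pos <- buf.cur_pos + 1;
    Some c
  end

(* Same shape as the signature above:
   (lexbuf -> 'token) -> lexbuf -> unit -> 'token * position * position.
   It turns a lexer into a supplier of tokens with start and end positions. *)
let with_tokenizer lexer buf =
  fun () ->
    let token = lexer buf in
    (token, position_of_offset buf.start_pos, position_of_offset buf.cur_pos)

let supplier =
  with_tokenizer next_char { input = "ab"; start_pos = 0; cur_pos = 0 }
```

A supplier of type unit -> 'token * Lexing.position * Lexing.position is the kind of value a Menhir-generated parser can consume through MenhirLib's conversion helpers, which is why this function matters for the integration described in the issue.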

Arch199 commented 5 years ago

As far as I can tell, it's using the same version: opam show sedlex and #list;; in utop both agree that the version is 1.99.4. Is there some other way to check the version topfind is using?

I also tried uninstalling sedlex via opam and confirmed that it disappeared from my package list and was inaccessible. Reinstalling it produced the same result as before, so it doesn't seem like there's another version hiding on my computer somewhere. I also tried installing sedlex 1.99.3, which was the only other version compatible with my OCaml version of 4.05.0, with the same result.

Not sure if this helps, but my .ocamlinit file looks like this:

(* Added by OPAM. *)
let () =
  try Topdirs.dir_directory (Sys.getenv "OCAML_TOPLEVEL_PATH")
  with Not_found -> ()
;;

#use "topfind";;
#thread;;
#require "core.top";;
#require "core.syntax";;

#require "ppx_jane";;
#require "ppx_deriving";;
#require "ppx_deriving.show";;

I tried removing the statements requiring the Jane Street libraries to see if that might be the issue, but it made no difference. One strange thing is that not only is the API missing some functions, but the signatures of the remaining ones are also different, e.g. with int instead of Uchar.t. (Could Uchar.t happen to be int under the hood, and this is getting substituted in somehow?)

pmetzger commented 5 years ago

int is the old signature from before Uchar.t existed. I can't reproduce this either. I think that somehow you're getting a weird and old version of the library.
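
To illustrate the difference between the two signatures, here is a small sketch using only the standard library's Uchar module. The sample code points and the names old_lexeme, new_lexeme, round_trip and surrogate_is_valid are made up for illustration. As far as I know, Uchar.t is abstract in the standard library interface, so the toplevel would not silently print it as int, and moving between the two representations takes an explicit conversion.

```ocaml
(* Old-style data: code points as plain ints, as in the int-based signature. *)
let old_lexeme : int array = [| 0x48; 0x69; 0x20AC |]

(* New-style data: the same code points as Uchar.t values.
   Uchar.of_int raises Invalid_argument on an invalid scalar value. *)
let new_lexeme : Uchar.t array = Array.map Uchar.of_int old_lexeme

(* Going back to ints also needs an explicit conversion. *)
let round_trip : int array = Array.map Uchar.to_int new_lexeme

(* Uchar.is_valid reports whether an int is a Unicode scalar value;
   surrogate code points such as 0xD800 are not. *)
let surrogate_is_valid = Uchar.is_valid 0xD800
```

Code written against the int-based interface would therefore fail to type-check against the Uchar.t-based one, which fits the explanation that the two outputs come from different versions of the interface.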

hhugo commented 1 year ago

Should we close?

toots commented 1 year ago

Yes. There have been some cases in the past of projects shipping their own .mli for Sedlex.
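
As a rough analogy for how a stray interface can hide functions, the sketch below seals a module with a narrower signature. Full, OLD_API and Shadowed are hypothetical names, and this is not sedlex code. In the situation described above, the narrowing would presumably come from a different compiled interface being found on the load path rather than from an explicit signature constraint, but the visible effect is similar.

```ocaml
(* Hypothetical stand-in for a library's full implementation. *)
module Full = struct
  let lexeme_length s = String.length s
  let with_tokenizer f s () = f s
end

(* A narrower interface that omits with_tokenizer. *)
module type OLD_API = sig
  val lexeme_length : string -> int
end

(* Sealing Full with OLD_API hides everything the signature does not list. *)
module Shadowed : OLD_API = Full

(* Functions listed in the signature remain usable. *)
let n = Shadowed.lexeme_length "abc"

(* Referring to Shadowed.with_tokenizer here would be rejected at compile
   time as an unbound value, even though the implementation defines it. *)
```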