daypack-dev / timere

OCaml date time handling and reasoning suite
MIT License

Angstrom-based parser causes stack overflow in JavaScript output #67

Closed zbaylin closed 1 year ago

zbaylin commented 1 year ago

This was pretty hard to debug because any JS executable that includes timere or timedesc crashes immediately. To replicate, simply create an empty OCaml file and build it with a dune stanza containing

 (libraries timedesc)
 (modes js)

Then run node on the generated file with a large stack trace limit and source maps enabled:

node --stack-trace-limit=9999999999999999 _build/default/src/timere_empty.bc.js

This produces a (very) long stacktrace with this at the bottom:

    ...
    at caml_call3 (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:32762:28)
    at half_compressed_of_string (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:35666:17)
    at half_compressed_of_string_exn (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:35670:17)
    at /Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:35687:25
    at Object.<anonymous> (/Users/zbaylin/Development/timere-js-test/_build/default/src/timere_empty.bc.js:38071:3)

half_compressed_of_string lives here: https://github.com/daypack-dev/timere/blob/f8e2fd5e1d6fa415a4741da8cf311f8c4dbca921/timedesc/time_zone.ml#L651, which seems to be the culprit.

I assume this has to do with the new Angstrom-based parser, but I haven't gone into the semantics to figure out why.

darrenldl commented 1 year ago

Oh huh, this is curious - my recollection is that it runs fine in the browser, though that may be a faulty recollection as well.

@glennsl were you using JS version of timedesc? Do you have any similar experience?

darrenldl commented 1 year ago

I'll try adding more commit calls (or whatever stops Angstrom from backtracking) in the meantime...

glennsl commented 1 year ago

> @glennsl were you using JS version of timedesc? Do you have any similar experience?

I can't recall having any technical issues at all! But the project I was using it for unfortunately collapsed (for entirely unrelated reasons!) so I haven't tried it after you overhauled the dependencies.

darrenldl commented 1 year ago

> so I haven't tried it after you overhauled the dependencies.

Ah, oh well.


I tried adding some commit calls to no avail, but found out that the number passed to Angstrom.count matters a lot:

      let half_compressed : string M.t Angstrom.t =
        BE.any_uint16 >>=
        (fun table_count ->
           count table_count (commit *> half_compressed_name_and_table) >>|
           (fun l ->
              l
              |> List.to_seq
              |> M.of_seq
           )
        )

If I swap table_count with 100 it's fine, but crashes at 200 - the total number of tables is in the 200's I think.

So I suspect count might be the culprit here, though I am still not sure what the proper fix should be...
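
To illustrate why the repetition count could matter, here is a minimal sketch using a toy parser type rather than Angstrom itself. The names `parser`, `any_char`, `count_recursive` and `count_loop` are made up for this example and do not reflect how Angstrom implements `count` internally. It only shows the general pattern: a repetition combinator that recurses once per element needs stack depth proportional to the count, whereas an accumulator-based version does not.

```ocaml
(* Toy parser type (hypothetical, not Angstrom's): takes the input and an
   offset, returns the parsed value and the new offset, or None on failure. *)
type 'a parser = string -> int -> ('a * int) option

(* Parse any single character. *)
let any_char : char parser = fun s pos ->
  if pos < String.length s then Some (s.[pos], pos + 1) else None

(* Non-tail-recursive repetition: every element keeps a stack frame alive
   until the remaining elements have been parsed, so depth grows with n. *)
let rec count_recursive n (p : 'a parser) : 'a list parser = fun s pos ->
  if n <= 0 then Some ([], pos)
  else
    match p s pos with
    | None -> None
    | Some (x, pos') ->
      (match count_recursive (n - 1) p s pos' with
       | None -> None
       | Some (xs, pos'') -> Some (x :: xs, pos''))

(* Accumulator-based repetition: the recursive call is in tail position,
   so native OCaml runs it in constant stack space. *)
let count_loop n (p : 'a parser) : 'a list parser = fun s pos ->
  let rec go n acc pos =
    if n <= 0 then Some (List.rev acc, pos)
    else
      match p s pos with
      | None -> None
      | Some (x, pos') -> go (n - 1) (x :: acc) pos'
  in
  go n [] pos
```

Both versions return the same results; only the stack usage differs. Whether a given tail call actually becomes a loop in the generated JavaScript depends on the shape of the code, so this sketch is not a claim about what fixes the Angstrom case.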

darrenldl commented 1 year ago

Okay, someone made the presumably same discovery: https://github.com/inhabitedtype/angstrom/issues/221

darrenldl commented 1 year ago

Using the tail-recursive version mentioned in the issue doesn't seem to help - it probably isn't being compiled into a loop during JS compilation.

Looks like I'll have to hand roll a parser for this.
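
For anyone curious what a hand-rolled replacement might look like, below is a rough sketch under loud assumptions: the binary layout (a big-endian uint16 entry count, then per entry a uint8 name length, the name, a big-endian uint16 payload length, and the payload) is invented for illustration and is not timedesc's actual format, and `parse_tables`, `read_uint8`, `read_uint16_be`, `read_bytes` and `Parse_error` are hypothetical names. The point is only that an explicit `for` loop over a mutable offset keeps the stack depth constant regardless of the number of tables.

```ocaml
(* Hypothetical layout, NOT timedesc's real format:
   [uint16 BE: entry count], then for each entry
   [uint8: name length][name][uint16 BE: payload length][payload] *)

exception Parse_error of string

(* Read one byte at the current offset and advance it. *)
let read_uint8 (s : string) (pos : int ref) : int =
  if !pos >= String.length s then raise (Parse_error "unexpected end of input");
  let v = Char.code s.[!pos] in
  incr pos;
  v

(* Read a big-endian 16-bit unsigned integer. *)
let read_uint16_be (s : string) (pos : int ref) : int =
  let hi = read_uint8 s pos in
  let lo = read_uint8 s pos in
  (hi lsl 8) lor lo

(* Read exactly [len] bytes as a string. *)
let read_bytes (s : string) (pos : int ref) (len : int) : string =
  if !pos + len > String.length s then
    raise (Parse_error "unexpected end of input");
  let v = String.sub s !pos len in
  pos := !pos + len;
  v

(* Parse all entries with an explicit loop: no recursion per entry. *)
let parse_tables (s : string) : (string * string) list =
  let pos = ref 0 in
  let table_count = read_uint16_be s pos in
  let acc = ref [] in
  for _i = 1 to table_count do
    let name_len = read_uint8 s pos in
    let name = read_bytes s pos name_len in
    let payload_len = read_uint16_be s pos in
    let payload = read_bytes s pos payload_len in
    acc := (name, payload) :: !acc
  done;
  List.rev !acc
```

The resulting association list could then be turned into a map with `List.to_seq` and `M.of_seq`, as in the Angstrom version above.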

darrenldl commented 1 year ago

@zbaylin I pushed a fix and tried it with Node.js, and it seems to be working now - can you try pinning to the main branch and see if the issue is gone (in both the test setup and the actual code)?

Thanks for investigating btw!

darrenldl commented 1 year ago

Closing this as the mentioned fix is included in Timedesc 0.9.1 and seems to sidestep the issue.