daypack-dev / timere

OCaml date time handling and reasoning suite
MIT License
68 stars 7 forks source link

A novel fix to the tzdata issue #17

Closed Drup closed 3 years ago

Drup commented 3 years ago

Behold.

Fix #16 I guess. At least it works now.

It would probably be good to extend CI a little bit.

gasche commented 3 years ago

I managed to force @Drup into explaining what's going on in this PR, which may be useful for our dear readers of the future:

After several experiments, I concluded that the sheer number of constant sub-expression is the real problem. Flambda lifts everything to toplevel and has a traversal on toplevel expression that is not tail-rec, and crashes.

I decided upon a fairly radical solution: we build a store of the data in advance (in sexp), which we distribute with the sources. At build time, we load the data, put it into the ocaml data-structure we want in the end, and Marshall that in a new ML file. It works fine, it's more compact and is faster to load.

(At build time, we create a module whose only content is let s = "...."(containing the marshalled data), and we link it in the library.)

(P.S. This stuff should also have gone into a commit message, but this ship has sailed.)

darrenldl commented 3 years ago

Perhaps could go into a OCaml knowledge base/wiki somewhere

gasche commented 3 years ago

"Don't generate <some more professional synonym of "shitloads"> of code except if you are willing to take some time to placate the compiler." :-)

Drup commented 3 years ago

"Don't generate <some more professional synonym of "shitloads"> of code except if you are willing to take some time to placate the compiler." :-)

Well, another interpretation is that there is no proper solution to define constants of arbitrary data-type in the language, so we have to rely on various hacks to go around that. It turns around the dirty solution, consisting of Marshalling the whole structure in a string and demarshall it at initialization of the program, is better supported by the compiler than the clean solution of emitting the code to build the constant.

It's not the first time the problem has come up (and it's even worse for metaocaml).

gasche commented 3 years ago

Sure, but that's an accurate description of the current state of things, and there are good reasons why things are not better (... it's a lot of work to fix). Are you volunteering to review my TRMC PR?

Drup commented 3 years ago

Are you volunteering to review my TRMC PR?

The link between the two is a bit tenuous but ... don't you already have 3 reviewers for that one ? Isn't that enough ? I though the lack of reviewer was not the limiting factor for your PR.

gasche commented 3 years ago

The PR will make it sensibly easier to make some parts of the compiler tail-recursive. It does not solve the issue completely, we need work on readable CPS notations as well. But mentioning it specifically was also a way to point out at the fact that few people invest their time in solving the problem: it was submitted months or years ago (depending on which version we use as a reference) and encountered mostly silence.

Elarnon did an excellent review job but he is now one of the co-authors (and not an Approver). pchambart and let-def are rumored to review it, and pchambart is trying to pretend that he is in the process of reviewing. I agree that today it's less of a priority (to find a new reviewer) than it was three months ago.

Drup commented 3 years ago

I'm well placed to know how annoying it is to wait for reviewers, but I don't think I should get involved in that one, too many cook, etc.

FTR, this is the traversal that explodes: https://github.com/ocaml/ocaml/blob/trunk/middle_end/flambda/flambda_iterators.ml#L700-L792 It's not immediately obvious to me that TRMC would help, at least without giving up on sharing.

gasche commented 3 years ago

Yes, the after-the-fact sharing makes this function non-TRMC, so you need a proper CPS here.

While reading this code I came up with the following feature request for a low-level language feature as an extension point. I apologize to @darrenldl for the very out-of-place tangent that this thread is becoming, but let me dump it here nonetheless:

[%share old K(e1,e2,...)]
(* desugars into *)
match old with
| K(o1, o2) ->
  let n1 = [%try-share o1 e1] in
  let n2 = [%try-share o2 e2] in
  ...
  if o1 == n1 && o2 == n2 && .. then old
  else K(n1, n2, ...)

(* with [%try-share ...] defined by: *)

[%try-share old K(...)]
(* desugars into *)
[%share old K(...)]

[%try-share old e]
(* desugars into, when e is not a constructor application *)
e
Drup commented 3 years ago

It's not a bad idea, but the syntax need to be on a pattern/branch, to avoid re-matching. | [%share K(o1,o2,...)] -> ...... K (e1,e2,...) where the last bit must be the return position.

Problem is that you go from something that looks like it could be TRMC to something that isn't. I think I would just want PPXs/constructions to easily CPS-ify (or defunctionalize) instead.

gasche commented 3 years ago

Yes, you could write it like this:

match program with
| [%share P (Let_symbol (symbol, const, program'))] ->
  let const = match const with
  | [%share C (Set_of_closures set_of_closures)] ->
    let new_set_of_closures = map_constant_set_of_closures set_of_closures in  
    [%share C (Set_of_closures new_set_of_closures)]
  | _ -> const
  in
  let new_program' = loop program' in
  [%share P (Let_symbol (symbol, const, new_program'))]

This is orthogonal to CPS (but it blocks TRMC as it does not preserve tail-call-ness), you can write yourself:

let rec loop program k =
  match program with
  | ...
  | [%share P (Let_symbol (symbol, const, program'))] ->
    let const = match const with
    | [%share C (Set_of_closures set_of_closures)] ->
      let new_set_of_closures = map_constant_set_of_closures set_of_closures in  
      [%share C (Set_of_closures new_set_of_closures)]
    | _ -> const
    in
    let@ new_program' = loop program' in
    k [%share P (Let_symbol (symbol, const, new_program'))]