ocaml-ppx / ocaml-migrate-parsetree

DEPRECATED. See https://ocaml.org/changelog/2023-10-23-omp-deprecation. Convert OCaml parsetrees between different major versions
Other
87 stars 43 forks source link

The magic number checks of `Ast_4NN.register` appear to be broken #97

Closed gasche closed 4 years ago

gasche commented 4 years ago

I am trying to understand the cause of a segfault in ppx_deriving (reported by @kit-ty-kate) since we migrated to ppxlib, see https://github.com/ocaml-ppx/ppx_deriving/pull/210#issuecomment-632330283 . @thierry-martinez tracked down the issue to incompatible serialized ASTs (we seem to be calling input_value on an AST of some version, and getting a segfault when we try to use it as it was of another version), originating from a call to Ast_408.Ast_mapper.register. This suggests an issue in m-o-p, at least in terms of validation of magic numbers.

Indeed, the current source fo Ast_408 and all later versions have an Ast_mapper.apply_lazy module that performs magic number check using a Config module that appears to be the current compiler-libs config module (so it is checking against the current magic numbers, not 4.08's). There is a Config module defined later in the file, which provides the correct 4.08 magic numbers.

Magic number checking lines 3843-3847: https://github.com/ocaml-ppx/ocaml-migrate-parsetree/blob/18c00c9/src/ast_408.ml#L3843-L3847

Re-definition of Config lines 4043-4046: https://github.com/ocaml-ppx/ocaml-migrate-parsetree/blob/18c00c9/src/ast_408.ml#L4043-L4046

Questions:

(cc also @rgrinberg @jeremiedimino)

gasche commented 4 years ago

Thinking more about it, I am not completely sure of what is the intended semantics of Ast_4NN.register. Should it:

  1. register a 4NN-rewriter in such a way that it works on current serialized ASTs, or
  2. register a 4NN-rewriter in such a way that it works on serialized ASTs for OCaml 4NN?

Currently it is doing neither: it opens current serialized ASTs but calls a 4NN-rewriter without any conversion, leading to segfaults. My suggestion to change the magic-number check would implement (2). On the other hand, if you/we want (1), then we should (keep expecting the current magic numbers and) add a conversion pass on the rewriter.

I believe that the intended semantics should be documented clearly to help avoid issues. (Even without the current bug, if people use .register wrongly they get a runtime error, instead of a typing error as with most Migrate-parsetree version mistakes.)

(2) may be the more common need (it is rare to compile on a given version of OCaml and expect to produce a working ppx rewriter for an older version of OCaml), but then it is easily expressible by users today with just a single register function and the Convert module. It is also unclear to me that it can be implemented in the Ast_* module, because the conversion logic is not available yet at this point.

gasche commented 4 years ago

@thierry-martinez implemented two fixes that give either semantics: #98 and #99

It would be great to have guidance from someone more familiar with ocaml-migrate-parsetree (in particular someone who could review and make merge decisions!) on which way the tool should go.

ghost commented 4 years ago

I'm pretty sure we added these by accident since 4.08. The update process of omp is quite complicated, so I'm not surprised this happened.

gasche commented 4 years ago

You mention that you are planning to remove those erroneous register function, which I think is perfectly fine. Which other parts of the API do you think should be removed? (There more copy-paste from the compiler implementation in those files?). I think the safest would be to expose no more than Ast_407 and earlier modules, but I suppose there was a reason why more code was copied at the _408 level? (Maybe to deal with the updated Location or something?)

ghost commented 4 years ago

There seems to be some register functions in Docstrings as well. Though I don't expect this module to be used much, so I'm not planning to be over-zealous here.

I think the safest would be to expose no more than Ast_407 and earlier modules, but I suppose there was a reason why more code was copied at the _408 level? (Maybe to deal with the updated Location or something?)

The upgrade process is not mechanical. It is painfully manual and not very well formalised. For 4.08, it was done by a new person and the fact that the register functions have to be delete wasn't written anywhere. Nowadays I always glance at the diff between ast_N.ml and ast_(N+1).ml to check it looks sensible.

ghost commented 4 years ago

I believe this can be closed now.

ghost commented 4 years ago

BTW, one thing we mentioned in the past with @let-def is that we could narrow the scope of ocaml-migrate-parsetree to just the parsetree type migrations. Which is what it was originally supposed to be. It seems that ppxlib has been adopted quite widely adopted now, so it might be realistic to reduce the various Ast_XXX modules to just the parsetree types, get rid of the Driver module and maybe a couple of other things. That would make the ecosystem a bit simpler, and perhaps make our work on the new ppx library a bit easier as well.

let-def commented 4 years ago

Yes, I still think this is a worthwhile simplification. It should just be done carefully to avoid more breakage.

ghost commented 4 years ago

Indeed. I was looking at the dune-universe and there seems to be quite a few references to Migrate_parsetree.Driver still, so we'd have to ask quite a few people to port their code to ppxlib. Given that the plan is to eventually ask everyone to switch to ppx, we should probably just wait until ppx is ready.

ghost commented 4 years ago

We were just talking about this with @NathanReb and @avsm, and that seems to be a good problem to throw at the duniverse. More precisely, we could simplify ocaml-migrate-parsetree as one breaking 2.0.0 release and use duniverse to realistically migrate all packages currently relying on the omp driver to ppxlib at once.

We started playing with the idea of creating a duniverse with all the ppx projects out there still using the ocaml-migrate-parsetree driver: https://github.com/dune-universe/ppx-duniverse

ghost commented 4 years ago

After talking to a few people, we are actually going to do this. The overall plan is:

I'll announce this more publicly on discuss shortly.

ghost commented 4 years ago

@gasche @thierry-martinez when you have some time, I'm also interested to talk about the loader of ppx_deriving