ocaml-ppx / ocaml-migrate-parsetree

DEPRECATED. See https://ocaml.org/changelog/2023-10-23-omp-deprecation. Convert OCaml parsetrees between different major versions
Other
87 stars 43 forks source link

normalize whitespace in output across OCaml versions #91

Open vogler opened 4 years ago

vogler commented 4 years ago

I ran into some differences in whitespace across OCaml versions when testing a ppx rewriter. For example OCaml 4.02-4.05 places no space after a comma in tuples but adds some trailing space in other places where OCaml >= 4.06 does not. See here: https://github.com/vogler/ppx_distr_guards/commit/2c940519e6c638be2d291a6cc2fab7d9ab79a854 I normalize the output with sed and only then compare.

It would be nicer if the output was the same across versions. I assume the pretty-printing is done by OCaml itself, but maybe omp could normalize it afterwards?

ghost commented 4 years ago

That seems like opening a can of worms to me. i.e. we'd need to keep up with arbitrary formatting changes if we wanted to provide a stable output. Indeed, it might not just be whitespaces, but also parentheses and other things.

It seems best to me to compare the files at the AST level, after having erased the locations. That's how [@@deriving_inline] is implemented and it works well.

vogler commented 4 years ago

It might be. In my test it was only whitespace. My assumption was that it should already be stable in non-whitespace output.

Your point is that the AST is normalized by omp? Is there an easy way to do AST comparison with omp? --dump-ast only outputs a binary AST. However, for these small ppx rewriter projects, it's nicer to have a test.expected.ml as tested documentation that's easy to read.

vogler commented 4 years ago

ocamlc -dparsetree -ppx './standalone.exe --as-ppx' test.ml prints the AST, but with locations.

ghost commented 4 years ago

It might be. In my test it was only whitespace. My assumption was that it should already be stable in non-whitespace output.

I'm not sure that's true for past versions of OCaml. But even if it was, it's not written or tested anywhere. So I'd rather we promise nothing in omp rather than make promises we can't keep.

Your point is that the AST is normalized by omp? Is there an easy way to do AST comparison with omp? --dump-ast only outputs a binary AST.

Nope. An AST is already normalised, otherwise it'd be a Concrete Syntax Tree. So essentially you let the parser do the normalisation. The locations are the only bit of information that capture the layout of the code. That's how [@@deriving_inline] is implemented in ppxlib BTW, by comparing the AST after removing the locations.

However, for these small ppx rewriter projects, it's nicer to have a test.expected.ml as tested documentation that's easy to read.

Agreed, and that could still work. For instance with:

(rule
 (action
  (progn
   (run ast-diff test.expected.ml test.ml)
   (diff? test.expected.ml test.expected.ml.corrected))))

Where ast-diff would do a loc-less comparison and write a file test.expected.ml.corrected if the two files don't match at the AST level.

vogler commented 4 years ago

Nope. An AST is already normalised, otherwise it'd be a Concrete Syntax Tree.

I meant the potential differences in the AST across OCaml versions which are normalized/translated to the selected version by omp.

Where ast-diff would do a loc-less comparison and write a file test.expected.ml.corrected if the two files don't match at the AST level.

However, then the diff will also show the difference in whitespace if ast-diff relies on OCaml's pretty-printing. I didn't find it in ppxlib, can you give a link to the code? Would it make sense to factor that out as a package or integrate it in omp?

I currently don't have time for it, but I'd like to test what other pretty-printing differences there actually are, maybe by diffing the OCaml testsuite.

Another idea: if omp can transform any AST to the newest OCaml version's one, it could copy its pretty-printing to offer current version pretty-printing for older versions. However, I don't know how much code would have to be pulled in for that.

ghost commented 4 years ago

I didn't find it in ppxlib, can you give a link to the code? Would it make sense to factor that out as a package or integrate it in omp?

It's in src/code_matcher.ml. If you want to factor it out, I suggest to make it a separate package. It seems to me that this feature doesn't fit in omp.