Stabilising template-haskell

TeofilC commented 1 year ago

I'd like to start a conversation about what steps we can take as a community to improve template-haskell backwards compatibility.

The interface of template-haskell is tightly coupled to Haskell's syntax. This means that constructors and fields need to be added often as the language expands. Reacting to these changes normally just requires adding a Nothing value at use-sites, but across the entire ecosystem and over several releases this can add up to a lot of work.
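To make the kind of churn concrete, here is a minimal sketch of a typical use-site fix. It assumes the change made in template-haskell 2.17, where `DoE` gained a `Maybe ModName` argument for QualifiedDo; the helper name `mkDo` is hypothetical, and the `MIN_VERSION_template_haskell` macro is assumed to be provided by the build tool.

```haskell
{-# LANGUAGE CPP #-}

import Language.Haskell.TH

-- Hypothetical helper: build a `do` expression from a list of statements.
-- The CPP guard is the per-release boilerplate that each use-site ends up carrying.
mkDo :: [Stmt] -> Exp
#if MIN_VERSION_template_haskell(2,17,0)
-- Newer interface: the extra field is Nothing for an ordinary, unqualified `do`.
mkDo stmts = DoE Nothing stmts
#else
-- Older interface: DoE takes only the statements.
mkDo stmts = DoE stmts
#endif
```

Each such fix is small, but every package that builds `do` blocks by hand needs one.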

My vague idea is to start publishing a package that exports a fixed template-haskell interface that isn't tied to a specific version of GHC. I think this can either happen by having minor versions of template-haskell support several GHCs or by creating a new package.

The main thing I'd like from this discussion is to figure out whether there are any unforeseen blockers and to find other people who are keen to work on this idea. I think the key thing to get something like this working is having enough people to comfortably maintain the compatibility shim.

Ericson2314 commented 1 year ago

@bgamari has noted the first step is using record field names.

The second step is https://github.com/ghc-proposals/ghc-proposals/pull/529

TeofilC commented 1 year ago

@bgamari has noted the first step is using record field names.

I'm guessing that this would also imply that we have smart constructors that only take the compulsory fields and any optional fields are added via record update? Since otherwise this would only be helpful for consuming not producing splices.
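A minimal sketch of that combination, using a hypothetical, heavily simplified stand-in for a declaration type (none of these names exist in template-haskell): the smart constructor takes only the compulsory fields, producers set optional fields by record update, and consumers read fields by name.

```haskell
-- Hypothetical stand-in for a TH declaration type; not the real API.
data DataDecl = DataDecl
  { ddName     :: String        -- compulsory
  , ddCons     :: [String]      -- compulsory
  , ddKind     :: Maybe String  -- optional, e.g. added in a later release
  , ddDeriving :: [String]      -- optional
  } deriving (Show, Eq)

-- Smart constructor: takes only the compulsory fields and defaults the rest.
-- Adding a new optional field later changes this definition, not its callers.
dataDecl :: String -> [String] -> DataDecl
dataDecl name cons =
  DataDecl { ddName = name, ddCons = cons, ddKind = Nothing, ddDeriving = [] }

-- Producing: optional fields are supplied via record update.
boolDecl :: DataDecl
boolDecl = (dataDecl "Bool" ["False", "True"]) { ddDeriving = ["Eq", "Show"] }

-- Consuming: field selectors keep working when new fields are added.
conCount :: DataDecl -> Int
conCount d = length (ddCons d)
```

Under this scheme a new optional field would break neither `boolDecl` nor `conCount`.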

The second step is https://github.com/ghc-proposals/ghc-proposals/pull/529

@Ericson2314 I'm excited about this proposal and I can see how it would make it easier to make template-haskell code backwards compatible. But it also seems like it would require a lot of rewriting of code that uses th to get the benefit. I wonder if we can keep backwards compatibility without using something like this.

TeofilC commented 1 year ago

I'm basically wondering if we could use PatternSynonyms to build a partial mapping between the two versions of the interface. The partiality shouldn't be too bad: this code would be running at compile time, so any failure could be turned into a compile-time error.

I'm not sure if this is feasible, since I'm assuming that writing such a mapping would be relatively easy (it might be hard or impossible). But, if it works then I think it would mean that downstream consumers wouldn't need to change anything to upgrade.
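As a rough sketch of what one such mapping could look like, take the change in template-haskell 2.16 where `TupE` went from `[Exp]` to `[Maybe Exp]` to support tuple sections. The synonym name `OldTupE` is hypothetical, and the code assumes template-haskell 2.16 or later.

```haskell
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

import Language.Haskell.TH

-- Hypothetical compatibility synonym: the older shape of TupE ([Exp])
-- presented on top of the newer constructor ([Maybe Exp]).
pattern OldTupE :: [Exp] -> Exp
pattern OldTupE es <- TupE (sequence -> Just es)  -- matches only if no component is missing
  where
    OldTupE es = TupE (map Just es)               -- building always succeeds
```

The mapping is partial in the matching direction only: a tuple section such as `(1,)` has a `Nothing` component and so does not match `OldTupE`, which is the kind of case that would need to become an error or a fall-through.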

Ericson2314 commented 1 year ago

@TeofilC Record fields and synonyms very much go together. And their deficiencies as a solution for "migrations" are not at all TH-specific and so worthy of addressing too.

But it also seems like it would require a lot of rewriting of code that uses th to get the benefit. I wonder if we can keep backwards compatibility without using something like this.

I would not worry about this. Doing a one-time rewrite to avoid future breaking changes is absolutely worth it.

TeofilC commented 1 year ago

I would not worry about this. Doing a one-time rewrite to avoid future breaking changes is absolutely worth it.

Yeah, I agree with you. It would definitely be worth it.

goldfirere commented 1 year ago

I'd be worried about using pattern synonyms here, because of the struggle to get completeness checking, along with some compile-time performance trouble.

Instead, I've been idly thinking about some core-template-haskell library that keeps up with GHC but also exposes a bunch of classes like

class ExpLike exp where
  fromExp :: Exp -> exp
  toExp :: exp -> Exp

with one such class for each AST type. Then type-check let x = $blah in ... such that blah :: ExpLike exp => Q exp, and similar for quotes and splices.

Now, we can have a template-haskell library that defines an AST and conversions to the core-template-haskell. When core-template-haskell upgrades along with GHC, template-haskell would have to update the conversion functions, but not its external interface. I'm picturing a versioning scheme where we have something like template-haskell-2.16.9.6, which would have the interface of template-haskell-2.16 but work with GHC 9.6. When we release a new GHC, we would then need to release template-haskell-2.16.9.8 (and, optionally, template-haskell-2.17.9.8, if we want to expose any new AST) to update the conversion functions.
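A minimal sketch of the two layers, with the real `Language.Haskell.TH.Exp` standing in for the core AST and a hypothetical frozen type `StableExp` standing in for an old template-haskell interface. The `SOther` escape hatch is an assumption of this sketch, not part of the proposal above.

```haskell
import Language.Haskell.TH (Exp (..), Lit (..), Name, mkName)

-- The class from the comment above, with Exp playing the role of the core AST.
class ExpLike exp where
  fromExp :: Exp -> exp
  toExp   :: exp -> Exp

-- The core AST is trivially ExpLike.
instance ExpLike Exp where
  fromExp = id
  toExp   = id

-- Hypothetical frozen AST: its constructors never change shape.
data StableExp
  = SVar Name
  | SInt Integer
  | SApp StableExp StableExp
  | SOther Exp  -- assumption: carries anything the frozen interface cannot model
  deriving (Show, Eq)

-- Only these conversions need updating when the core AST changes.
instance ExpLike StableExp where
  fromExp (VarE n)            = SVar n
  fromExp (LitE (IntegerL i)) = SInt i
  fromExp (AppE f x)          = SApp (fromExp f) (fromExp x)
  fromExp other               = SOther other

  toExp (SVar n)   = VarE n
  toExp (SInt i)   = LitE (IntegerL i)
  toExp (SApp f x) = AppE (toExp f) (toExp x)
  toExp (SOther e) = e

-- User code written against the frozen interface.
example :: StableExp
example = SApp (SVar (mkName "negate")) (SInt 1)
```

Code such as `example` depends only on `StableExp`, so a new core constructor would touch the instance and nothing downstream.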

This approach adds some burden to GHC:

The type-checker and desugarer now have to deal with abstract ASTs defined with class-based conversions instead of the concrete AST they have now. This is a one-time change that might be a little fiddly, but not fundamentally hard.
Every time the AST changes, all supported versions of template-haskell would have to be updated. These updates would happen with some regularity, but they would be easy and formulaic. And we could have a policy of supporting only, say, 5 TH versions, so the amount of work is bounded.

I think these costs are reasonable, though, and may provide a nice way forward.

Ericson2314 commented 1 year ago

I'd be worried about using pattern synonyms here, because of the struggle to get completeness checking, along with some compile-time performance trouble.

These still feel like issues worth fixing in general to me, however. We want "regular user code" to also be able to update data definitions with minimal pain.

TeofilC commented 1 year ago

Every time the AST changes, all supported versions of template-haskell would have to be updated. These updates would happen with some regularity, but they would be easy and formulaic. And we could have a policy of supporting only, say, 5 TH versions, so the amount of work is bounded.

I think we might be able to minimise the work needed by just writing conversions between version N and N+1 and composing them somehow, eg, using some sort of fancy code generation. But that might be more trouble/complexity than it's worth
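A toy sketch of the chaining idea, written by hand rather than generated. The three types are hypothetical stand-ins for successive interface versions; upgrades are total, while downgrades are partial and compose with Kleisli composition.

```haskell
import Control.Monad ((>=>))

-- Hypothetical toy ASTs standing in for three successive interface versions.
data Boxity = Boxed | Unboxed deriving (Show, Eq)

newtype ExpV1 = TupV1 [Int]              deriving (Show, Eq)
newtype ExpV2 = TupV2 [Maybe Int]        deriving (Show, Eq)  -- adds tuple sections
data    ExpV3 = TupV3 Boxity [Maybe Int] deriving (Show, Eq)  -- adds a boxity flag

-- Adjacent-version upgrades are total.
upgrade12 :: ExpV1 -> ExpV2
upgrade12 (TupV1 xs) = TupV2 (map Just xs)

upgrade23 :: ExpV2 -> ExpV3
upgrade23 (TupV2 xs) = TupV3 Boxed xs

-- The N to N+2 upgrade is obtained by composition.
upgrade13 :: ExpV1 -> ExpV3
upgrade13 = upgrade23 . upgrade12

-- Adjacent-version downgrades are partial: newer constructs may have no older form.
downgrade32 :: ExpV3 -> Maybe ExpV2
downgrade32 (TupV3 Boxed xs)  = Just (TupV2 xs)
downgrade32 (TupV3 Unboxed _) = Nothing

downgrade21 :: ExpV2 -> Maybe ExpV1
downgrade21 (TupV2 xs) = TupV1 <$> sequence xs

-- Partial conversions chain with (>=>).
downgrade31 :: ExpV3 -> Maybe ExpV1
downgrade31 = downgrade32 >=> downgrade21
```

Each chained conversion builds every intermediate version of the value along the way.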

We could also minimise effort by only having a handful of "LTS" template-haskell versions

goldfirere commented 1 year ago

Yes, we could imagine chaining transformations, but I don't think much is saved by doing so, and it would be less performant. Yes, we would want to cap the number of LTS TH versions.

Incidentally, one side effect of this plan is that, I think, my core-template-haskell AST could just be GHC's AST. Doing it this way means that any user code that uses TH would have to link against GHC, which maybe is bad. But actually if the AST were in a separate package (a long-term goal of @Ericson2314 I think), then this becomes more feasible. In any case, this is a "nice to have", not a requirement at all of this design.

TeofilC commented 1 year ago

Could you link to some GHC issues for the issues you foresee with pattern synonyms @goldfirere ? I think it would be good to collect those here.

bgamari commented 1 year ago

I'm guessing that this would also imply that we have smart constructors that only take the compulsory fields and any optional fields are added via record update? Since otherwise this would only be helpful for consuming not producing splices.

Yes, I think we would want smart constructors as well. Moreover I think it would be quite reasonable to have a set of stable smart constructors which can construct only programs expressible in Haskell 2010. I would guess that these would satisfy a large fraction of TH usages and therefore eliminate much of the churn that it causes.
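A sketch of what one such smart constructor might look like on top of the current constructors. The name `simpleDataD` is hypothetical, and the code assumes a template-haskell version in which `DataD` has its present six-argument shape.

```haskell
import Language.Haskell.TH

-- Hypothetical stable smart constructor for a plain Haskell 2010 data declaration:
-- no type parameters, no context, no kind signature, no deriving clauses.
simpleDataD :: Name -> [(Name, [Type])] -> Dec
simpleDataD name cons = DataD [] name [] Nothing (map mkCon cons) []
  where
    -- Each constructor is an ordinary positional one with lazy, non-unpacked fields.
    mkCon (conName, fieldTys) = NormalC conName (map lazyField fieldTys)
    lazyField ty = (Bang NoSourceUnpackedness NoSourceStrictness, ty)
```

Callers name only what Haskell 2010 can express; the extension-specific fields of `DataD` are filled in at a single place.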

We could similarly expose pattern synonyms matching against Haskell 2010 constructs, although the match-completeness problem is rather thorny here.

TeofilC commented 1 year ago

I really like the idea of using Haskell2010 as a way to distinguish between core parts of the interface and extensions.

I think we can divide the usages of the API into two broad categories: consuming and producing. I think most of the consumption of TH data is handled well by th-abstraction. And I think a package that exports smart constructors modelled around Haskell2010 would fill the niche for stably producing ASTs nicely.

I'll try to explore this in the next couple of months.

EDIT: I wrote this before reading the minutes from the last meeting. Feels like we are all vaguely on the same page. I too think that maybe just using quotes more is the way to go, and I don't really understand why that's not done more (other than in boot packages)
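For comparison, a small sketch of the same expression built both ways; the function names are hypothetical. The constructor-based version mentions `LamE`, `InfixE` and friends directly, whereas the quoted version only mentions surface syntax.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- Constructor-based: depends on the exact shape of each AST constructor used.
incrManual :: Q Exp
incrManual = do
  x <- newName "x"
  pure (LamE [VarP x] (InfixE (Just (VarE x)) (VarE '(+)) (Just (LitE (IntegerL 1)))))

-- Quote-based: the compiler builds the AST, so no constructor is named here.
incrQuoted :: Q Exp
incrQuoted = [| \x -> x + 1 |]

-- Quotes can still be parameterised; here n is lifted into the quoted expression.
addN :: Integer -> Q Exp
addN n = [| \x -> x + n |]
```

A splice such as `$(addN 5) 10` would have to live in a different module from these definitions because of the stage restriction.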

telser commented 1 year ago

I really like the idea of using Haskell2010 as a way to distinguish between core parts of the interface and extensions.

I think we can divide the usages of the API into two broad categories: consuming and producing. I think most of the consumption of TH data is handled well by th-abstraction. And I think a package that exports smart constructors modelled around Haskell2010 would fill the niche for stably producing ASTs nicely.

I'll try to explore this in the next couple of months.

EDIT: I wrote this before reading the minutes from the last meeting. Feels like we are all vaguely on the same page. I too think that maybe just using quotes more is the way to go, and I don't really understand why that's not done more (other than in boot packages)

I tend to be less informed on TH by nature of not using it in both my work and personal codebases. So perhaps I'm missing some context, but I think that attempting to have and promote a suite of tools that aid stability is likely the best route.

If we can observe a number of usages actually being handled by th-abstraction and/or th-compat, then it seems that perhaps there is something we can add to TH itself as a more stable interface.

The hypothetical package you propose seems as though it would only increase the surface area of what could ultimately be part of a "stable" interface to TH, correct?

TeofilC commented 1 year ago

I think you are completely right @telser that there's a risk of just increasing the surface area.

At the same time if we added a stable interface to the template-haskell package in GHC 9.8, for instance, that could only be accessed by users who are on GHC 9.8+. So, even if we modify the template-haskell package directly we still want to create a compat package for older versions. Compat packages also have the advantage that it's a lot easier to release new versions with new features, eg, I'm working on adding a feature to th-abstraction right now. Once this feature is done users of the library will benefit from it irrespective of their version of template-haskell and GHC and that would be a lot trickier to do if th-abstraction was merged into template-haskell.

Of course all these issues would disappear if template-haskell was decoupled from GHC but that would take a lot of work as well.

adamgundry commented 9 months ago

See also https://gitlab.haskell.org/ghc/ghc/-/issues/24021 which I raised independently of this discussion, but contains similar ideas. Thanks @TeofilC for pointing me here.

I think the crucial first step is introducing a package distinction between "internal definitions" and "external view of AST", so that only the "internal" package is tightly coupled to GHC while the "external" package can be modified independently. There's a tricky question of how we design the API for the "external" package, and consequently how easy it is for it to support multiple GHC versions, but having the distinction at all would be a good start.

haskellfoundation / stability

Stabilising template-haskell #16