Open brendanhay opened 2 years ago
I am against a solution which excludes GHC >=8.8 && <9, because I think the installed base is still really large, and I think there are still a bunch of lagging libraries that aren't GHC-9 ready. This makes me most in favour of option 4 right now., but I have a few quibbles:
data Completion = Missing | Complete
. Then Required Missing Int
reads to me like "required but missing Int
".f :: Type -> Type
in your records, and substituting Proxy
or Identity
. This may be a bit confusing for readers unfamiliar with the pattern, and it admits silly things like f ~ IO
. Possible upside: it allows the records to have lifted Functor
etc instances like you find in rank2classes
/barbies
/conkin
, but I don't think that's especially useful.StorageBucketsDelete Missing
to a StorageBucketsDelete Complete
using common lens
idioms, unless it has at most one required-but-missing field. Consider newStorageBucketsDelete & #bucket .~ "foo"
: If we tried to set a second required-but-missing field using (&)
we'd run into trouble because the type between the two applications wouldn't fit either Missing
or Complete
. You might be able to get away with newFoo & (#bar .~ "baz") . (#baz .~ "quux")
, but I haven't tested this. Anyway, this isn't functionality we're losing, it just feels a bit inconsistent.TL;DR: If we want this in ASAP, I support the HKD approach, with tweaks (1) and (3) in this post. If we're willing to wait for more people to adopt GHC >=9.2, then I'd probably recommend record pattern synonyms, as part of another major release.
As mentioned in the opening - treat this as documentation of exploratory work. There is no intention to push a particular approach within any foreseeable release window(s).
- The name of either the type family or its index type isn't quite right to me [...]
The naming is taken directly from Chris Done's original gist, I've no particular attachment to any of it.
- You could avoid the type family completely [...]
A record which just contains defaults and no required parameters isn't intended to be some generally useful value, it's meaningless without being fully saturated, hence the lack Show
, Eq
, and the like.
Paramterising the record over f :: Type -> Type
means you now have to supply every required value wrapped in Identity
, which is what the closed type family is explicitly avoiding. I'm not arguing for/against, but pointing out why it's there - you either get naked required fields with a type family, or Identity
wrappers without.
I agree that lifted instances aren't useful - the only fields making use of f
within a record are the required fields.
- You cannot go from a
StorageBucketsDelete Missing
to aStorageBucketsDelete Complete
using common lens idioms
You're not supposed to - if that's ambiguous or a stumbling block then it's probably a bad design. Consider:
-- Old:
newStorageBucketsDelete "bucket" & #bucket .~ "foo"
-- New:
newStorageBucketsDelete { bucket = "bucket" } & #bucket .~ "foo"
The only part that has been changed here is to produce a value of StorageBucketsDelete
we've gone from a function :: Text -> StorageBucketsDelete
with associated parameter ordering woes, to using record syntax where we get named/freely ordered required parameters and punning.
The potential for ambiguity/misuse here seems to be that newStorageBucketsDelete { ... }
is an atomic statement and newStorageBucketsDelete
is not meaningful/useful until the required parameters are fully saturated.
An approach to supporting synonyms could be to retain the existing smart constructors and CPP 🥴 everything up the wazoo:
flag NoFieldSelectors
description: If the NoFieldSelectors extension should be enabled, when available.
default: True
manual: False
library
if flag(NoFieldSelectors) && impl(ghc >= 9.2.1)
cpp-options: -DNOFIELDSELECTORS
default-extensions: NoFieldSelectors
{-# LANGUAGE CPP #-}
newStorageBucketsDelete :: Text -> StorageBucketsDelete
newStorageBucketsDelete bucket =
StorageBucketsDelete
{ bucket
, ifMetagenerationMatch = Nothing
, ifMetagenerationNotMatch = Nothing
, provisionalUserProject = Nothing
, userProject = Nothing
, prettyPrint = True
}
#ifdef NOFIELDSELECTORS
{-# DEPRECATED newStorageBucketsDelete "Use the MkStorageBucketsDelete synonym instead" #-}
pattern MkStorageBucketsDeleteDefaults
{ bucket
} = StorageBucketsDelete
{ bucket = bucket
, ifMetagenerationMatch = Nothing
, ifMetagenerationNotMatch = Nothing
, provisionalUserProject = Nothing
, userProject = Nothing
, prettyPrint = True
}
#endif
As mentioned in the opening - treat this as documentation of exploratory work. There is no intention to push a particular approach within any foreseeable release window(s).
That's cool. I think if you're willing to explore and wait, then there's probably a good/ergonomic solution that becomes available once the 9.x record stuff matures.
- You cannot go from a
StorageBucketsDelete Missing
to aStorageBucketsDelete Complete
using common lens idiomsYou're not supposed to - if that's ambiguous or a stumbling block then it's probably a bad design. Consider:
-- Old: newStorageBucketsDelete "bucket" & #bucket .~ "foo" -- New: newStorageBucketsDelete { bucket = "bucket" } & #bucket .~ "foo"
The only part that has been changed here is to produce a value of
StorageBucketsDelete
we've gone from a function:: Text -> StorageBucketsDelete
with associated parameter ordering woes, to using record syntax where we get named/freely ordered required parameters and punning.
I understand this, but it feels a little weird if you try to do half of the setting with a record update (which you must), and then the rest with lens idioms. Would it be easier to do things HKD-style, but populate the record with a single update? That seems potentially fine - it's not often that you're going to want to incrementally build up a request using (<>~)
or whatever.
The potential for ambiguity/misuse here seems to be that
newStorageBucketsDelete { ... }
is an atomic statement andnewStorageBucketsDelete
is not meaningful/useful until the required parameters are fully saturated.
Indeed. I'd try it out without lens, and with a single record update that fills all the parameters at once. I suspect that will look reasonable.
My two cents on this, though I'll admit up front I'm not a current Amazonka user, merely a past one.
I really like #3, from a performance point of view - it seems like the option most likely to allow unpacked fields in the records, which can help a lot with memory usage and fragmentation. I've worked on projects using Amazonka where we were passing through many GB of data through Amazonka types (possibly on the order of hundreds of GB per day), and anything that could help reduce GC time and memory usage would be very valuable. I know it's to something we currently generate, but IMO unpacking fields, particularly in response types, could be a decent win for many applications.
IMO the CPP version above isn't too bad, as it means that we have a nice way solution for the future, with an appropriate backwards compatible solution for anyone stuck using very old compilers.
I also quite like the BazRequired
record version too (and honestly have no problem with the current function based one - as long as we never change the order of arguments between Get runs that should be fine... maybe that's not possible to guarantee though).
Here lie vague notes on code generation taken from my current work on gogol, focused on the UX around constructing large records.
TL;DR: Given the issues with record sizes (# of fields), required vs optional vs default parameters, and parameter reordering/significance, I've documented some thoughts focused on UX improvement when constructing/manipulating the large records found in both
amazonka
andgogol
. Feedback and alternative approaches/ideas welcome, add or links them in the comments below.Problem
Breaking down the current products emitted by the code generator, the fields of the product (parameters) fall into three cases:
Maybe
wrapper.The code generator emits the following record with each of these cases, as:
Haskell's standard record syntax is still the gold standard in terms of ergonomics and error messages when constructing new records, assuming you use the field names from
Foo
above:Which doesn't scale given the size of records that we're often dealing with in libraries like
amazonka-ec2
. To improve this, we generate a "smart constructor" (not really, but internal terminology) of the form:This helps, but introduces a new problem - as we ratchet this up to support larger and larger records with more and more
Int
fields, the order of parameters becomes significant:Which is more or less where
amazonka
andgogol
are today.Improvements/Solutions
(Some of these are sensible, some are (hopefully) clearly not.)
1. Named parameters
There are various implementations of named parameters available on Hackage, but pretty much all of them force the user to deviate away from plain record syntax.
One example, using the named package for
Bar
:And usage via
I'm personally unconvinced of the ergonomics in this specific case - the use of
OverloadedLabels
in combination with custom operators just seems like a lot of noise.2. Newtyped parameters
The generator could float out newtypes for every single field/Type, ie:
Note: you can't use only the field label since it's ambiguous across all possible {field, Type} permutations, so need the type, either in the datatype name
newtype AInt ..
, or polymorphically, as in the example above.I've attempted variations of this in the past and you get to choose either:
a :: Int
intoAInt
, buta :: Int
from recordFoo
has no semantic equivalence to fielda :: Int
from recordBar
, but you think they do, because .. types.Generate required specifications
The generator could similarly float out records containing only the required parameters:
This results in
2 * n
record datatypes, though.3. Record pattern synonyms
This is by far the nicest approach I've encountered so far as it allows using regular record syntax along with a sane way to specify default record values. But, it requires GHC >= 9.2 due to the use of
PatternSynonyms
having the following constraint:This appears to be lifted if
NoFieldSelectors
is enabled.See: Record Pattern Synonyms
Usage:
4. HKD
This is probably the easiest method to support right now for both the supported GHC versions + state of the code generator, the UX also seems decent. Taken from here.
All functions/instances then take 'StorageBucketsDelete Complete', forcing the bucket field to be specified:
You can also transparently use the smart constructor as expected:
5. Optional parameter-less records
If you were to move all optional and default parameters out of the records and into
HasField
or a custom typeclass, you get a minimum specification of the required fields you can wrangle via standard record syntax:But then where do you actually store the missing parameters?
One approach taken by kubernetes-client-core is to have a variety of plumbing upon which to hang instances and insert parameters into a backing store. In their case, it's unfortunately uni-directional as the act of setting a parameter also serialises it. You also move all field documentation onto the instances, further breaking up parameter help + discovery. You also lose any instances like
Generic
over the total possible set of parameters, instead it'd be restricted to only required parameters.Having used that library extensively in anger, I can't say I'm a fan of the usability - but there are ways to recover both getting, setting, and instances, which I'll outline in a half-baked manner here. Likely you'd use large-records and hasField with
OverloadedRecordDot
:It will improve compilation time significantly (due to
large-record
), but it completely blows all yourGeneric
instances and similar out of the water - you need the generator to emit stock-equivalent instances forShow
and the like, unless deemed unnecessary. It also relies onHasField
to be able to poke around insideRequest a
to get/set any of the required or optional parameters.You could alternatively use
Map Text Dynamic
for a safer alternative at the expense of suboptimal field access. It would be easier to recover the lost instances, though.)