kcsongor / generic-lens

Generically derive traversals, lenses, and prisms.
437 stars 53 forks source link

Add prisms to IsLabel instance #69

Closed dwincort closed 5 years ago

dwincort commented 5 years ago

This PR extends the optional IsLabel instance to provide prisms for constructors in addition to lenses for fields. I debated whether to push this PR to generic-lens-labels or perhaps to just create a new repo (generic-optics-labels?), but I felt that this was the best home for this code. Please let me know if you disagree.

Field Access

Instead of writing field @"foo" to get a lens for the field foo for some data type, one merely needs to write #foo. Furthermore, if the data type is part of a data instance, or otherwise has a HasField' instance but no HasField instance, the label #foo will still work.

Rather than using the HasField and HasField' classes, one can use the combined Field class, (which has the same structure as HasField) and the Field' type synonym. (Note that I believe this addresses Issue https://github.com/kcsongor/generic-lens/issues/64)

Constructor Access

Instead of writing _Ctor @"Foo" to get a prism for the variant Foo of some data type, one merely needs to write #_Foo. Note the need to include the underscore _ between the # and the constructor name; this is because GHC does not currently support overloaded labels starting with capital letters. For example, the data type

data Options = OptionA | OptionB | OptionC

will have prisms #_OptionA, #_OptionB, and #_OptionC.

Likewise, rather than using the AsConstructor and AsConstructor' classes, one can use the combined Constructor class, (which has the same structure as AsConstructor) and the Constructor' type synonym.

Notes

Incoherent Instances

This PR uses incoherent instances to combine, e.g., HasField and HasField' into a single class so that labels can work for any field lens. This is not exactly elegant, but I didn't find a better solution.

Orphan Instance

The IsLabel instance is now gigantic as it's covering all a -> b. This can cause conflicts with other libraries, but I think it's worth it for the convenience you get here. As one of the maintainers of the extensible records package https://github.com/target/row-types, I already have overlapping HasField and AsConstructor instances for an upcoming generic-lens-row-types package, and I think it will be more convenient to use the Data.Generics.Labels here and have my library play nicely with it than to try to accommodate multiple different IsLabel instances.

Label Comparison (and fields that start with underscores)

The IsLabel instance chooses between returning a prism vs returning a lens based on whether the first character in the symbol is an underscore. This is not very elegant, but it works, and it has some questionable repercussions. Notably, consider the following data type declaration:

data Foo = Foo
  { _Foo :: Int
  , _bar :: Int
  }

In this case, the label #_Foo would be for the prism based on the Foo constructor, meaning that there is no generic lens for the _Foo field. Furthermore, the label #_bar would also be interpreted as a prism and could not be used as a lens to the _bar field. Thus, this system does not mesh nicely with data types whose field names start with underscores.

One could modify the approach in this PR to force labels to start with an underscore followed by a capital letter; this lets the label #_bar point to the _bar field, but it does not help with the _Foo field. I chose not to do this because 1) I like the simplicity of an underscore always indicating a prism, and 2) with overlapping instances, one can make usable lowercase names for prisms.

Exposing This Module

Obviously, this work doesn't make a lot of sense without PR https://github.com/kcsongor/generic-lens/pull/68 also going through.

kcsongor commented 5 years ago

First of all, thanks for such a well-documented PR!

I always felt that one downside (other than the unfortunate orphan instance problem) of these labels was the arbitrary preference of lenses over the other optics, but this PR seems to solve that wrinkle in a rather nice way. The label comparison trick is quite clever, I didn't think something like this was possible!

The use of incoherent instances seems justified here, and it does seem sensible even outside the scope of overloaded labels.

Seeing that generic-lens-labels is not actively maintained anymore, I say let's just merge this feature into generic-lens -- there is an appropriate warning sign after all.

kcsongor commented 5 years ago

By the way, shouldn't the comparison be between _@ and _[? (looking at the ASCII table)

dwincort commented 5 years ago

By the way, shouldn't the comparison be between @ and [? (looking at the ASCII table)

That is a great question and something that I forgot to document (I updated the PR description to discuss it). I'll describe the fundamental issue and my struggle with it, and since you're the library maintainer, I'll leave the final decision up to you.

The question really boils down to: What are we defining to be the fundamental nature of a "prism" label as compared to a "lens" label? To highlight why this is a tough question, consider the following data type:

data Foo = Foo
  { _Foo :: Int
  , _bar :: Int
  }

Here, we have an obvious conflict because _Foo could be either a lens for the field or a prism for the constructor. But also, _bar is a little funny---should the leading _ indicate that it's a prism? Some people use a prepended _ for record fields (for instance, to work with the Lens library's makeLenses function), so in my first draft, I decided that _bar should be a lens.

Then, I went about adding generic-lens support to my row-types library. Row-types supports open variants in addition to open records, so I added a HasField instance for the records and an AsConstructor instance for the variants. After that, I realized that in my own code, I was using lowercase names for the variants, and so the labels were not working right for me. One option was to demand that users always use uppercase names for variants, but there are places in the row-types library that require the names for variants and records to be the same, so this didn't work. Of course, using uppercase names for the records runs into the problem that GHC doesn't support uppercase labels.

It was at this point that I changed the definition of a "prism" label to be anything that comes after an _. It incidentally means that _bar in the above example also won't work as a lens, but I was okay with that---with DuplicateRecordFields and these generic labels, there's no need to prepend an _ onto record fields anyway.

At this point, you understand the struggle, so as I said at the start, I'll leave the question up to you: Is an _ enough to classify a label as a prism, or must it be followed by a capital letter?

kcsongor commented 5 years ago

Quite subtle indeed!

Hm, I think I would prefer if _bar worked as a lens, mainly because I think that some large older code bases don’t have the luxury of renaming field names, as JSON derivation might depend on them too, for example. I wonder if we could pick something other than an underscore? Or would that break your use case? I do agree that a simple rule would be nice to have, and ultimately might be worth sacrificing the underscore fields if that’s the only way.

I’m actually a little confused about how this works out, namely the following constraint:

name' ~ AppendSymbol "_" name

The instance head mentions only name’, and name can’t be inferred from the rest of the context. Is GHC somehow aware if the injectivity of AppendSymbol? I will admit I’ve not used this type family much, but just looking at its signature, I wouldn’t expect this to work, but I’m probably missing something here. Do you have some working examples? (I’m also without a laptop for a few days, so can’t try it myself at the moment)

dwincort commented 5 years ago

I do have some working examples, but I don't have enough knowledge of how type families (and AppendSymbol specifically) work under the hood in GHC to explain exactly why it works.

data SumOfProducts =
    RecA { foo :: Int, valA :: String }
  | RecB { foo :: Int, valB :: Bool }
  | RecC { foo :: Int }
  deriving (Show, Eq, Generic)

testSumOfProducts =
     (val ^. #foo       ) == 3
  && (val & #foo +~ 10  ) == RecB 13 True
  && (val ^? #_RecB     ) == Just (3, True)
  && (val ^? #_RecB . _1) == Just 3
  && (val ^? #_RecC     ) == Nothing
  where val = RecB 3 True

Then, in GHCi, we can see that:

> testSumOfProducts
True

As for the label name issue, it's a little tough. Labels can only start with lowercase letters or an underscore, so there's no other symbol we can use. We could use unicode lowercase letters (p is for prism, so we could use ƥ or ꝕ, say) ... but this strikes me as not the best idea.

Now that I'm considering this further, maybe an underscore followed by a uppercase letter is the right idea, and if labels are ever extended to allow uppercase letters, we could easily just drop the demand for a leading underscore, and everything will work great. The only real problem for me is in my row-types library, but perhaps I can use the same AppendSymbol trick to do something "idiomatic" there. For instance, perhaps I can declare that variants should always have uppercase names prefixed with underscores (which makes the generic-prisms work right), and equality between variant names and record names automatically drops the underscore and does a case conversion (... I'll have to write that type family, but it sounds doable).

kcsongor commented 5 years ago

Having looked at the implementation in GHC, it seems like AppendSymbol does a bit more than its signature would suggest, and indeed the constraints can be solved backwards. I hadn't realised this until now!

I'm leaning towards only allowing uppercase letters for prism labels, as it's much simpler that way, and as you said, the case conversion logic is probably better placed in the row-types library. I'll add your example to the tests and merge this now. Thanks a lot again!