pcapriotti / optparse-applicative

Applicative option parser
BSD 3-Clause "New" or "Revised" License

Add Semigroup/Monoid to Parser #463

Open pbrisbin opened 1 year ago

pbrisbin commented 1 year ago

:wave: Hi there-

I'm wondering if you would accept a PR to add:

instance Semigroup a => Semigroup (Parser a) where
  (<>) = liftA2 (<>)

instance Monoid a => Monoid (Parser a) where
  mempty = pure mempty

Apologies if this has been discussed elsewhere, I did a light search and couldn't find anything.

Gabriella Gonzalez has a good general justification for why this is useful. My own concrete use cases usually revolve around building up a settings modifier by option flags (vs parsing the settings structure itself).

For example, supporting --log-level=<level> or --debug neatly by turning both into an Endo LogSettings modification:

data LogSettings -- ...

setLogLevel :: LogLevel -> LogSettings -> LogSettings
setLogLevel = undefined

optionsParser :: Parser (Endo LogSettings)
optionsParser = mconcat <$> sequenceA
  [ flag mempty (Endo setLogLevelDebug) (long "debug")
  , Endo . setLogLevel <$> strOption (long "log-level")
  , ...
  ]

With the above instance, this is somewhat simpler:

- optionsParser = mconcat <$> sequenceA
+ optionsParser = mconcat
   [ flag mempty (Endo setLogLevelDebug) (long "debug")
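
As a side note on how the Endo modifiers combine, here is a minimal sketch using only base. The LogLevel and LogSettings definitions are hypothetical stand-ins, since the thread leaves them abstract. It illustrates that Endo f <> Endo g is Endo (f . g), so when several modifiers fire, the leftmost entry in the list has the final say, regardless of the order the flags appear on the command line.

```haskell
import Data.Monoid (Endo (..))

-- Hypothetical stand-ins; the thread leaves LogLevel and LogSettings abstract.
data LogLevel = Info | Debug
  deriving (Eq, Show)

newtype LogSettings = LogSettings { logLevel :: LogLevel }
  deriving (Eq, Show)

setLogLevel :: LogLevel -> LogSettings -> LogSettings
setLogLevel lvl s = s { logLevel = lvl }

-- Endo f <> Endo g == Endo (f . g): the rightmost modifier runs first,
-- so the leftmost modifier is applied last and wins.
modifiers :: Endo LogSettings
modifiers = mconcat [Endo (setLogLevel Debug), Endo (setLogLevel Info)]

-- Applying the combined modifier to some default settings.
result :: LogSettings
result = appEndo modifiers (LogSettings Info)
```

Here result ends up with logLevel = Debug. An absent flag contributes mempty, the identity modifier, so it leaves the settings untouched.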

Or, in this case, I'm adding a --{language} flag for every value in a Language enumeration:

data Language = ...

langOptionParser :: Language -> Parser (Endo SomeSettings)
langOptionParser language =
  flag mempty (Endo $ adjustSomeSettingsForLanguage language)
    (  long  (showLanguage language)
    <> help ("Run for the  " <> showLanguage language <> " language")
    )

data Options = Options
  { oFoo :: Foo
  , oBar :: Bar
  , oSettings :: Endo SomeSettings
  }

optionsParser :: Parser Options
optionsParser = Options
  <$> ...
  <*> ...
  <*> (mconcat <$> sequenceA (map langOptionParser [minBound..maxBound]))

This one could be,

   <*> ...
-  <*> (mconcat <$> sequenceA (map langOptionParser [minBound..maxBound]))
+  <*> foldMap langOptionParser [minBound..maxBound]

In general, I'm finding it pretty useful to be able to fold, mconcat/sconcat, or <> any Parser of a Semigroup or Monoid value, as Gabriella indicated I would.
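
For comparison, base already ships the lifted instance on the Ap newtype in Data.Monoid, so the same fold can be written today without any new instance on Parser. The sketch below is an illustration under assumptions: foldMapLifted and positive are hypothetical names, and Maybe stands in for Parser only to keep the example dependency-free (the helper works for any Applicative).

```haskell
import Data.Monoid (Ap (..), Sum (..))

-- Fold with the lifted monoid: (<>) = liftA2 (<>), mempty = pure mempty.
-- Works for any Applicative f, so f could be optparse-applicative's Parser.
foldMapLifted
  :: (Foldable t, Applicative f, Monoid b)
  => (a -> f b) -> t a -> f b
foldMapLifted f = getAp . foldMap (Ap . f)

-- Hypothetical per-item "parser": succeeds on positive numbers only.
positive :: Int -> Maybe (Sum Int)
positive n
  | n > 0     = Just (Sum n)
  | otherwise = Nothing
```

With these definitions, foldMapLifted positive [1, 2, 3] is Just (Sum 6), an empty input gives Just (Sum 0), and any failing element makes the whole result Nothing.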

pcapriotti commented 1 year ago

Given that the Alternative instance also makes Parser a into a monoid (for general a, even), I'm not really convinced the ability to remove a single function call for the user is worth the potential confusion. But I don't have a strong opinion.

pbrisbin commented 1 year ago

Given that the Alternative instance also makes Parser a into a monoid

Hmm, this might be a bit above my head; do you mind elaborating? I get that aParser <|> bParser is a thing you can do to get some behavior (this-or-that), but how does it "turn [it] into a monoid"?

Gabriella also discusses this a bit:

You sometimes don’t want to implement the suggested Semigroup and Monoid instances when other law-abiding instances are possible. For example, sometimes the Applicative type constructor permits a different Semigroup and Monoid instance.

The classic example is lists, where the Semigroup / Monoid instances behave like list concatenation. Also, most of the exceptions that fall in this category are list-like, in the sense that they use the Semigroup / Monoid instances to model some sort of element-agnostic concatenation.

I view these “non-lifted” Monoid instances as a missed opportunity, because these same type constructors will typically also implement the exact same behavior for their Alternative instance, too, like this:

instance Alternative SomeListLikeType where
    empty = mempty

    (<|>) = (<>)

… which means that you have two instances doing the exact same thing

This seems to be specifically not that, as the proposed (<>) and (<|>) indeed have different behavior (this-and-that vs this-or-that).

worth the potential confusion

I'm not sure what confusion you mean exactly. Maybe I've internalized something that's unusual, but seeing (<>) as this-and-that vs (<|>) as this-or-that seems very intuitive to me.

pcapriotti commented 1 year ago

I didn't want to express anything particularly deep, just that (<|>) also determines a possible Monoid instance. So if I'm faced with an Alternative functor f applied to a monoid a, and the type f a happens to have a monoid instance, how am I supposed to know if this instance behaves like the Applicative or the Alternative instance? This is the confusion I was referring to.
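
The two candidate monoids can be seen side by side with the Ap and Alt wrappers from Data.Monoid in base. This is only a sketch: Maybe stands in for Parser (both are Alternative functors), and the names viaApplicative and viaAlternative are made up for the illustration.

```haskell
import Data.Monoid (Alt (..), Ap (..))

-- Monoid derived from Applicative: combine the results inside the functor.
viaApplicative :: Maybe [Int]
viaApplicative = getAp (Ap (Just [1]) <> Ap (Just [2]))

-- Monoid derived from Alternative: pick the first success.
viaAlternative :: Maybe [Int]
viaAlternative = getAlt (Alt (Just [1]) <> Alt (Just [2]))

-- The units differ too: pure mempty versus empty.
unitApplicative :: Maybe [Int]
unitApplicative = getAp mempty

unitAlternative :: Maybe [Int]
unitAlternative = getAlt mempty
```

Here viaApplicative is Just [1,2] while viaAlternative is Just [1], and the units are Just [] and Nothing respectively. Both are lawful monoids on the same type, which is exactly why an unwrapped instance leaves the reader guessing.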

pcapriotti commented 1 year ago

As a data point, in some other library I wrote I have a functor F which is morally always Alternative, but only Applicative in a special case. Unfortunately, this sort of thing cannot be expressed cleanly given the usual definition of those type classes, since Alternative is a subclass of Applicative. To resolve the issue, I defined an unconditional Monoid instance, and a conditional Alternative instance which behaves the same. So in that case it's more natural (and basically unavoidable) to identify the Monoid instance with the Alternative one, and not the Applicative.

pbrisbin commented 1 year ago

I didn't want to express anything particularly deep

Yup, just want to make sure I'm not missing something. (I also just find all this stuff fascinating, so thank you for engaging).

if I'm faced with an Alternative functor f applied to a monoid a, and the type f a happens to have a monoid instance, how am I supposed to know if this instance behaves like the Applicative or the Alternative instance?

I think you're saying that because,

instance Semigroup (Parser a) where
  (<>) = (<|>)

instance Semigroup a => Semigroup (Parser a) where
  (<>) = liftA2 (<>)

are both reasonable things to do, it is unclear to the user which one the library happens to be doing.

To that I would counter two points:

  1. The (<>) = (<|>) definition is not actually "reasonable". If/when the opinions expressed in Gabriella's post gain traction, this would be an anti-pattern. You would either not define Semigroup (since users can just use <|> and asum for the exact same behavior) or you would define a differently-behaving Semigroup because it is differently-behaving. A user can infer from the fact that there even are both instances, that they must have different behavior.
  2. The docs will show whether or not the Semigroup (Parser a) instance carries a Semigroup a constraint, which also tells the user which behavior the instance has.

I know you said you don't have a strong opinion, and neither do I (believe it or not). As you mentioned, it doesn't clean up too much. So I won't take up any more of your time. Feel free to close, or give me the go-ahead to make the PR -- up to you.

HuwCampbell commented 1 year ago

G'day,

I thought I'd chime in. I'm generally of the belief that instances should really only exist if there's one canonical / possible version of them. I'm also pretty skeptical of using foldMap in places where there is a monad or applicative involved because one has to think hard about what that effect does.

For example:

The other day I wrote something like this code, actually thinking about the post above:

let
  melt :: a -> Maybe [a]
  melt = _

  apcat = liftA2 (<>)
in
  foldMap melt as `apcat` pure [a] `apcat` foldMap melt bs

Here, Nothing represents some failure / leave it alone.

The idea was for a list of as and bs, if they can all be melted I can build up the final list. The problem here was that the Monoid instance for Maybe [a] didn't do what I needed. It lifts a semigroup to a monoid and mempty is Nothing, instead of Just []. This meant I would get Nothing if either list was empty, even though I really did want the results from the rest.

There are two functions in the Scala cats library that would do the trick. The first is flatTraverse, which translated is:

flatTraverse
  :: (Monad m, Traversable m, Applicative f)
  => (a -> f (m b)) -> m a -> f (m b)
flatTraverse f xs
 = join <$> traverse f xs

and the second is foldMapM (this version is from RIO):

foldMapM
  :: (Monad m, Monoid w, Foldable t)
  => (a -> m w)
  -> t a
  -> m w
foldMapM f = foldlM
  (\acc a -> do
    w <- f a
    return $! mappend acc w)
  mempty

Obviously, the Monad constraint can be relaxed to Applicative if one is happy with less efficiency in many cases:

foldMapA
  :: (Monoid b, Traversable t, Applicative f) =>
     (a -> f b) -> t a -> f b
foldMapA f as = fold <$> traverse f as

I'm not really trying to convince you to use these functions everywhere, but I do believe that sometimes we can make things a bit too polymorphic, which can cause bugs like the one I had above if there aren't proper laws in place or there's more than one way to write the instance.
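
The difference between the two folds can be reproduced with a small base-only sketch. The melt function here is a hypothetical stand-in (the original was left as a typed hole): even numbers melt into two halves and odd numbers fail. foldMapA is copied from the definition above.

```haskell
import Data.Foldable (fold)

-- foldMapA as given earlier in this comment.
foldMapA
  :: (Monoid b, Traversable t, Applicative f)
  => (a -> f b) -> t a -> f b
foldMapA f as = fold <$> traverse f as

-- Hypothetical melt: even numbers split into two halves, odd numbers fail.
melt :: Int -> Maybe [Int]
melt n
  | even n    = Just [n `div` 2, n `div` 2]
  | otherwise = Nothing

-- Uses Maybe's own Monoid: mempty is Nothing, and Nothing is skipped by (<>).
viaMaybeMonoid :: [Int] -> Maybe [Int]
viaMaybeMonoid = foldMap melt

-- Uses the Applicative effect: an empty input is Just [], a failure is Nothing.
viaApplicative :: [Int] -> Maybe [Int]
viaApplicative = foldMapA melt
```

On an empty list, viaMaybeMonoid returns Nothing while viaApplicative returns Just []. On [2, 3], viaMaybeMonoid silently drops the failure and returns Just [1,1], whereas viaApplicative returns Nothing. The two agree only when the input is non-empty and every element melts.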

chris-martin commented 1 year ago

Would you accept a documentation PR -- to the Options.Applicative module documentation, the readme tutorial, or both -- adding an example to demonstrate the use of fmap fold . sequenceA to combine parsers?