Lysxia / generic-data

Generic data types in Haskell, utilities for GHC.Generics
https://hackage.haskell.org/package/generic-data
MIT License
44 stars 9 forks source link

gshowsPrec doesn't match `deriving Show` behavior in certain cases #30

Closed RyanGlScott closed 4 years ago

RyanGlScott commented 4 years ago

While comparing generic-data's gshowsPrec to how deriving Show works, I noticed that there were a handful of cases where the two have differing behavior. I've distilled a small test suite to demonstrate these discrepancies:

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
module Main where

import Control.Monad
import Generic.Data
import GHC.Generics

data T1 = MkT1 { (##) :: () }
  deriving (Generic, Show)

data T2 = (:!:) () ()
  deriving (Generic, Show)

data T3 = MkT3 { unT3 :: () }
  deriving (Generic, Show)

testGshowsPrec :: (Generic a, GShow0 (Rep a), Show a)
               => Int -> a -> IO ()
testGshowsPrec p x = do
  let sp  = showsPrec p x ""
      gsp = gshowsPrec p x ""
  unless (sp == gsp) $ do
    putStrLn "Mismatch!"
    putStrLn $ "  showsPrec: "  ++ showsPrec p x ""
    putStrLn $ "  gshowsPrec: " ++ gshowsPrec p x ""

main :: IO ()
main = do
  testGshowsPrec  0 (MkT1 { (##) = () })
  testGshowsPrec  0 (() :!: ())
  testGshowsPrec 11 (MkT3 { unT3 = () })

When run, this produces:

Mismatch!
  showsPrec: MkT1 {(##) = ()}
  gshowsPrec: MkT1 {## = ()}
Mismatch!
  showsPrec: (:!:) () ()
  gshowsPrec: :!: () ()
Mismatch!
  showsPrec: (MkT3 {unT3 = ()})
  gshowsPrec: MkT3 {unT3 = ()}
RyanGlScott commented 4 years ago

After a quick search, here are my findings on why each of these bugs happen:

  1. MkT1's bug is actually a bug in the show-combinators library. Specifically, a bug in the showField function:

    -- | Show a single record field: a field name and a value separated by @\'=\'@.
    showField :: String -> PrecShowS -> ShowFields
    showField field showX =
     showString field . showString " = " . showX 0

    This does not parenthesize field if it is a symbolic name like ##. It should check for symbolic names using a function like starsVarSym.

  2. MkT2's bug is essentially the same underlying issue as in MkT1, but at a constructor level. The culprits are these bits of code:

    https://github.com/Lysxia/generic-data/blob/cd19cb5e4efb32352546929bc038f5ca3b846f56/src/Generic/Data/Internal/Show.hs#L88-L90

    https://github.com/Lysxia/generic-data/blob/cd19cb5e4efb32352546929bc038f5ca3b846f56/src/Generic/Data/Internal/Show.hs#L96-L98

    These incorrectly assume that prefix names never need to be parenthesized and that infix names always need to be parenthesized. In reality, prefix names must be parenthesized if they are symbolic (e.g., the :!: in (:!:) () ()), and infix names should be surrounding in backticks, not parentheses, if they are alphanumeric (e.g., the MkT2 in () `MkT2` ()).

    Fortunately, checking if a data constructor name is symbolic is quite easy: just check if it begins with :.

  3. MkT3's bug is another bug in show-combinators. This time, the bug is in the showRecord function:

    -- | Show a record. The first argument is the constructor name.
    -- The second represents the set of record fields.
    showRecord :: String -> ShowFields -> PrecShowS
    showRecord con showFields _ =
     showString con . showSpace . showChar '{' . showFields . showChar '}'

    This is missing a call to showParen (d > appPrec), where d is currently a wildcard pattern in showRecord's definition.

RyanGlScott commented 4 years ago

I sketched out a quick solution for the bugs above. Here is the patch you need to make to show-combinators:

diff --git a/src/Text/Show/Combinators.hs b/src/Text/Show/Combinators.hs
index bbb3993..462f9ee 100644
--- a/src/Text/Show/Combinators.hs
+++ b/src/Text/Show/Combinators.hs
@@ -48,6 +48,7 @@ module Text.Show.Combinators
   , (&|)
   ) where

+import Data.Char
 import Text.Show

 -- | Type of strings representing expressions, parameterized by the surrounding
@@ -184,13 +185,13 @@ type ShowFields = ShowS
 -- | Show a record. The first argument is the constructor name.
 -- The second represents the set of record fields.
 showRecord :: String -> ShowFields -> PrecShowS
-showRecord con showFields _ =
-  showString con . showSpace . showChar '{' . showFields . showChar '}'
+showRecord con showFields d = showParen (d > appPrec)
+  (showString con . showSpace . showChar '{' . showFields . showChar '}')

 -- | Show a single record field: a field name and a value separated by @\'=\'@.
 showField :: String -> PrecShowS -> ShowFields
 showField field showX =
-  showString field . showString " = " . showX 0
+  showParen (isSymVar field) (showString field) . showString " = " . showX 0

 infixr 8 .=.

@@ -226,3 +227,14 @@ showSpace = (' ' :)
 appPrec, appPrec1 :: Int
 appPrec = 10
 appPrec1 = 11
+
+isSymVar :: String -> Bool
+isSymVar ""    = False
+isSymVar (c:_) = startsVarSym c
+
+-- Taken from the ghc-boot-th package to avoid incurring an extra dependency
+startsVarSym :: Char -> Bool
+startsVarSym c = startsVarSymASCII c || (ord c > 0x7f && isSymbol c) -- Infix Ids
+
+startsVarSymASCII :: Char -> Bool
+startsVarSymASCII c = c `elem` "!#$%&*+./<=>?@\\^|~-"

And here is the patch for generic-data:

diff --git a/src/Generic/Data/Internal/Show.hs b/src/Generic/Data/Internal/Show.hs
index 9e39db8..04a2769 100644
--- a/src/Generic/Data/Internal/Show.hs
+++ b/src/Generic/Data/Internal/Show.hs
@@ -82,22 +82,34 @@ class GShowC p c f where
 instance GShowFields p f => GShowC p ('MetaCons s y 'False) f where
   gPrecShowsC p name fixity (M1 x)
     | Infix _ fy <- fixity, k1 : k2 : ks <- fields =
-      foldl' showApp (showInfix name fy k1 k2) ks
+      foldl' showApp (showInfix cname fy k1 k2) ks
     | otherwise = foldl' showApp (showCon cname) fields
     where
-      cname = case fixity of
-        Prefix -> name
-        Infix _ _ -> "(" ++ name ++ ")"
+      cname = surroundConName fixity name
       fields = gPrecShowsFields p x

 instance GShowNamed p f => GShowC p ('MetaCons s y 'True) f where
   gPrecShowsC p name fixity (M1 x) = showRecord cname fields
     where
-      cname = case fixity of
-        Prefix -> name
-        Infix _ _ -> "(" ++ name ++ ")"
+      cname = surroundConName fixity name
       fields = gPrecShowsNamed p x

+surroundConName :: Fixity -> String -> String
+surroundConName fixity name =
+  case fixity of
+    Prefix
+      | isSymName -> "(" ++ name ++ ")"
+      | otherwise -> name
+    Infix _ _
+      | isSymName -> name
+      | otherwise -> "`" ++ name ++ "`"
+  where
+    isSymName = isSymDataCon name
+
+isSymDataCon :: String -> Bool
+isSymDataCon (':':_) = True
+isSymDataCon _       = False
+
 class GShowFields p f where
   gPrecShowsFields :: p (ShowsPrec a) -> f a -> [PrecShowS]

Before I make PRs out of these, however, there is a design question that I'm unclear on: I'm a bit inconsistent with how I parenthesize/backtick names of things. In showField, for instance, I assume that the field name is unparenthesized and parenthesize it internally. On the other hand, the showCon/showInfix functions (whose call sites in Generic.Data.Internal.Show I modify) don't think about parenthesization at all and just assume that the caller takes care of the parentheses.

This seems woefully inconsistent to me, and I feel like I should adopt a single style throughout show-combinators/generic-data. Which style would you prefer @Lysxia?

Lysxia commented 4 years ago

Thanks a lot for digging into this!

I was just wondering about the same question. I'm in favor of having users of show-combinators deal with it explicitly, and push that logic to generic-data. So only the showRecord patch would remain in show-combinators.

RyanGlScott commented 4 years ago

Good to know. In that case, I'll start by submitting a PR to show-combinators (Lysxia/show-combinators#4), and then wait for a new Hackage release with that fix so that I can update the generic-data test suite with a regression test like T3.

RyanGlScott commented 4 years ago

31 should address the remaining parts of this issue.