mauke / data-default

A class for types with a default value

Remove most instances #20

Open mitchellwrosen opened 5 years ago

mitchellwrosen commented 5 years ago

My opinion: def is a useful concept for blobs of optional parameters, configs, and perhaps other domain-specific things, but not for numeric types, lists, Maybe, etc.

The situation today, from my vantage point, is that people either avoid this type class entirely (e.g. https://github.com/google/proto-lens/issues/194) or use it with caveats like "try not to abuse it".

I propose we try to correct this by making a major version bump to 2.0 and removing most instances. The only ones I believe belong are the derived tuple instances like

instance (Default a, Default b) => Default (a, b)
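For reference, a minimal sketch of the class and that instance (not the exact data-default source, which also carries a Generics-based default method):

class Default a where
  def :: a

instance (Default a, Default b) => Default (a, b) where
  def = (def, def)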

What do you think? Is this a feasible or wanted change? Thanks :)

moll commented 5 years ago

I find zero to be a relatively sane default, and for the rare record-type cases where it's not applicable, overriding the derived def implementation is a keystroke away. I'm more likely to stumble upon https://github.com/mauke/data-default/issues/18 than to ask for fewer defaults.

Doesn't the argument against Data.Default primarily consist of it not being tied to Monoids?

mitchellwrosen commented 5 years ago

@moll First just to clarify - are you a maintainer of this package or just throwing in your 2c?

moll commented 5 years ago

Not the maintainer. A mere user.

mitchellwrosen commented 5 years ago

Ah, gotcha. Well, to elaborate a bit, lawless type classes are already suspicious. The only benefit is reusing a symbol name.

One example of a decent lawless type class is Show. I'd really hate to have to type Maybe.show, Either.show, etc.

def has the potential to be another useful common symbol name for "default thing", but there is a huge cost to this. Ad-hoc polymorphism makes refactoring dangerous, because the compiler will not tell you when instance resolution switches from one type to another.

A frustrating (and somewhat uncommon) occurrence is using Foldable functions at the Maybe type, until at some point a refactoring means they start getting called at the (a, b) instance instead. That's really never what you want.
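(A made-up illustration: a field refactored from Maybe Int to (String, Int) keeps compiling, but length now folds over only the second component of the pair.)

countResults :: Maybe Int -> Int
countResults = length        -- 0 or 1, as intended

countResults' :: (String, Int) -> Int
countResults' = length       -- always 1, via the Foldable ((,) String) instance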

def has the exact same danger, but it currently covers a ton of common types. I don't think the benefit of having an alternative name for False, 0.0, (), etc. is worth this potential hazard at all. So currently I just avoid this type class completely, and I recommend others do as well.

But there is a decent middle-ground: just have the type class exist on Hackage, as a public good, but let users define their own instances in their applications, since it's a useful name for a default blob of optional arguments. The danger of accidentally resolving to a /different/, /unintended/ default blob of optional arguments during a refactoring is low, but crucially, this relies on def not having many instances out of the box.
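Sketched with invented names (HttpOptions, runClient), and assuming only data-default's class is imported, that middle ground looks like this: the sole Default instance in the application is the one for its own options blob.

import Data.Default (Default (..))

data HttpOptions = HttpOptions
  { timeoutSeconds :: Int
  , retries        :: Int
  , userAgent      :: String
  }

instance Default HttpOptions where
  def = HttpOptions { timeoutSeconds = 30, retries = 3, userAgent = "my-app" }

-- Callers override only the fields they care about:
--   runClient def { retries = 10 }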

@ndmitchell since an old issue of yours was brought up, I would love to hear your thoughts on this :)

ndmitchell commented 5 years ago

To me a type class assigns a reusable name to a concept. Some concepts have laws, some have intuition, some have both. Of course, we'd like both, but I find those with only intuition better than those with only laws. I think the term "lawless typeclass" is bandied about in Haskell like an insult, when "conceptless typeclass" should be the real insult.

I think Default is a sensible concept, present in many other languages (e.g. C#). And in those languages, [], Maybe, Int, Double, Bool all have a default, and the awesome thing about this concept is that it's so well understood that absolutely everyone knows what those defaults are - that, to me, shows it's a conceptual typeclass.

There is a standard problem with type classes where you might think of a specific behaviour and get a general one - very true. In fact, at a previous $COMPANY I worked at, we banned show, because people were doing show on an Int to write an on-wire representation, then a refactoring made it a Maybe Int, and oh dear, we have Just 1 going out on the wire. Of all the examples you describe, def is the least likely to cause problems, since it is value only, not behaviour.
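(Sketching that hazard with invented names; sendOverWire stands in for the real transport.)

sendOverWire :: String -> IO ()
sendOverWire = putStrLn

userId :: Maybe Int                 -- previously: Int
userId = Just 1

main :: IO ()
main = sendOverWire (show userId)   -- still compiles; now sends "Just 1"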

If you add Default without instances for what is in base, then due to orphans effectively no library can ever add instances for Int. It's taking the type class, banning all the instances I want, then shoving it out in the world. I really dislike this idea. It would make the Default class useless for my purposes and I'd stop using it.

I do think the dependencies of data-default should only be base, and the instances supplied should only be for things in base. I think that would be a worthy 2.0.

joeyh commented 5 years ago

I would rather not have to learn any special cases about def, so would prefer it to always behave identically to mempty for core data types, while at the same time being able to define Default instances for my domain-specific types that I don't need to be Monoids.

Conversely, it does not seem to make sense to remove Default from anything that is a Monoid, because whatever trouble a user can get into with def they could just as well experience with mempty.

mitchellwrosen commented 5 years ago

@ndmitchell

If you add Default without instances for what is in base, then due to orphans effectively no library can ever add instances for Int. It's taking the type class, banning all the instances I want, then shoving it out in the world. I really dislike this idea. It would make the Default class useless for my purposes and I'd stop using it.

That's exactly the idea - that Default Int just doesn't exist, and never will. I have never needed this concept, so I would like it eliminated. This strictly increases the number of errors GHC will raise during a refactoring, which is my primary concern, since we're all refactoring our code all the time. The lawlessness of the type class is, well, a smell, but not itself a great reason to get rid of it.

Equipping a community type class with an instance is just such an important thing to get right in Haskell that every single one deserves much deliberation. Just today I was puzzling over why PartialOrd from lattices does not seem to agree with Ord in all cases, which seems odd and unintuitive.

So back to the concept of "default" - it exists in many other languages, but is it actually a useful concept to attach to a type? I have a "verbose" flag; okay, the correct default is False. But name the flag "dry-run" and now the correct default is True. Number of threads? 1. Hostname? "localhost". You get my point.
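(Concretely, with invented field names - the sensible default depends on the field, not the type, so a single Default Bool or Default Int can't be right for all of them:)

data Flags = Flags
  { verbose :: Bool    -- sensible default: False
  , dryRun  :: Bool    -- sensible default: True
  , threads :: Int     -- sensible default: 1
  , host    :: String  -- sensible default: "localhost"
  }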

The only time I've ever wanted to use the "default" concept is for large, cumbersome records of options, for power users to modify if they want - and writing the Default instances for those large records doesn't even benefit from Default instances for common types, unless every Int inside should default to 0!

In Haskell, given we don't have row types nor optional arguments, this type class is one of the cleanest ways of passing a default options blob in a consistent and intuitive way. The problem is all of these pesky instances for simple base types ;)

(Btw, I know our experiences differ, since you said the very instances I want to eliminate are the ones you find most useful about this type class. I respect that, I'm just elaborating more on my point).

treeowl commented 5 years ago

My opinion probably shouldn't hold too much weight, but I don't really see the need to have a Default class at all. Is it really too much trouble to define these default values completely separately? If you decide at some point that you want to change one of the defaults, refactoring is easier if you can give the new version a new name.

ocramz commented 5 years ago

As of today, data-default has 563 reverse dependencies (http://packdeps.haskellers.com/reverse/data-default). It may be that a number of them are stale, but there are also a number of prominent packages on that list, such as hakyll, hlint, xmonad, yesod etc.

ndmitchell commented 5 years ago

I have often needed Default Int and Default String. You can make them disappear for good, but then I'll switch to data-default-with-instances-for-base, because that is why I use Default. I agree that making def a common name for bags of arguments makes a lot of sense, but I don't see how Default Int harms that. I agree that defining a record of options so that every element defaults to def would be a terrible idea - so don't do that. I think "" is the only sensible default string. If you want a type that expresses network interfaces, we have a way to do that - newtypes! You can, of course, overuse def by passing it as the Bool for dryrun and verbosity, but aren't you railing against boolean blindness there as much as against def?
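(Sketch of the newtype suggestion, with Hostname invented and data-default's class assumed to be in scope:)

import Data.Default (Default (..))

newtype Hostname = Hostname String

instance Default Hostname where
  def = Hostname "localhost"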

benjamin-hodgson commented 5 years ago

I think Default is a sensible concept, present in many other languages (e.g. C#). And in those languages, [], Maybe, Int, Double, Bool all have a default, and the awesome thing about this concept is it's so well understood that absolutely everyone knows what those defaults are - that to me shows its a conceptual typeclass.

(I’m a C# programmer in my day job so I feel qualified to chime in here.) C#’s default exists because it’s the value that variables get if you don’t assign them a value. Every type has a default and programmers can’t override the default value for a type; operationally it’s "whatever you get when you zero out all of the type’s memory". For the vast majority of types the default is null; it’s only unboxed value types (structs) for which the default is semantically a value. (Not every struct has a meaningful default, eg you might want to implement Either<A, B> as a struct, but it has a default nonetheless.)

To the extent that we can agree that null was a bad idea, default was also a bad idea. It's an implementation detail leaking through because C# is an imperative language. Everyone knows what the defaults are because they have to know them to be productive, not because "every type has a default value" necessarily makes conceptual sense.

endgame commented 5 years ago

To those who are saying they value Default instances for Int, String, etc: would mempty be a more principled substitute? If I want to "default" something, it's usually because I don't want that value to affect anything when I combine it with other stuff, i.e., it's the mempty for some Monoid I'm probably already using.

duplode commented 5 years ago

@ocramz Another notable reverse dependency is diagrams, which relies on it for its with trick.

duplode commented 5 years ago

I feel this discussion might benefit from concrete examples. Illustrations of how you folks have been using Default Int and Default String in the wild would be much appreciated. The same goes for examples of how the polymorphism of def has bitten you by allowing bugs to slip by unnoticed. (Pinging @ndmitchell and @mitchellwrosen as the most vocal supporters of each of the camps here.)

mitchellwrosen commented 5 years ago

@duplode Sure, I can try, but I think the Github :+1: have spoken, this was just a bad take and I am happy to yield.

My argument against def's polymorphism is just an argument against ad-hoc polymorphism itself. It's exactly as dangerous as using mempty as an alias for Map.empty, since refactorings can go wrong. Is mempty wrong? Of course not... more on that in a moment.

To give a concrete example, consider some function like

foo :: Int -> IO ()

Later, I refactor foo to have the type

foo :: (Int -> IO ()) -> IO ()

and code that was previously written as

foo def

will continue to compile and run, but I would rather have been forced to hunt down all uses of foo.
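Here is a self-contained sketch of the hazard, with a minimal class and the relevant instances written out inline (data-default ships similar ones) so the mechanics are visible:

class Default a where
  def :: a

instance Default Int where def = 0
instance Default ()  where def = ()
instance Default a => Default (IO a)   where def = pure def
instance Default r => Default (e -> r) where def = const def

-- Before the refactoring, foo :: Int -> IO () and `foo def` meant `foo 0`.
-- After it, `foo def` still compiles: def is now a function that ignores
-- its Int and does nothing, instead of a type error at every call site.
foo :: (Int -> IO ()) -> IO ()
foo k = k 42

main :: IO ()
main = foo def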

The exact same thing could happen using any polymorphic function like mempty, which I also don't advise for exactly the same reason. You should write [] instead of mempty where possible.

The difference between mempty and def is that mempty comes as a package with mappend, forms an algebraic concept with laws, and allows us to write highly useful code that works for any monoid.
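(What the laws buy you, as a sketch: code like this works for any monoid precisely because mempty is a lawful identity for (<>).)

combineAll :: Monoid m => [m] -> m
combineAll = foldr (<>) mempty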

def-polymorphic code is suspicious, I'm willing to go as far as saying it's wrong, or at the very least, weird.

The one thing I find def useful for in Haskell is custom, domain-specific argument blobs, and nothing else. As mentioned above, it's a nice-ish way of working around the lack of built-in optional arguments and row types. This is a compromise: yes, refactorings can go wrong using def, but they are less likely to do so if there are not so many instances on the type class.

Adopting def, by convention, as the way we design APIs that take blobs of optional arguments would simplify the ecosystem overall, since libraries have to solve this problem somehow (usually by defining a one-off symbol like defaultOptions, or by embracing def).

All that said, clearly there is disagreement; just because I don't see a good reason to use def does not mean there isn't one. @ndmitchell has said multiple times that he finds uses for Default Int, Default String, etc.

So since we only have the one type class, and we all have to share it and get along, perhaps it shouldn't be violently refactored unless everyone is in complete agreement that it'd be a good idea :)

mitchellwrosen commented 5 years ago

@duplode And to respond to your other comment, diagrams' use of def is fine, and concordant with my proposal. I'm not saying let's axe the type class from Hackage altogether. It would yuck up diagrams' API quite a bit if you had to write defaultArrowOpts, defaultStrokeOpts, etc., so I appreciate its usage there.

duplode commented 5 years ago

@mitchellwrosen Thanks. This comment dug up from an old r/haskell thread is similar in spirit, being about the surprises instance Default r => Default (e -> r) might bring. I'm actually sympathetic towards the stance of leaving Default for domain specific things, though as you suggest in your final paragraphs there are other factors to consider before having it "violently refactored". (In any case, I would still like hearing more about the use cases of the instances for the types from base.)

moll commented 5 years ago

While I too find the function instance of Default to be in no way intuitive, I think some of the critiques of Default laid out above aren't so much the fault of def as they are pitfalls of relying on type inference (in the presence of polymorphism). Whenever a type with, for example, a Default, Num, or IsString instance is at play with no explicit type annotation, there's a risk of a semantically incompatible change in the callee's signature going unnoticed. These instances are, however, extremely convenient, especially in tests, where there are plenty of literals.
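(For example, with invented names: setThreshold used to take an Int; a refactoring makes it take a Double, and this call site silently re-types the literal instead of failing.)

setThreshold :: Double -> IO ()
setThreshold _ = pure ()

main :: IO ()
main = setThreshold 5    -- still compiles, now means 5.0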

ndmitchell commented 5 years ago

I agree with @mitchellwrosen about relying on type classes - but I think it's more harmful for things like show, in particular because show erases type information. If you send in an Int when you were expecting an OptionsPragma, you never find out later - you just get the wrong thing showing up in the output. I agree you aren't often polymorphic in def on a "large scale" - it tends to be polymorphism that functions more like a macro going over a handful of lines at most. But that's still convenient.

An example of where I use Default Int is cmdargs - people write {foo = def &= ...} so they can ignore what the def is. I appreciate it's not an earth-shattering inconvenience to write 0 there, but it is useful if you start with a default Int, then move to a Maybe Int, then finally to a [Int]. There are also internal commercial uses.
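If I remember the cmdargs API correctly, the pattern looks roughly like this (Build and its jobs field are invented):

{-# LANGUAGE DeriveDataTypeable #-}
import System.Console.CmdArgs

data Build = Build { jobs :: Int }
  deriving (Data, Typeable, Show)

buildMode :: Build
buildMode = Build { jobs = def &= help "Number of parallel jobs" }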

I would be very happy if Haskell copied Rust, moved Default into the base library, and provided instances for everything. Then we could really know that def is always the default set of options. See https://doc.rust-lang.org/std/default/trait.Default.html

alexanderkjeldaas commented 5 years ago

@mitchellwrosen

The exact same thing could happen using any polymorphic function like mempty, which I also don't advise for exactly the same reason. You should write [] instead of mempty where possible.

I think this is the core issue for me, because I disagree with this sentiment. Default is not about mathematical correctness, it's about programmer productivity. I don't want to maximize the number of potential, but unlikely to be problematic, compile errors during refactoring, because it's not time well spent.

I want a trade-off. If it's likely that there is an error during refactoring, I want the compiler to tell me. If it's highly unlikely, I don't want the compiler to bother. Default is a concept I use exactly to avoid the compiler complaining about things where, frankly, it should keep its mouth shut.

I use def mostly for derived Default-style defaults on my data structures, and in the cases where I use def I will never refactor Int -> IO () to (Int -> IO ()) -> IO (). I will happily hunt down this bug if it ever arises, because for me there is an expected gain in efficiency from using Default instead of type-specific initializers.
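(One way those derived defaults can look, assuming the Generics-based default method in recent data-default; Settings and its fields are invented:)

{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
import Data.Default (Default (..))

data Settings = Settings
  { cacheSize :: Int
  , tags      :: [String]
  } deriving (Show, Generic)

-- Empty instance: def is filled in generically, field by field.
instance Default Settings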

To me, this is yet another tool in the toolbox, with some trade-offs. If I need to edit 500 places in the code anyway, because I have used specific initializers, I'll use git grep and a Perl regexp replace - and then the supposed benefit disappears, since there is no longer a conscious decision being made either way.

(though #17 should be fixed for Default to be usable for me).