Open kytrinyx opened 2 years ago
The solution to Valentine's is feature-complete, so to speak. It matches on a custom ADT (Activity
), matches on fields (e.g. Korean
), uses binding patterns (not sure this is the right terminology) (kilometers
), and it uses guards. All this is very normal Haskell.
Here are a few variations of a translation of the F# exemplar solution. (The variation is in syntax alone.)
This is really nice.
My gut feeling is that the middle solution is just right at this point in the concept tree.
I do like the clean approach of the first one, but I think that we should probably wait to introduce language extensions. Similarly I think it's worth deferring the function definition syntactic sugar until later in the concept tree.
My gut feeling is that the middle solution is just right at this point in the concept tree.
I agree. The third one is not any more complicated (depending on their introduction to the language, a beginner might even find it simpler), but it is noisier.
I think that we should probably wait to introduce language extensions
While I agree on this, a warning: the language extensions are very diverse, and so are their appropriateness, need, and difficulty level. Intuitions from other languages about them do not necessarily transfer. To paraphrase Alexis King: the question is less «is introducing language extensions a good idea?» and more «is using this language extension a good idea?».
I think it's worth deferring the function definition syntactic sugar until later in the concept tree.
Many Haskell tutorials start out with the sweet version and introduce case
expressions only later. (Case in point: LYAH has them at the bottom of the page on pattern matching.)
I think it's safe to introduce them together. Wouldn't know when to do it otherwise either: PM2 is already kinda heavy, and I do not feel like this syntactic sugar deserves its own concept node.
The third one is not any more complicated [...] but it is noisier.
Yes, that's my sense, exactly. The three actually feel really similar to me in terms of difficulty, but I think the middle example strikes a good balance.
To paraphrase Alexis King: the question is less «is introducing language extensions a good idea?» and more «is using this language extension a good idea?».
Yepp, that makes sense.
Cool. So, if @ErikSchierboom also agrees that the middle example is the way to go, I think we'd be good to kick off work on the Valentine's Day exercise itself.
@MatthijsBlom Have you given any thought yet to how you'd introduce Algebraic Data Types?
I think we'd be good to kick off work on the Valentine's Day exercise itself.
It looks like a straightforward translation of F#'s instructions.md
would be good.
@MatthijsBlom Have you given any thought yet to how you'd introduce Algebraic Data Types?
Working on it right now. (On that whole graph section actually.)
It occurs to me that Maybe
, Either
, tuples, … do not actually depend on Simple Pattern Matching.
It occurs to me that Maybe, Either, tuples, … do not actually depend on Simple Pattern Matching.
So your saying you wouldn't use pattern matching when working with them?
Cool. So, if @ErikSchierboom also agrees that the middle example is the way to go, I think we'd be good to kick off work on the Valentine's Day exercise itself.
I agree!
The solution to Valentine's is feature-complete, so to speak. It matches on a custom ADT (Activity), matches on fields (e.g. Korean), uses binding patterns (not sure this is the right terminology) (kilometers), and it uses guards. All this is very normal Haskell.
Which of these concepts will be prerequisites and which of these concepts will be newly introduced concepts?
@MatthijsBlom Do you have some snippets of code as starting points for discussion for some of the earlier concepts in the graph here (First Maybe
, Either
, ADTs
, but also any of the "etc" concepts that you have on the right hand side as well, if you've been thinking about that).
So your saying you wouldn't use pattern matching when working with them?
I would regularly, but a beginner need not immediately. The utility functions in their respective modules are sufficient for everything, if in a slightly different way. Seems fine practice in composing functions.
...But that's not really what that was about: Maybe
, Either
, and (,)
will not be matched against until Pattern Matching Proper. Everything else about them does not depend on pattern matching on literals, which is what Simple Pattern Matching is about.
I tried to say that the graph might look like this instead:
Simple Pattern Matching
|
Maybe Either | ... (tuples, lists?)
\ | | /
Algebraic Data Types
Which of these concepts will be prerequisites and which of these concepts will be newly introduced concepts?
Guards are part of Simple Pattern Matching; the rest will be introduced in Pattern Matching Proper.
@MatthijsBlom Do you have some snippets of code as starting points for discussion for some of the earlier concepts in the graph
Not yet, but these are the kinds of signatures I'm thinking of:
binarySearch :: Array -> Value -> Maybe Index
compile :: Code -> Either CompileError Program
ADTs: the data type declarations in Valentine's Day contain most of this. The only thing that is really missing is polymorphism: data Maybe a = Nothing | Just a
(the lowercase a
, a type variable).
Tuples are plumbing, so I do not expect appealing code. However, the likes of zip :: [a] -> [b] -> [(a, b)]
and lookup :: Eq a => a -> [(a, b)] -> Maybe b
use them.
Looks like lists will get their own node early on. I haven't checked this, but I imagine that lots of material from other tracks on lists can be reused. (Expecially Purescript, Elm.)
Do you have some snippets of code as starting points for discussion for some of the earlier concepts in the graph
As a quite rusty Haskell programmer, code samples would be great, yes. I must admit that things don't readily "click" in my mind currently, but I'm sure I'll get there. I think the code samples will be invaluable.
This is taking way longer than expected, and @pwadsworth is being productive in my absence, so I tried to write down my thoughts here below.
Here is an attempt at a somewhat thorough outline of things related to pattern matching and data types in Haskell. Explanations are omitted, as are obviously 'advanced' concepts like GADTs. Some such concepts however are included, simply because it was easier for me to include than to omit them.
case
expressions:
f x =
case x of
pattern -> expression
Function definition syntactic sugar (compiles to case
)
f pattern = expression
(Please disregard GitHub's broken highlighting of pattern
.)
x = case someExpression of
-- Wildcard pattern
_ -> expression
-- Binding pattern
name -> expression
-- Regular constructor pattern
Constructor pat ter ns -> expression
-- (Named) "as" pattern
name@pattern -> expression
-- Special syntax reused as pattern syntax
5 -> expression -- integers
-3.7 -> expression -- fractional numbers
"Hello!" -> expression -- strings
[x, y, z] -> expression -- lists
(x, y) -> expression -- tuples
-- Bang pattern
!pattern -> expression
-- Irrefutable pattern
~pattern -> expression
-- Guards
pattern | booleanExpression -> expression -- boolean guard
| pattern <- expression -> expression -- pattern guard
| let name = expression -> expression -- local binding
| guard, guard, guard -> expression -- sequence of guards
-- "data declaration"
data TypeConstructor typeParameters =
DataConstructor fi El Ds
| DataConstructor' Fi eld S
-- "newtype declaration"
newtype TypeConstructor typeParameters =
DataConstructor Field
-- "type declaration"
type TypeConstructor typeParameters = someType
-- deriving clause
data T a b c = C a b c deriving (classes)
newtype N a b c = N a b c deriving (classes)
Of note:
data
, newtype
declares un-lifted types.type
does not provide type-safety. (To the contrary, in practice!)TBA
Right now I do not feel need to fill in this section. See below for reasoning.
Note: while they look as similar to each other as some other constructs shared between the languages, the records of Haskell and Elm are not to be confused: they are different beasts.
Bool
, Maybe
, Either
(,)
, (,,)
, etc.)[]
)Each of these except Bool
are parametric types, i.e. not themselves concrete types.
Maybe
, []
, (,)
, etc. are type constructors.
False
, Just
, []
, (:)
, (,)
, etc. are data constructors
Note: sadly tuples and lists are not great examples, because their syntax is special on many levels (syntactic sugar + weird names of both type and data constructors).
Each of these types can be worked with, even without (explicit) pattern matching, using the utility functions available from dedicated standard library modules.
I propose we do not teach
I have marked these with 🛑 in the condensed list above.
In addition, I wouldn't be too sad if we skip
newtype
declarations as they are conceptually (almost) redundant. Seems cheap to include them though.type
declarations / type synonyms as they are conceptually anti-interesting, not that common, and potentially confusing. Reasons to include them anyway: cheap, and the very common String
is (very noticeably) a type synonym.deriving
clauses because type classes have not been introduced yet. They are very common though, so it would be nice if we could at least say something about them.I have marked these 😢 in the condensed list above.
Now, assuming the concept graph proposed earlier, …
…I was thinking
Concept | Node |
---|---|
case expressions |
Simple PM |
Wildcard patterns | Simple PM |
Special syntax reused as pattern syntax | Simple PM |
function definition syntactic sugar | Simple PM or PM Proper[^delay-func-pm] |
Binding patterns | Simple PM or PM Proper[^bigu] |
boolean guards | Simple PM[^gu-sim] or PM Proper[^bigu] |
type variables + parametric types | example nodes + ADTs |
data constructors | example nodes + ADTs |
type constructors | example nodes + ADTs |
data declarations |
ADTs |
newtype declarations |
ADTs |
type declarations / type synonyms |
ADTs |
deriving |
ADTs |
Regular constructor patterns | PM Proper |
"As" patterns | PM Proper |
[^delay-func-pm]: Consider delaying introduction of function definition pattern matching syntactic sugar, because beginners often get stuck in the habit of using the sugar, whereas the first instinct should be to reach for case
.
[^bigu]: One generally guards on bindings, so we should introduce binding patterns no later than we introduce guards.
[^gu-sim]: Beginners often reach for series of (==)
comparison guards instead of case
. If introduction of guards is postponed, there is no danger of this: in more general settings (i.e. those in PM Proper), (==)
guards fail to be applicable.
I realize these ideas conflict with ongoing work. What do you think about the above, @pwadsworth?
If it helps, I can produce some design.md
s. I think I finally understand why those exist; it wasn't obvious to me before.
@MatthijsBlom Wow, that is a lot of information! As mentioned, we (the Exercism team) would like to have the next step be some sample code for the various concepts. @pwadsworth would you be interested in writing some sample code for the various concepts?
edit: lets focus on 1 concept at a time, preferrably the one with the least amount of prerequisites
I'm guessing you mean code samples for the Example Nodes, as a Simple Pattern Matching has already been merged.
lets focus on 1 concept at a time, preferrably the one with the least amount of prerequisites
That would be (,)
(2-tuples). Not because it is the simplest, but because it is the most familiar (to those coming from other languages, anyway). Next up would be the similarly simple Maybe
, followed by its generalization Either
.
Anyway, I failed at this. What follows is the same treatment for all three of these types. If this is bothersome, I can edit some out.
I had previously published a few solutions that 'use' Maybe
, Either
, and tuples.
Maybe
Maybe
Maybe
Either
Maybe
(,)
(PureScript and Haskell are closely related dialects; the few differences do not matter here.)
Some observations on the use of Maybe
/ Either
/ (,)
values in these solutions:
<$>
, <*>
, and >>=
.(,)
is entirely intermediate, for immediate consumption by fromList
.All this is normal in Haskell (and, I suspect, in PureScript as well).
(Are there other possible 'uses', besides construction and inspection?)
We tend to avoid creating functions that take these types as arguments. Most cases by far are covered by plumbing functions like <$>
and (derivatives of) maybe :: b -> (a -> b) -> Maybe a -> b
, either :: (a -> c) -> (b -> c) -> Either a b -> c
, and uncurry :: (a -> b -> c) -> (a, b) -> c
.
All three of Maybe
, Either
, and (,)
are plumbing-like (in practice), and (,)
especially so.
I previously mentioned that a typical type signature with Maybe
would be binarySearch :: Ord a => Array a -> a -> Maybe Index
. Indeed, this is my solution:
As expected, the actual implementation hardly adds to the Maybe
story: the signature already tells it all.
While the above examples may show typical use of Maybe
/ Either
/ (,)
, they may well be not very instructive. So here are rough adaptations of Elm's [Role Playing Game][elm-role-playing-game], [Go][elm-go], and [Tisbury Treasure Hunt][elm-tisbury-treasure-hunt]. (I have not tried to restrict myself to already-introduced concepts, or any specific style.)
Maybe
)Either
)(,)
)I'm having some trouble coming up with examples of tuple use that are both real and interesting / nontrivial / notable / something. Maybe their use in State
?
State
Tuples are used here (as they are always) to group multiple values into one so that they can be returned together. Tuples are the tool for grouping multiple values into one without attaching any extra meaning to this grouping.
@ErikSchierboom are there specific kinds of sample code still missing?
[ps-collatz-conjecture]: https://exercism.org/tracks/purescript/exercises/collatz-conjecture/solutions/MatthijsBlom "Matthijs Blom's solution to Collatz Conjecture in PureScript" [ps-pascals-triangle]: https://exercism.org/tracks/purescript/exercises/pascals-triangle/solutions/MatthijsBlom "Matthijs Blom's solution to Pascals Triangle in PureScript" [ps-largest-series-product]: https://exercism.org/tracks/purescript/exercises/largest-series-product/solutions/MatthijsBlom "Matthijs Blom's solution to Largest Series Product in PureScript" [hs-nucleotide-count]: https://exercism.org/tracks/haskell/exercises/nucleotide-count/solutions/MatthijsBlom "Matthijs Blom's solution to Nucleotide Count in Haskell" [hs-perfect-numbers]: https://exercism.org/tracks/haskell/exercises/perfect-numbers/solutions/MatthijsBlom "Matthijs Blom's solution to Perfect Numbers in Haskell" [hs-etl]: https://exercism.org/tracks/haskell/exercises/etl/solutions/MatthijsBlom "Matthijs Blom's solution to Etl in Haskell" [elm-role-playing-game]: https://exercism.org/tracks/elm/exercises/role-playing-game "Elm track: Role Playing Game" [elm-go]: https://exercism.org/tracks/elm/exercises/go "Elm track: Go" [elm-tisbury-treasure-hunt]: https://exercism.org/tracks/elm/exercises/tisbury-treasure-hunt "Elm track: Tisbury Treasure Hunt"
@MatthijsBlom Sorry that I've not yet replied. I'm incredibly swamped right now :(
Sorry for the delayed reply. I have been super busy and haven't had any time to dedicate here. I expect to have more time available in late November and December. If we settle on the way forward before then I'll be able to write several exercises and related docs.
@MatthijsBlom: IRT conflict with what we have already done, I do not have any issue adjusting to improve the learning path, but what you proposed has many concepts in each node. I thought the idea was to keep the number of concepts to one or two per node to make each a small incremental step for students. I can see benefits in clarity of exposition, and nodes being self-contained though, so I have no problems with expanding the nodes already written if the Exercism team wants to take that route. I do have a couple minor shown below in bold on where some concepts are better introduced.
@ErikSchierboom: Assuming the above, here is what I propose to do for each.
Concept | Node | Comment |
---|---|---|
case expressions | ADT | (1) Used in valentines-day (VD). I can add a more detailed explanation in the introduction |
Wildcard patterns | ADT | Same as (1) |
Special syntax reused as pattern syntax | Simple PM | @MatthijsBlom Not sure what you mean by this |
function definition syntactic sugar | Simple PM or PM Proper1 | PM Proper |
Binding patterns | Simple PM or PM Proper2 | PM Proper |
boolean guards | Simple PM3 or PM Proper2 | PM Proper |
type variables + parametric types | example nodes + ADTs | Not sure about this one. It could be its own node or added as part of Types or Type Classes (proposed future nodes) |
data constructors | example nodes + ADTs | Covered in ADTs (VD). |
type constructors | example nodes + ADTs | Can add to ADTs and make a separate exercise. Code Example 1 to follow. |
data declarations | ADTs | Covered in VD. |
newtype declarations | ADTs | Can add to ADTs and make a separate exercise. Code Example 2 to follow. |
type declarations / type synonyms | new Types node | Code Example 3 to follow. |
deriving | ADTs | Recommend leaving for a later node on type classes |
Regular constructor patterns | PM Proper | PM Proper |
"As" patterns | PM Proper | PM Proper |
I thought the idea was to keep the number of concepts to one or two per node to make each a small incremental step for students.
This is spot on. We don't want many concepts explained in a single exercise. If we build an exercise and it turns out that there are many concepts it introduces, we should then introduce more exercises (which will serve as prerequisites) to introduce some of those concepts.
So what would be a logical next step? We have basics
, numbers
, booleans
, pattern-matching-literals
and algebraic-data-types
as concepts.
I thought the idea was to keep the number of concepts to one or two per node
It cannot be that simple, as concepts are often neither atomic nor uniformly granular. It does not make as much sense as I would like to speak of numbers of concepts; one should instead think of concept weights.
Of course, thinking of weights has problems, chief among them that it is subjective. Still, I do not see a better way of thinking about it.
So what would be a logical next step? We have […],
pattern-matching-literals
andalgebraic-data-types
as concepts.
Neither of these cover single concepts (even though their names suggest they do):
pattern-matching-literals
treats
algebraic-data-types
treats
data
declarationscase
expressionsAs illustrated by my table, if it were up to me the syntactic sugar would be postponed, the binding patterns would too (as absent both sugar and case
they stop making sense), guards would (maybe) be postponed, and data
declarations would be separated from pattern matching (as, contra Peter, I think they are separate (and both heavy) concepts).
Which is to say: I don't think we have these yet. (Unless it is decided that we actually do and should move on.)
So what would be a logical next step?
A bit off-topic, but: lists (excluding pattern matching on them) could be introduced independently, and _ where { }
and let { } in _
could use early introduction. Maybe more; can't think of anything right now.
Regarding the pattern matching concept-cluster: assuming the current repo state, I don't know.
Going through @pwadsworth's table:
Concept Node Comment Special syntax reused as pattern syntax Simple PM @MatthijsBlom Not sure what you mean by this
See the large code block listing all the kinds of patterns above (page-search -- Special syntax reused as pattern syntax
). Presently this is treated by the pattern-matching-literals
(i.e. "Simple PM") node, which I agree with.
Concept Node Comment function definition syntactic sugar Simple PM or PM Proper1 PM Proper Binding patterns Simple PM or PM Proper2 PM Proper boolean guards Simple PM3 or PM Proper2 PM Proper
Are you sure? All of these are presently treated by pattern-matching-literals
, which you made.
Concept Node Comment case expressions ADT (1) Used in valentines-day (VD). I can add a more detailed explanation in the introduction Wildcard patterns ADT Same as (1)
Do you mean for "ADTs" to come after "PM Proper"? The exemplar Valentine's Day solution uses guards and binding patterns. (I'm assuming you mean to keep Valentine's Day's association with algebraic-data-types
.)
Concept Node Comment type variables + parametric types example nodes + ADTs Not sure about this one. It could be its own node or added as part of Types
orType Classes
(proposed future nodes)
This seemed to me like a concept hard to explain without lots of examples. But since we were going to introduce Maybe
, Either
, (,)
, etc. anyway, I thought why not use them to soft-introduce this concept? That way, later on we can tell the student «see, you've seen this before!» before generalizing. For an example of how I'd tackle the first half of this strategy, see #1106.
Concept Node Comment type declarations / type synonyms new Types node Code Example 3 to follow.
Are you sure? I mean declarations of the form type Name = Type
. Seems excessively light for a node.
Concept Node Comment deriving ADTs Recommend leaving for a later node on type classes
I agree, but we cannot avoid deriving
completely: it is too common, and necessarily present in exercise stubs. We should at least note something like «we'll explain fully what they do when we cover type classes, but for now understand that deriving
clauses like this ensure we can use the ==
operator and that we can show our values on the screen».
Code Example 1 to follow. Code Example 2 to follow. Code Example 3 to follow.
@pwadsworth You forgot to include these.
what you proposed has many concepts in each node.
Guilty as charged. However, I have tried to reasonably distribute weight among the nodes:
case
expressions (or function definition syntactic sugar) are the only heavy thingy in "Simple PM", the rest all being light.data
declarations are heavy; the rest in "ADTs" is light or has been heavily foreshadowed in the "Example Nodes".@
patterns are light; there is room left for some or all of the light stuff marked with "or" – which is fortuitous as I would like to delay two of them.I thought I'd mentioned it somewhere, but I cannot find it anymore: I have thought about moving guards into their own node. It might keep students from developing bad habits a bit, and make the rest of the pattern matching nodes lighter. It would also allow to go more in-depth on guards.
@MatthijsBlom Thank you for this in-depth exploration into your thinking on the subject.
Let's start with the exercise for Pattern Matching Proper, making the assumption that Algebraic Data Types, Maybe, and Either have all been explained before.
Then let's work backwards from there. If we find that it becomes necessary, we can always create additional prerequisite exercises later.
If I recall correctly, we settled on using Valentines Day for the story for Pattern Matching Proper.
@MatthijsBlom @pwadsworth Does this sound like it will work?
Does this sound like [using Valentine's Day for the story for Pattern Matching Proper] will work?
To me it does. As I noted before, Valentine's Day is 'feature complete': it has (the most common) patterns, boolean guards, and a data
declaration to interpret.
hey folks, sorry to jump in here but I'm interested in helping out with getting Haskell learning mode + concept exercises built out. Are there any clear next bits of work that I could tackle or other preparation work I could help with?
It seems like we still need some exercises for Maybe
, Either
and tuples. Do we think that any of these options that @MatthijsBlom mentioned would be good candidates?
I had previously published a few solutions that 'use' Maybe, Either, and tuples.
- (PureScript) Collatz Conjecture: Maybe
- (PureScript) Pascals Triangle: Maybe
- (PureScript) Largest Series Product: Maybe
- (Haskell) Nucleotide Count: Either
- (Haskell) Perfect Numbers: Maybe
- (Haskell) Etl: (,)
It seems like we still need some exercises for
Maybe
,Either
and tuples.
Yes, we very much do.
I have been working on the concept introductions with some success, but have had trouble finding compelling & suitable exercises to go with them. #1106 contains a draft for the introduction of Maybe
. An exercise to go with it would be most welcome. My intent was to focus on using the constructors to construct Maybe
s, and using Prelude
& Data.Maybe
functions to use Maybe
s. Pattern matching hasn't been introduced yet, so we cannot use it.
I have more draft work (Either
+tuples+lists introductions + list exercise at least) locally, but it is not in a pushable state and I will not have the opportunity to clean things up until a few weeks from now.
Do we think that any of these options that @MatthijsBlom mentioned would be good candidates?
No. It is customary to use dedicated Learning Exercises for the syllabi, and these are all Practice Exercises. Learning Exercises are typically extremely simple and hyperfocused on the concept in question. We could adopt existing stories, or craft our own.
Got it, thanks for explaining. I will try to take a look at the existing stories to see if there are any we could use or take inspiration from for crafting our own if we have to.
OK after looking through the list I couldn't see anything obvious that would be a good candidate, so perhaps it's better to come up with something ourselves.
For constructing Maybe values, there are a couple of simple and canonical ideas:
safeHead
/safeLast
function which returns a Maybe
if the list is empty (but students can use if .. then .. else ..
instead of pattern matching to solve it). Though safeHead
exists as listToMaybe
in Data.Maybe
so perhaps we can nudge students to use that in a different context to get them used to using the Data.Maybe module.safeDivide
function which returns a Maybe instead of dividing by zeroFor using stuff from Data.Maybe
I don't have anything obvious in mind but I think I could come up with some ideas if I sat down and tried to hash it out:
catMaybes
in a solutionfromMaybe
and simple helpers like isJust
/isNothing
would also be goodI like the safeDivide
idea.
Lists haven't been introduced yet, so I am a bit hesitant to require using mapMaybe
/catMaybes
. However, I like these functions, and lists are introduced in a parallel concept node. I don't think using them anyway would be a significant problem.
I do not think it is healthy to encourage using isJust
/isNothing
. I fear they might stick as bad habits. Also, we should warn explicitly against fromJust
; I will probably do this already in the concept introduction.
On my wish list are maybe
, fromMaybe
, and perhaps mapMaybe
/catMaybes
. Of the latter two, probably catMaybes
is the one to go with, as using higher order functions might be needlessly difficult (for some), and also when the student later on uses catMaybes . map f
HLint will suggest using mapMaybe
.
@MatthijsBlom suggested the following concept graph, which is a good starting point for building up to a basic understanding of proper pattern matching in Haskell.
I'd like to start the discussion by looking at some sample code that demonstrates the end point, Pattern Matching Proper. It doesn't have to relate to any exercise or story, rather it's a way to make sure that everyone involved understands where we're trying to get to.
@MatthijsBlom I know you've been thinking a lot about this over the past couple of weeks. Would you be willing to write a few (maybe 10 or so?) lines of code to illustrate Pattern Matching?