exercism / sml

Exercism exercises in Standard ML.
https://exercism.org/tracks/sml
MIT License
26 stars 33 forks source link

Building a training set of tags for sml #242

Closed ErikSchierboom closed 7 months ago

ErikSchierboom commented 8 months ago

Hello lovely maintainers :wave:

We've recently added "tags" to student's solutions. These express the constructs, paradigms and techniques that a solution uses. We are going to be using these tags for lots of things including filtering, pointing a student to alternative approaches, and much more.

In order to do this, we've built out a full AST-based tagger in C#, which has allowed us to do things like detect recursion or bit shifting. We've set things up so other tracks can do the same for their languages, but its a lot of work, and we've determined that actually it may be unnecessary. Instead we think that we can use machine learning to achieve tagging with good enough results. We've fine-tuned a model that can determine the correct tags for C# from the examples with a high success rate. It's also doing reasonably well in an untrained state for other languages. We think that with only a few examples per language, we can potentially get some quite good results, and that we can then refine things further as we go.

I released a new video on the Insiders page that talks through this in more detail.

We're going to be adding a fully-fledged UI in the coming weeks that allow maintainers and mentors to tag solutions and create training sets for the neural networks, but to start with, we're hoping you would be willing to manually tag 20 solutions for this track. In this post we'll add 20 comments, each with a student's solution, and the tags our model has generated. Your mission (should you choose to accept it) is to edit the tags on each issue, removing any incorrect ones, and add any that are missing. In order to build one model that performs well across languages, it's best if you stick as closely as possible to the C# tags as you can. Those are listed here. If you want to add extra tags, that's totally fine, but please don't arbitrarily reword existing tags, even if you don't like what Erik's chosen, as it'll just make it less likely that your language gets the correct tags assigned by the neural network.


To summarise - there are two paths forward for this issue:

  1. You're up for helping: Add a comment saying you're up for helping. Update the tags some time in the next few days. Add a comment when you're done. We'll then add them to our training set and move forward.
  2. You not up for helping: No problem! Just please add a comment letting us know :)

If you tell us you're not able/wanting to help or there's no comment added, we'll automatically crowd-source this in a week or so.

Finally, if you have questions or want to discuss things, it would be best done on the forum, so the knowledge can be shared across all maintainers in all tracks.

Thanks for your help! :blue_heart:


Note: Meta discussion on the forum

ErikSchierboom commented 8 months ago

Exercise: hello-world

Code

fun hello (): string =
  "Hello, World!"

Tags:

construct:string
construct:function
paradigm:functional
ErikSchierboom commented 8 months ago

Exercise: all-your-base

Code

infix |>
fun x |> f = f x

fun digits_to_value acc _ [] = acc
  | digits_to_value acc base (digit :: digits) =
    let
      val exponent = List.length digits
      val digit_value = digit * (round (Math.pow (real base, real exponent)))
    in
      digits_to_value (acc + digit_value) base digits
    end

fun value_to_digits acc _ 0 = acc
  | value_to_digits acc base value =
    value_to_digits ((value mod base) :: acc) base (value div base)

fun rebase_impl
  (input_base: int) (input_digits: int list) (output_base: int): int list =
  input_digits
  |> digits_to_value 0 input_base
  |> value_to_digits [] output_base

fun rebase
  (input_base: int, input_digits: int list, output_base: int): int list option =
  if (null input_digits)
  orelse (List.exists (fn (n: int) => n < 0 orelse n >= input_base)
                      input_digits)
  orelse (input_base < 2)
  orelse (output_base < 2) then
    NONE
  else
    case (rebase_impl input_base input_digits output_base) of
      []            => NONE
    | output_digits => SOME output_digits

Tags:

construct:round
construct:floating-point-number
construct:lambda
construct:case
construct:function
construct:integral-number
construct:invocation
construct:list
construct:option
construct:parameter
construct:pattern-matching
construct:recursion
construct:then
construct:tuple
construct:underscore
paradigm:functional
technique:higher-order-functions
technique:looping
technique:recursion
uses:List.length
uses:List.exists
uses:let
uses:infix
ErikSchierboom commented 8 months ago

Exercise: all-your-base

Code

(* converts base ten natural number to another base
   by taking the remainders of repeated divisions *)
fun to_digits base =
  let
      fun f 0 = NONE
        | f d = SOME (d mod base, d div base)

      fun aux f d =
        case f d of
            NONE => []
          | SOME (i, j) => i :: aux f j
  in
      rev o (aux f)
  end

(* Converts a natural number in base 'base' to base ten
   by nested multiplication *)
fun from_digits base =
  let
      fun aux (d, NONE) = NONE
        | aux (d, SOME n) =
          if d >= 0 andalso d < base
          then SOME (n * base + d)
          else NONE
  in
      foldl aux (SOME 0)
  end

(* Converts a number in base ip, represented as a list
   of digits, to base op. *)
fun rebase (ip_base, digits, op_base) =
  if ip_base < 2  orelse op_base < 2
     orelse digits = []
     orelse List.all (fn n => n = 0) digits
  then NONE
  else
      case from_digits ip_base digits of
          NONE => NONE
        | SOME n => SOME (to_digits op_base n)

Tags:

construct:add
construct:case
construct:comment
construct:equals
construct:foldl
construct:function
construct:if-then-else
construct:integral-number
construct:invocation
construct:lambda
construct:list
construct:logical-and
construct:logical-or
construct:modulo
construct:multiply
construct:parameter
construct:pattern-matching
construct:recursion
construct:tuple
construct:underscore
paradigm:functional
technique:boolean-logic
technique:higher-order-functions
technique:looping
technique:recursion
uses:List.rev
uses:andalso
uses:orelse
uses:let
ErikSchierboom commented 8 months ago

Exercise: allergies

Code

datatype allergen = Eggs
                  | Peanuts
                  | Shellfish
                  | Strawberries
                  | Tomatoes
                  | Chocolate
                  | Pollen
                  | Cats;

val allergens = [Eggs,Peanuts,Shellfish,Strawberries,Tomatoes,Chocolate,Pollen,Cats]

fun aval Eggs = 1
  | aval Peanuts = 2
  | aval Shellfish = 4
  | aval Strawberries = 8
  | aval Tomatoes = 16
  | aval Chocolate = 32
  | aval Pollen = 64
  | aval Cats = 128;

(* given a code and an allergen return true or false
   whether the specified allergen is represented by the code *)
fun allergic_to (code: int) (a: allergen): bool =
  let val bitval = Word.fromInt(aval a)
  in Word.andb(Word.fromInt code, bitval) = bitval end;

(* given a code generate a list of all allergens this code represents *)
fun list (code: int): allergen list =
  List.filter (fn a => allergic_to code a) allergens;

Tags:

construct:boolean
construct:assignment
construct:comment
construct:function
construct:integral-number
construct:variable
paradigm:functional
uses:List.filter
uses:Word
uses:datatype
uses:let
ErikSchierboom commented 8 months ago

Exercise: allergies

Code

datatype allergens = Eggs | Peanuts | Shellfish | Strawberries 
                   | Tomatoes | Chocolate | Pollen | Cats;

val allergen_values = [(128, Cats), (64, Pollen), (32, Chocolate),
                       (16, Tomatoes), (8, Strawberries),
                       (4, Shellfish), (2, Peanuts), (1, Eggs)];

local
    fun allergies' score [] acc = acc
      | allergies' score ((n, allergen)::values) acc =
        if score >= n then
            allergies' (score - n) values (allergen::acc)
        else
            allergies' score values acc
in
    fun allergies score =
        if 0 < score andalso score < 256 then
            allergies' score allergen_values []
        else
            []
end

Tags:

construct:if-then-else
construct:integral-number
construct:pattern-matching
construct:subtract
construct:variable
paradigm:functional
technique:boolean-logic
technique:recursion
uses:andalso
uses:datatype
uses:local
ErikSchierboom commented 8 months ago

Exercise: flatten-array

Code

datatype 'a nestedList = Leaf of 'a | Node of 'a nestedList list

fun flatten (Leaf n) = [n]
  | flatten (Node xs) = List.concat (map flatten xs)

Tags:

paradigm:functional
uses:datatype
uses:List.concat
uses:map
ErikSchierboom commented 8 months ago

Exercise: flatten-array

Code

datatype tree = 
    Leaf of int 
    | Node of tree list

fun flatten node =
    let

        fun flatten1(node, acc) = 
            case node of
                  Leaf x => acc @ x::[]
                | Node x => flatten2(x, acc)
        and flatten2 (nodes, acc) =
            case nodes of
                [] => acc
                | x::xs => flatten2(xs, flatten1(x, acc))
    in
        flatten1(node, [])
    end

Tags:

construct:case
construct:function
construct:infix-operators
construct:integral-number
construct:list
construct:parameter
construct:pattern-matching
construct:tuple
paradigm:functional
technique:recursion
uses:List
uses:datatype
uses:let
ErikSchierboom commented 8 months ago

Exercise: hamming

Code

exception NonEqualLengthStringsFound

fun distance (strand1: string, strand2: string): int =
let
  val pairs = ListPair.zipEq (explode strand1, explode strand2)
  fun count (f: 'a -> bool): 'a list -> int = length o (List.filter f)
in count (op <>) pairs end
handle UnequalLengths => raise NonEqualLengthStringsFound

Tags:

construct:curried-function
construct:exception
construct:invocation
construct:list
construct:local-function
construct:parameter
construct:raise
construct:string
construct:variable
paradigm:functional
technique:exceptions
technique:higher-order-functions
uses:List.filter
uses:List.length
uses:ListPair.zipEq
uses:handle
uses:let
ErikSchierboom commented 8 months ago

Exercise: raindrops

Code

fun convert n =
    if n mod 3 <> 0 andalso n mod 5 <> 0 andalso n mod 7 <> 0 then
       Int.toString n
    else
       (if n mod 3 = 0 then "Pling" else "") ^
       (if n mod 5 = 0 then "Plang" else "") ^
       (if n mod 7 = 0 then "Plong" else "")

Tags:

construct:if-then-else
construct:integral-number
construct:modulo
construct:string
paradigm:functional
technique:boolean-logic
uses:String.concat
uses:andalso
ErikSchierboom commented 8 months ago

Exercise: difference-of-squares

Code

(* The sum of the integers 1 to n is n(n+1)/2,
   so the square of the sum (1 + ... + n)**2 is n**2(n+1)**2/4.
   The sum of the squares of the integers 1 to n is n(n+1)(2n+1)/6
   (On-line Encyclopaedia of Integer Sequences A000330
   Based on the README.md file, I was going to compute the difference
   without computing either sum, as the README only asks for the
   difference.  But the test code wants the sums separately as well
   as their difference, so there is no point in being clever.
   The vagueness of the READMEs is vexing.
*)

local
    open IntInf
 in 
    fun squareOfSum n =
        let val n' = fromInt n
            val s  = (n' * (n' + 1)) div 2
         in s * s
        end

    fun sumOfSquares n =
        let val n' = fromInt n
         in (n' * (n' + 1) * (2*n' + 1)) div 6
        end

    fun differenceOfSquares n = squareOfSum n - sumOfSquares n
end

Tags:

construct:add
construct:comment
construct:function
construct:subtract
construct:variable
construct:visibility-modifiers
paradigm:functional
technique:higher-order-functions
uses:IntInf
uses:div
uses:let
uses:local
uses:open
ErikSchierboom commented 8 months ago

Exercise: collatz-conjecture

Code

(* Once again, README lets us down. *)

fun collatz n =
    let fun loop n k = if n = 1 then k else
                       loop (if n mod 2 = 0 then n div 2 else n*3+1) (k+1)
     in if n <= 0 then NONE else SOME (loop n 0)
    end

Tags:

construct:comment
construct:function
construct:if-then-else
construct:integral-number
construct:mul
construct:optional-type
construct:parameter
construct:recursion
paradigm:functional
technique:recursion
uses:let
ErikSchierboom commented 8 months ago

Exercise: collatz-conjecture

Code

fun collatz (n: int): int option =
  let fun collatzRec n count =
    if n <= 0 then NONE
    else if n = 1 then (SOME count)
    else if n mod 2 = 1 then collatzRec (3 * n + 1) (count + 1)
    else collatzRec (n div 2) (count + 1)
  in
    collatzRec n 0
  end

Tags:

construct:add
construct:if-then-else
construct:integral-number
construct:invocation
construct:local-function
construct:modulo
construct:multiply
construct:parameter
technique:recursion
paradigm:functional
uses:div
uses:let
ErikSchierboom commented 8 months ago

Exercise: sum-of-multiples

Code

fun sum (factors: int list, limit: int): int =
  let fun countMultiples curNum = 
    if curNum >= limit then 0 else
    if List.exists (fn x => curNum mod x = 0) factors then curNum + countMultiples (curNum + 1) else
    countMultiples (curNum + 1)
  in
    countMultiples 1
  end

Tags:

construct:add
construct:function
construct:if-then-else
construct:integral-number
construct:invocation
construct:lambda
construct:list
construct:local-function
construct:parameter
paradigm:functional
technique:recursion
uses:List.exists
uses:let
ErikSchierboom commented 8 months ago

Exercise: phone-number

Code

fun clean (text: string): string option =
    let fun c _ 12 _ = NONE (* stop early if it's obviously invalid *)
          | c (h::t) n a =
            if Char.isDigit h
            then c t (n+1) (h::a)
            else c t n a
          | c [] n a = SOME (List.rev a, n)
        fun v (#"1"::_) = NONE
          | v (#"0"::_) = NONE
          | v (_::_::_::(#"1"::_)) = NONE
          | v (_::_::_::(#"0"::_)) = NONE
          | v s = SOME (String.implode s)
    in case c (String.explode text) 0 [] of
         SOME (#"1"::s, 11) => v s
       | SOME (s, 10) => v s
       | _ => NONE
    end

Tags:

construct:case
construct:char
construct:comment
construct:function
construct:if-then-else
construct:integral-number
construct:invocation
construct:parameter
construct:pattern-matching
construct:string
construct:tuple
construct:underscore
paradigm:functional
technique:recursion
uses:Char.isDigit
uses:String.explode
uses:String.implode
uses:let
ErikSchierboom commented 8 months ago

Exercise: roman-numerals

Code

fun romanDigit 9 x v u = [u, x]
  | romanDigit 8 _ v u = [v, u, u, u]
  | romanDigit 7 _ v u = [v, u, u]
  | romanDigit 6 _ v u = [v, u]
  | romanDigit 5 _ v _ = [v]
  | romanDigit 4 _ v u = [u, v]
  | romanDigit 3 _ _ u = [u, u, u]
  | romanDigit 2 _ _ u = [u, u]
  | romanDigit 1 _ _ u = [u]
  | romanDigit 0 _ _ _ = []
  | romanDigit _ #"?" _ _ = raise Fail "Invalid value (3)"
  | romanDigit _ _ #"?" _ = raise Fail "Invalid value (2)"
  | romanDigit _ _ _ #"?" = raise Fail "Invalid value (1)"
  | romanDigit _ _    _ _ = raise Fail "Invalid value"

fun digit number position = (number div position) mod 10

fun roman (number: int): string =
  concat (
    map implode [
      (romanDigit (digit number 1000) #"?" #"?" #"M"),
      (romanDigit (digit number  100) #"M" #"D" #"C"),
      (romanDigit (digit number   10) #"C" #"L" #"X"),
      (romanDigit (digit number    1) #"X" #"V" #"I")
    ]
  )

Tags:

construct:char
construct:function
construct:integral-number
construct:invocation
construct:list
construct:modulus
construct:parameter
construct:pattern-matching
construct:string
construct:underscore
paradigm:functional
technique:exceptions
uses:List
uses:map
uses:implode
ErikSchierboom commented 8 months ago

Exercise: pig-latin

Code

local
   open Char

   fun isConsonant c = c <> #"a" andalso c <> #"e" andalso
                       c <> #"i" andalso c <> #"o" andalso c <> #"u"

   fun split (#"y"::x::xs)     ys = if isConsonant x then (#"y"::x::xs, ys)
                                    else split (x::xs) (#"y"::ys)
     | split (#"y"::[])        ys = (#"y"::[], ys)
     | split (#"x"::x::xs )    ys = if isConsonant x then (#"x"::x::xs, ys)
                                    else split (x::xs) (#"x"::ys)
     | split (#"q":: #"u"::xs) ys = (xs, #"u":: #"q"::ys)
     | split (x::xs)           ys = if isConsonant x then split xs (x::ys)
                                    else (x::xs, ys)
     | split []                ys =      ([],    ys)

   fun words string =
       let fun loop (c::cs) w f = if isAlpha c then
                                     loop cs (toLower c :: w) f
                                  else
                                     loop cs [] (if null w then f
                                                 else rev w :: f)
             | loop [] w f = rev (if null w then f else rev w :: f)
        in loop (explode string) [] []       
       end

   val unwords = String.concatWith " " o map implode

   fun pig w =
       case split w [] of (back, front) => back @ rev front @ [#"a",#"y"]

 in val translate = unwords o map pig o words
end

Tags:

construct:case
construct:char
construct:function
construct:if-then-else
construct:invocation
construct:pattern-matching
construct:string
construct:variable
construct:visibility-modifiers
technique:recursion
paradigm:functional
paradigm:higher-order-functions
uses:Char
uses:andalso
uses:open
uses:local
uses:let
uses:String.concatWith
uses:map
uses:implode
uses:andalso
ErikSchierboom commented 8 months ago

Exercise: matching-brackets

Code

val isDelim = Char.contains "()[]{}"
val filterDelim = List.filter isDelim

val isOpenDelim = Char.contains "([{"
fun closingOf #"(" = #")"
  | closingOf #"[" = #"]"
  | closingOf #"{" = #"}"
  | closingOf delim =
      raise Fail ("Unknown delimiter '" ^ Char.toString delim ^ "'.")

fun inner [] delims = List.null delims
  | inner (c::str) [] = inner str [c]
  | inner (c::str) (d::ds) =
    if isOpenDelim(d) andalso c = closingOf(d) then inner str ds
    else inner str (c::d::ds)

fun isBalanced str = inner (filterDelim (String.explode str)) []

Tags:

construct:char
construct:function
construct:if-then-else
construct:invocation
construct:list
construct:pattern-matching
construct:raise
construct:string
construct:variable
paradigm:functional
technique:boolean-logic
technique:exceptions
technique:higher-order-functions
uses:Char
uses:List
uses:andalso
ErikSchierboom commented 8 months ago

Exercise: matching-brackets

Code

(* conditional string-based fold function *)
datatype ('a, 'b) continue_or_stop = Continue of 'a | Stop of 'b

fun fold_until e0 f_finish f_acc s =
  let val len = String.size s
      fun go i e =
        if i < len
        then let val c = String.sub (s, i)
             in case f_acc (c, e) of
                     Continue e' => go (i+1) e'
                   | Stop res => res
             end
        else f_finish e
  in go 0 e0 end

val isBalanced =
  fold_until [] null
    (fn (c, stack) => case (stack, c) of
        (_, #"[") => Continue (c::stack)
      | (_, #"{") => Continue (c::stack)
      | (_, #"(") => Continue (c::stack)

      | (#"["::stack, #"]") => Continue stack
      | (#"{"::stack, #"}") => Continue stack
      | (#"("::stack, #")") => Continue stack

      | (_, #"]") => Stop false
      | (_, #"}") => Stop false
      | (_, #")") => Stop false
      | _         => Continue stack)

Tags:

construct:add
construct:case
construct:char
construct:function
construct:if
construct:invocation
construct:integral-number
construct:lambda
construct:list
construct:parameter
construct:pattern-matching
construct:string
construct:tuple
construct:underscore
construct:variable
paradigm:functional
technique:recursion
uses:List
uses:datatype
uses:let
uses:String.sub
ErikSchierboom commented 8 months ago

Exercise: matching-brackets

Code

fun isBalanced s =
    let fun match c (x::_) = c = (x : char)
          | match _ []     = false
        fun check (#"(" :: cs) bs = check cs (#"(" :: bs)
          | check (#"[" :: cs) bs = check cs (#"[" :: bs)
          | check (#"{" :: cs) bs = check cs (#"{" :: bs)
          | check (#"}" :: cs) bs = match #"{" bs andalso check cs (tl bs)
          | check (#"]" :: cs) bs = match #"[" bs andalso check cs (tl bs)
          | check (#")" :: cs) bs = match #"(" bs andalso check cs (tl bs)
          | check (_    :: cs) bs = check cs bs
          | check []           bs = null bs
     in check (explode s) []
    end

Tags:

construct:char
construct:char-literal
construct:function
construct:invocation
construct:list
construct:parameter
construct:pattern-matching
construct:string
construct:underscore
paradigm:functional
technique:boolean-logic
technique:recursion
uses:andalso
uses:let
rainij commented 8 months ago

@ErikSchierboom I try to help but I would mainly focus on removing completely incorrect tags. I would edit your comments under the "Tags:" to do so. I am busy right now with other stuff so if I haven't done anything until friday/saturday midnight you probably have to ask somebody else.

If somebody else feels like doing this: please drop a comment so that I know that. I wouldn't be sad if I had not to do this ;).

rainij commented 8 months ago

@ErikSchierboom I edited all the 20 comments. I am probably not very good at this but I think the tags should be better than before (though far from perfect I guess).

I didn't check if the solutions actually work (although I noticed that one "solution" was just the template, search for NOTE: in this page).

I mainly removed wrong tags, or moved them to a different category (like construct:andalso -> uses:andalso). In a few cases I actually added some tags but really rarely. Sometimes I checked if the tags (other then uses:...) actually exist, but not always.

ErikSchierboom commented 7 months ago

This is an automated comment

Hello :wave: Next week we're going to start using the tagging work people are doing on these. If you've already completed the work, thank you! If you've not, but intend to this week, that's great! If you're not going to get round to doing it, and you've not yet posted a comment letting us know, could you please do so, so that we can find other people to do it. Thanks!

ErikSchierboom commented 7 months ago

Thanks for the help! We've updated the tags.