Open lihaoyi opened 4 years ago
The merge sort implementation is not tail recursive, correct? if yes, Is it okay to implement this way?
@gfn9cho it's not, since it has to recurse twice and perform some computations after the recursion, while tail-recursion requires recursing only once and returning the result of recursion immediately with no further computation.
In general, most recursive things are not tail recursive, and that's OK. If you start hitting stack size limits due to deep recursion, you can pass in -Xss
to the JVM to give it a bigger stack, or refactor a recursive algorithm to an imperative algorithm with a loop and ArrayDeque
.
@lihaoyi I'm using the code provided in solution 6.8 - DepthSearchPaths. Using the same directed graph, just different representations (changed order in Seq), I'm getting two different results for the shortest path from node a to node e.
Given these two representations of the same graph, I have different results.
start = "a"
dest = "e"
graph = Map(
"a" -> Seq("d", "c"),
"b" -> Seq("d"),
"d" -> Seq("f"),
"f" -> Seq("e"),
"c"->Seq("e"), "e"->Seq())
)
start = "a"
dest = "e"
graph = Map(
"a" -> Seq("c", "d"), #changed order
"b" -> Seq("d"),
"d" -> Seq("f"),
"f" -> Seq("e"),
"c"->Seq("e"), "e"->Seq())
)
Convert to no vars, recursion only. Good for learning to use options. Can be used to make the immutable Trie solution completely immutable (no vars anywhere), even outside the Immutable Trie Node constructor.
def contains(s: String): Boolean = {
def loop(xc: List[Char], optnd: Option[Node]): Boolean =
if (optnd.nonEmpty) {
xc match {
case Nil => if (optnd.exists(_.hasValue)) true else false
case head :: tail => loop(tail, optnd.get.children.get(head))
}
} else false
loop(s.toList, Option(root))
}
def prefixesMatchingString(s: String): Set[String] = {
val prefixSet = Set.newBuilder[String]
def loop(xs: List[Char], r: String, optnd: Option[Node]): Unit =
if (optnd.nonEmpty) {
if (optnd.get.hasValue) prefixSet += r
if (xs.nonEmpty) loop(xs.tail, r + xs.head, optnd.get.children.get(xs.head))
}
loop(s.toList, "", Option(root))
prefixSet.result
}
def stringsMatchingPrefix(s: String): Set[String] = {
val stringSet = Set.newBuilder[String]
def loop1(xs: List[Char], optnd: Option[Node]): Option[Node] = optnd match {
case None => None
case Some(nd) => if (xs.isEmpty) Some(nd) else loop1(xs.tail, nd.children.get(xs.head))
}
def loop2(nd: Node, r: String): Unit = {
if (nd.hasValue) stringSet += r
for ((c, n) <- nd.children) loop2(n, r + c)
}
loop1(s.toList, Option(root)) match {
case None => Set()
case Some(nd) =>
loop2(nd, s)
stringSet.result
}
}
Note: The BinarySearch tests involving Vector("i", "am", "cow", "hear", "me", "moo") are invalid as the Seq is not sorted Below is a minor variation that returns the index of the first matching element
// Implement a simple binary search algorithm in Scala
// To find an element within a sorted Array in O(log n) time
// Assumption: IndexedSeq[T] is sorted in Ascending order
// $ amm BindexOf.sc
def bindexOf[T: Ordering](sorted: IndexedSeq[T], target: T): Int = {
def loop(left: Int, center: Int, right: Int): Int = {
if (left > right) -1 // target not found or IndexedSeq not sorted
else
Ordering[T].compare(target, sorted(center)) match {
case 0 => center
case r if r > 0 => loop(center + 1, (center + 1 + right) / 2, right)
case _ => loop(left, (left + center - 1) / 2, center - 1)
}
}
loop(0, (sorted.length - 1) / 2, sorted.length - 1)
}
assert(pprint.log(bindexOf(Array[Int](), 3)) == -1)
assert(pprint.log(bindexOf(Array(1, 3, 7, 9, 13), 1)) == 0)
assert(pprint.log(bindexOf(Array(1, 3, 7, 9, 13), 9)) == 3)
assert(pprint.log(bindexOf(Array(1, 3, 7, 9, 13), 7)) == 2)
assert(pprint.log(bindexOf(Array(1, 3, 7, 9, 13), 13)) == 4)
assert(pprint.log(bindexOf(Array(1, 3, 7, 9, 13), 8)) == -1)
assert(pprint.log(bindexOf(Array(1, 3, 7, 9, 13), 2)) == -1)
assert(pprint.log(bindexOf(Array(1, 3, 7, 9, 13), 100)) == -1)
assert(pprint.log(bindexOf(Vector("am", "cow", "hear", "me", "moo"), "am")) == 0)
assert(pprint.log(bindexOf(Vector("am", "cow", "hear", "me", "moo"), "cow")) == 1)
assert(pprint.log(bindexOf(Vector("am", "cow", "hear", "me", "moo"), "hear")) == 2)
assert(pprint.log(bindexOf(Vector("am", "cow", "hear", "me", "moo"), "moo")) == 4)
assert(pprint.log(bindexOf(Vector("am", "cow", "hear", "me", "moo"), "horse")) == -1)
// Is an IndexedSeq really Sorted or not?
// This is O(n) but in some cases a binary approach may exit quicker
// The tryBreakable control structure is used to quit computations once a comparison is out of order
// Question: How can the 2 loops in the last case be parallelized safely?
// See Stackoverflow link below for imperative, non-generic approaches
// https://stackoverflow.com/questions/11989071/fastest-way-to-check-if-an-array-is-sorted
def isSeqSorted[T: Ordering](xs: IndexedSeq[T]): Boolean = {
import scala.util.control.Breaks._
tryBreakable {
def loop(a: Int, b: Int): Boolean = {
(Ordering[T].compare(xs(a), xs(b)) <= 0) match {
case true if (b - a) <= 1 => true
case false => break
case true => loop(a, (a + b) / 2) && loop((a + b) / 2, b)
}
}
if (xs.length <= 1) true else loop(0, xs.length - 1)
} catchBreak false
}
assert(pprint.log(isSeqSorted(Vector[Int]())) == true)
assert(pprint.log(isSeqSorted(Vector(1))) == true)
assert(pprint.log(isSeqSorted(Vector(1, 2))) == true)
assert(pprint.log(isSeqSorted(Vector(1, 3, 2))) == false)
assert(pprint.log(isSeqSorted(Vector(2, 1, 2, 3, 4))) == false)
val testDVec = {
(1 to 50).map(_.toDouble).toVector ++
Vector(49.9999) ++ // out of order element in the middle
(51 to 100).map(_.toDouble).toVector
}
assert(pprint.log(isSeqSorted(testDVec)) == false)
assert(pprint.log(isSeqSorted(testDVec.filter(_ != 49.9999))) == true)
In 6.12.scala, for (c <- s if current.nonEmpty) current = current.get.children.get(c)
Does this iterate through the whole string s
, or does the compiler optimize the "extra" cycles away by seeing that current
cannot change once it becomes empty? I guess my question is if this is what we are relying on by implementing it this way?
In stringsMatchingPrefix
it wouldn't be better in terms of performance to use collection.mutable.ListBuffer[Char]
as the type for the parameter path
in the recurse
function?
In that way, we can append an element in constant time avoiding reverse path
before adding it to output
(reverse
complexity for lists is O(n)
).
It is implemented in that way for avoiding the use of mutable collections?
@ilitosh it always iterates through the whole string, though iteration through the subsequent portion would be faster since it doesn't need to do anything each iteration. This could be made more efficient by using while-loop or tail-recursive function with an early exit
@SalviCF I believe ListBuffer
cannot be effectively shared between multiple lists, as we are doing here. We could also use a mutable.ArrayDeque
and make sure we append the character before each recursive step and remove the last character after, which would avoid reversing the list but still necessitate copying. In terms of performance, I haven't measured it but I'm guessing all three solutions (current, listbuffer, arraydeque) wouldn't be too far from each other.
I am looking at the Merge sort example in the chapter 6, specifically generic version of it (6.9.scala
). Since here we are using context bound, how can we use Ordering[T]
directly without using implicitly
keyword?
@milenkara Interestingly, both implicitly(Ordering[T]).lt(...)
and Ordering[T].lt(...)
work.
@lihaoyi in exercise 6.8 - DepthSearchPaths, the shortestPath
method does not find the shortest path.
Reproduction steps:
$ amm --predef Search.sc
Loading...
Welcome to the Ammonite Repl 2.2.0 (Scala 2.13.3 Java 11.0.4)
@ shortestPath(
start = "a",
dest = "e",
graph = Map(
"a" -> Seq("d", "b"),
"b" -> Seq("c"),
"c" -> Seq("e"),
"d" -> Seq("e"),
"e" -> Seq()
)
)
res0: Seq[String] = List("a", "b", "c", "e")
But the expected result would be:
List("a", "d", "e")
I think this unexpected behavior is due to: https://github.com/handsonscala/handsonscala/blob/ebc0367144513fc181281a024f8071a6153be424/examples/6.8%20-%20DepthSearchPaths/Search.sc#L9
Next change should fix it:
- if !seen.contains(next) && !pathLengths.get(next).exists(_ <= pathLength + 1)
+ if !seen.contains(next) || !pathLengths.get(next).exists(_ <= pathLength + 1)
I am looking at the Merge sort example in the chapter 6, specifically generic version of it (
6.9.scala
). Since here we are using context bound, how can we useOrdering[T]
directly without usingimplicitly
keyword?
Here is a lecture that helped me understand type classes better https://www.youtube.com/watch?v=1e9tcymPl7w. It also explains in which situations implicitly
is not needed.
@milenkara Thanks for the link to the presentation. Now I understand it's because of this apply
method in the Ordering
companion object that allows implicitly
to be omitted.
I have implemented the FloodFill on my laptop, but I needed a higher limit than 25, for the color similarity. What worked for me was 100. Any guesses why? I did it on a mac, scala 2.13.3, java 1.8.0_251.
I am trying to test TrieMap.add(), I understand the implementation is to preserve the values in children(collection.mutable.Map[Char, Node]) . But how do I test the values are actually preserved when we call TrieMap.add() for the second iteration?
Tried something like below but it doesn't help with my test, any help here, please
val t = new Trie() t.add("mango") println(t.root.children.get('m')) println(t.root.children.get('g')) println(t.root.hasValue) t.add("mandarin") t.root.children.foreach(x => println (x._1 + "-->" + x._2))
@marin123 haha not sure. This kind of heuristic-based image processing algorithm is always pretty finnicky, but if you can tweak the number to make it look more-or-less right then you're probably fine.
I am working on exercise 6.6 and my implementation is like below
def search[T: Ordering](value: T, list: Seq[T]): Boolean = {
if (list.length == 1) value == list(0)
else {
val midPoint = list.length / 2
val mVal = list(midPoint)
if (mVal == value) true
else if (Ordering[T].lt(mVal, value)) search(value, list.drop(midPoint))
else search(value, list.take(midPoint))
}
}
However, the solution uses a nested function (https://github.com/handsonscala/handsonscala/blob/v1/examples/6.6%20-%20BinarySearch/BinarySearch.sc)
def binarySearch[T: Ordering](sorted: IndexedSeq[T], target: T): Boolean = {
def binarySearch0(start: Int, end: Int): Boolean = {
// If the sequence you are trying to search has no items, the item cannot be found
if (start == end) false
else {
val middle = (start + end) / 2
// Otherwise, take the item at the middle of the sequence
val middleItem = sorted(middle)
// and compare it to the item you are looking for
val comparison = Ordering[T].compare(target, middleItem)
// If the item is what you are looking for, you have found it
if (comparison == 0) true
// Otherwise, if the item is greater than the one you are looking for,
// binary search the left half of the sequence
else if (comparison < 0) binarySearch0(start, middle)
// If it is less than the item you are looking for, binary search the right half
else binarySearch0(middle + 1, end)
}
}
binarySearch0(0, sorted.length)
}
I have seen the nested function in a few examples. I wondered whether it is a scala idiomatic way to use a nested function like the solution
I finished exercise 6.9 and my implementation is like below
def solver(grid: Array[Array[Int]]): Unit = {
def nextEmpty(grid: Array[Array[Int]]): (Int, Int) = {
for (row <- Range.inclusive(0, 8)) {
for (col <- Range.inclusive(0, 8)) {
if (grid(row)(col) == 0) return (row, col)
}
}
(-1, -1)
}
def solver0(grid: Array[Array[Int]]): Boolean = {
val (row, col) = nextEmpty(grid)
if (row == -1) return true
for (n <- Range.inclusive(1, 9)) {
grid(row)(col) = n
if (isValidSudoku(grid) && solver0(grid)) return true
else grid(row)(col) = 0 // inside the for loop
}
return false
}
if (solver0(grid)) pprint(grid)
else pprint(Array(Array(0)))
}
The implement is similar to the solution on https://github.com/handsonscala/handsonscala/blob/v1/examples/6.9%20-%20Sudoku/Sudoku.sc except I put backtrack inside the for loop
for (n <- Range.inclusive(1, 9)) {
grid(row)(col) = n
if (isValidSudoku(grid) && solver0(grid)) return true
else grid(row)(col) = 0
}
While the solution put it outside the for loop
for (value <- Range.inclusive(1, 9)) {
cells(i)(j) = value
if (isValidSudoku(cells) && solve(i + 1, j, cells)) return true
}
cells(i)(j) = 0 // reset on backtrack
Why does the solution put the backtrack outside the for loop?
@cdcd88 both ways work. In general, for any recursive function, you have a choice of putting your "base case"/"backtracking"/"termination" either just before calling the function again ("inside the loop") or just after ("outside the loop").
Which one you choose is generally just a matter of preference, though you should try to be consistent so you don't confuse yourself.
@cdcd88 nesting functions is fine, they're basically functions which are private to other functions, rather than private to classes or packages. Like any other nesting, just try to avoid nesting too deeply. If you end up needing to call the nested function elsewhere, you can just move it out so others can call it
@SalviCF
Using the patch in https://github.com/handsonscala/handsonscala/issues/6#issuecomment-674555756, I get
List("a", "c", "e", "d")
for this graph
assert(
pprint.log(
shortestPath("a", "d",
graph = Map(
"a" -> Seq("b", "c"),
"b" -> Seq("c", "d"),
"c" -> Seq("e"),
"e" -> Seq("d"),
"d" -> Seq()))
) == List("a", "b", "d"))
This works on the above graph, yours and the two in the original test:
- if !seen.contains(next) && !pathLengths.get(next).exists(_ <= pathLength + 1)
+ if !seen.contains(next) || pathLengths(next) > (pathLength + 1)
Looking at the DFS solution provided I think it is actually BFS. By using the append it puts the new nodes at the end and they get popped after everything in the queue already. I think changing it to prepend will correct this.
Thread to discuss anything and everything about Chapter 6. Important topics will be folded into the issue description.