Ambiguity in Fisher-Yates pseudocode in lab w07_shuffle

snctfd commented 3 years ago

The pseudocode for the Fisher-Yates shuffle algorithm as written in the compendium specifies the following: r <- slumptal mellan 0 och i. This has led to several students being confused about why their implementations don't give an even shuffle, because they interpreted the range as 0 <= r < i. I would argue that r <- slumptal så att 0 <= r <= i would be a more clear and precise formulation of the sentence.

There is of course some learning value in debugging why their algorithm doesn't work properly and checking their assumptions, but I would argue that ambiguously defined pseudocode provides more frustration than it does opportunity for learning.

fritjof-b commented 3 years ago

I agree to some extent. However, like mentioned on Discord it's a nice lesson to "think" about what you're doing and why something is failing. We could explicitly point towards the Wikipedia page and say something like "Read section x before you continue".

There is a risk in providing steps that are too explicit in that students just "do it". Like with a lot of the exercises, as discussed during this week's meeting, "if the output is the same, I'm finished", without further contemplation of the problem at hand.

This is true especially for algorithms and data structures (and a large part of computer science). One needs to actually think what is happening, and not just produce solutions aimlessly without knowing what is going

These shortcomings in terms of knowledge and understanding especially showed and shows during last week's lab blockbattle and this week's shuffle. Some students still have a hard time grasping concepts such as swapping two variables, i.e.:

var x = 3
var y = 5
var tmp = x
x =  y
y = tmp

tl;dr: I agree that assignments and exercises should be clear. But by being too clear they might hinder learning and further contemplation by the student.

bjornregnell commented 3 years ago

How about somehow hinting in the instructions that the pseuod-code is ambiguous and asking them to check the wikipedeia page: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle Or is it too much to read and thereby risking to burn too much of students' grit?

fritjof-b commented 3 years ago

We could point to https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm

Which is not that much too read but should enlighten them about any uncertainties.

bjornregnell commented 3 years ago

Yes, that's an option @fritjof-b @snctfd What do you think about linking to the relevant section on wikipedia and say that they need to think carefully about the interval or something?

madeleineberild commented 3 years ago

I think this is a good solution, it solves both the problem and encourages thinking. Learning that sometimes you need to look elsewhere for information is also a good addition.

snctfd commented 3 years ago

I agree to some extent. However, like mentioned on Discord it's a nice lesson to "think" about what you're doing and why something is failing. We could explicitly point towards the Wikipedia page and say something like "Read section x before you continue".

There is a risk in providing steps that are too explicit in that students just "do it". Like with a lot of the exercises, as discussed during this week's meeting, "if the output is the same, I'm finished", without further contemplation of the problem at hand.

This is true especially for algorithms and data structures (and a large part of computer science). One needs to actually think what is happening, and not just produce solutions aimlessly without knowing what is going wrong

I do understand this sentiment and appreciate the value of being able to mentally evaluate your algorithm and look for errors. That said, I feel like reasoning your way into why the range 0 <= r < i is wrong seems quite difficult, and I believe there's a risk of it becoming a game of "spot the obscure bug" which is just frustrating. I mean, as a lab instructor I can't give a simple motivation off the top off my head for why you need to include i in the range - so it seems unreasonable to expect the students to figure that out by themselves.

To simplify my argument, I don't feel like there's enough of an "aha moment" when you solve this issue to justify making the lab instructions intentionally vague; the average student will just go "oh, my range was wrong" without actually gaining any level of deeper understanding of the algorithm itself.

@snctfd What do you think about linking to the relevant section on wikipedia and say that they need to think carefully about the interval or something?

Highlighting that the range is something to be careful about could help students avoid this, but we already link to the Wikipedia page which does specify 0 <= i <= j and I see no good reason why leaving the compendium in its ambiguous state would be more educational, other than possibly being a lesson about challenging your assumptions and looking for more information when implementing something that could be interpreted in one of several ways.

snctfd commented 3 years ago

Another possible alternative would be to explain in the compendium why the range is what it is, but that may just be glossed over by students or just feel generally out of place.

bjornregnell commented 3 years ago

Point taken @snctfd Yes it is by no means obvious why to include the i in the range instead of going until but not including i, you'd have to draw pictures of examples and think hard and that's why the algorithm is on wikipedia in the first place and named after algorithm-gurus :) Also there is the problem of testability, in that you need to test the statistical properties of your output to discover bugs of this kind...

So two options: 1) Just fix the ambiguity and write <= explicitly 2) Keep the ambiguity and link to wikipedia

I'm leaning towards 1, but let's wait some days and see if more people chip in...

bjornregnell commented 3 years ago

I'm sending a tip in our discord with a hint on this for this year's students who has the algorithm on print...

lunduniversity / introprog

Ambiguity in Fisher-Yates pseudocode in lab w07_shuffle #609