Exercise difficulty distribution

coriolinus commented 6 years ago

$ jq '.exercises[] | if (.deprecated | not) then .difficulty else empty end' config.json | sort -n | uniq -c
     18 1
     47 4
      6 7
      8 10
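
For anyone without jq, the same tally can be sketched in Python. This is only a rough equivalent of the pipeline above: it assumes config.json has a top-level "exercises" list whose entries carry "difficulty" and an optional "deprecated" flag, as the jq filter implies. The sample data is made up so the snippet runs on its own.

```python
from collections import Counter


def difficulty_histogram(config):
    """Count active (non-deprecated) exercises per difficulty level."""
    return Counter(
        e["difficulty"]
        for e in config["exercises"]
        # Like jq's `.deprecated | not`, a missing or null flag counts as active.
        if not e.get("deprecated")
    )


# Tiny made-up config for illustration; the real input is the track's config.json.
sample = {
    "exercises": [
        {"slug": "hello-world", "difficulty": 1},
        {"slug": "clock", "difficulty": 4},
        {"slug": "old-exercise", "difficulty": 4, "deprecated": True},
    ]
}

hist = difficulty_histogram(sample)
for difficulty in sorted(hist):
    # Same "count difficulty" layout that `uniq -c` prints.
    print(f"{hist[difficulty]:7d} {difficulty}")
```

To run it against the real file, load it with json.load and pass the result to difficulty_histogram.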

The problem is that, in my subjective opinion, we overload students with easy and/or trivial exercises before allowing them to proceed into more interesting problems. The currently linear nature of exercism doesn't help with that. I'm aware that future versions are intended to help solve this problem, but I am not aware of any planned date on which those versions are intended to be released, so I am disinclined to simply wait.

Question 0: Is this an actual problem?

I believe it to be one, but it's well worth getting other input on this. If people find that the exercise difficulty scales well with their increasing mastery of the language, then no further action is required for this issue.

All further questions assume that this is answered in the affirmative.

Question 1: Should we cull some difficulty 1 and/or some difficulty 4 exercises?

This is the simplest way to improve the difficulty balance: we identify some subset of difficulty 1 and 4 exercises which are most pedagogically useful and most interesting, and drop the rest. We don't have to delete any code; simply adding a deprecated tag in config.json should be sufficient to exclude it from the track, so we can easily reintegrate it as an optional exercise once that's a thing.
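
A minimal sketch of what such a cull could look like mechanically, assuming the "deprecated" flag is a boolean on each entry of the "exercises" list in config.json. The function name and the slug chosen for deprecation are arbitrary illustrations, not a proposal.

```python
import copy


def deprecate(config, slugs):
    """Return a copy of config with the named exercises flagged as deprecated."""
    updated = copy.deepcopy(config)  # leave the caller's data untouched
    for exercise in updated["exercises"]:
        if exercise["slug"] in slugs:
            exercise["deprecated"] = True
    return updated


# Made-up two-exercise config; the slug to cull is picked arbitrarily.
sample = {
    "exercises": [
        {"slug": "leap", "difficulty": 1},
        {"slug": "grains", "difficulty": 1},
    ]
}

culled = deprecate(sample, {"grains"})
active = [e["slug"] for e in culled["exercises"] if not e.get("deprecated")]
print(active)
```

Because nothing is deleted, reversing the decision later is just a matter of removing the flag again.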

If we answer this question in the affirmative, there are a few sub-questions which should be talked over:

Question 1.1: How many level 1 exercises are desired?
Question 1.2: How precisely should we rank the level 1 exercises?
Question 1.3: How many level 4 exercises are desired?
Question 1.4: Do we desire a separate ranking scheme for level 4 exercises? If yes, what should it be?

Question 2: Should we impose barriers to adding new difficulty 1 or difficulty 4 exercises?

This could be a flat ban on future easy exercises, or a requirement that each proposed addition include the removal of another of the same difficulty, or something else.

This addresses the problem at its source: students get excited about Rust, enjoy exercism, and want to contribute an exercise. They pick something easy, because they are not confident in their own ability to contribute a more difficult exercise.

This is potentially a bad idea for the same reason: if students are discouraged from submitting easy exercises, are we sure they'd come back later to submit difficult ones?

If they don't, is that even an issue? Rust currently has 79 active exercises; there are 120 in the problem specifications repo. Problem specifications carry no difficulty data, so it's hard to estimate the difficulty distribution of the unimplemented exercises. If the current Rust track is a fair sample of the specified problems, then the distribution of the unimplemented exercises should be pretty close to ours; in that case, it too is skewed in favor of easier problems.
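
The fair-sample argument can be made concrete with a little arithmetic. The sketch below assumes that all 79 active Rust exercises come from the 120 specified problems and that the remaining ones follow the same difficulty proportions; both are assumptions, not measurements.

```python
# Active Rust exercises per difficulty, from the jq output at the top of this issue.
active = {1: 18, 4: 47, 7: 6, 10: 8}
total_specified = 120  # exercises in the problem specifications repo

total_active = sum(active.values())
unimplemented = total_specified - total_active

# Scale the current proportions onto the unimplemented exercises.
projected = {d: unimplemented * n / total_active for d, n in active.items()}

for d, n in projected.items():
    print(f"difficulty {d:2d}: about {n:.1f} unimplemented exercises")
```

Under those assumptions, only about 7 of the 41 unimplemented exercises would land at difficulty 7 or 10.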

Question 3: Should we offer incentives to add new difficulty 7 or difficulty 10 exercises?

If we do, what exactly could we offer more than a note in the README stating that we are particularly seeking those exercises?

petertseng commented 6 years ago

Question 0: Is this [uneven difficulty distribution] an actual problem? Question 1: Should we cull some difficulty 1 and/or some difficulty 4 exercises?

Background: my mode of interacting with a given track as a student is to list the exercises in a track using exercism ls. This list does not include difficulties. I then only complete the ones that I deem interesting.

Result of background: I deem myself incompetent to answer the above questions.

Question 2: Should we impose barriers to adding new difficulty 1 or difficulty 4 exercises?

While I tend not to object to adding more choice by adding an exercise E, I also understand that this increases the slog length for those who do not choose to skip exercise E, whether by choice or because of not knowing.

I offer that education ("just fetch whatever exercises you want, no need to stick to our order") would work, but I also understand that education is only effective if it can reach its target audience, whereas making the track do the right thing by default benefits more people without effort on the part of the beneficiaries.

In v1 deprecated exercises can still be fetched by name and solutions for them submitted. Thus I will not personally be affected negatively if large swathes of exercises are deprecated, and I do not complain if this course of action is chosen.

Question 3: Should we offer incentives to add new difficulty 7 or difficulty 10 exercises?

One may have considered asking the author of https://tinyletter.com/exercism/archive to include shoutouts to those who contribute such exercises. But I would never find that incentive enough to motivate me to add an exercise. As I'm sure all understand, it is difficult to find worthy incentives to volunteer efforts.

One may consider reviewing the exercises in problem-specifications, determining which ones we would estimate to be 7 or 10, and creating a Rust track issue "Implement Exercise E" for each such exercise. No such issue would be created for other exercises. So the incentive would be "hey I get to close an issue". I personally would not really find that to be a meaningful incentive, but at least it would be a useful indicator of what exercises the track maintainers would really like to see.
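
A rough sketch of that workflow, assuming the maintainers have already written down their own difficulty estimates. The function name, the slugs "alpha", "beta" and "gamma", and the estimates are all placeholders; problem-specifications itself carries no difficulty data.

```python
def hard_exercise_issues(spec_slugs, config, estimated_difficulty, threshold=7):
    """Issue titles for unimplemented exercises estimated at or above threshold."""
    implemented = {e["slug"] for e in config["exercises"]}
    return [
        f"Implement Exercise {slug}"
        for slug in sorted(spec_slugs)
        if slug not in implemented
        # Exercises without an estimate are treated as easy and skipped.
        and estimated_difficulty.get(slug, 0) >= threshold
    ]


# Placeholder data: the slugs and estimates below are invented for illustration.
spec_slugs = ["alpha", "beta", "gamma", "hello-world"]
config = {"exercises": [{"slug": "hello-world", "difficulty": 1}]}
estimates = {"alpha": 10, "beta": 4, "gamma": 7}

titles = hard_exercise_issues(spec_slugs, config, estimates)
for title in titles:
    print(title)
```

The output is only a list of titles; actually opening the issues would still be a manual step.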

coriolinus commented 6 years ago

One may consider reviewing the exercises in problem-specifications, determining which ones we would estimate to be 7 or 10, and creating a Rust track issue "Implement Exercise E" for each such exercise. No such issue would be created for other exercises.

If we choose not to perform a cull--which, absent strong agreement that it's better for the students, we probably shouldn't do--then opening issues for more difficult exercises might be the only realistic way forward.

coriolinus commented 6 years ago

To help compare the difficulty distributions, I got some of the other popular tracks' config.json files and ran the same script:

python

$ jq '.exercises[] | if (.deprecated | not) then .difficulty else empty end' python-config.json | sort -n | uniq -c
     74 1
      2 2
      7 3
      7 4
      6 5
      3 6
      3 7
      2 8
      1 9

go

$ jq '.exercises[] | if (.deprecated | not) then .difficulty else empty end' go-config.json | sort -n | uniq -c
     10 1
     14 2
     28 3
     23 4
     20 5
      3 6
      4 7
      2 8
      2 9

javascript

$ jq '.exercises[] | if (.deprecated | not) then .difficulty else empty end' js-config.json | sort -n | uniq -c
     11 1
     11 2
      9 3
     14 4
     17 5
     11 6
     10 7
      8 8
      1 9

analysis

The Python track has it bad with 74 difficulty 1 exercises, and then a relatively even, low distribution of the rest of the difficulties.

Go's pattern is interesting in that it's bimodal: both the problems of difficulty <= 5 and the problems of difficulty > 5 have relatively even distributions, but the quantities in the former are much higher than those in the latter.

JavaScript's track appears to have somehow fostered a nearly even distribution of exercise difficulties, with the exception of difficulty 9.

Among these four tracks, Rust is unique in possessing difficulty 10 exercises. Rust is unique in compressing the domain of difficulties into a 1, 4, 7, 10 scheme. Rust is unique in having more exercises in the top quartile of difficulty than in the third quartile of difficulty.

No attempt has been made to normalize the intertrack difficulty ranking rubrics. Future work might be to look at the hardest quartile of each of those tracks and compare the exercises to Rust's, to look for potential future exercises to add.
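
One crude way to put the four tracks side by side is the share of exercises at or above some difficulty. The sketch below uses the counts pasted above; the threshold of 7 is an arbitrary choice, and since the rubrics are not normalized between tracks the comparison is only indicative.

```python
# Exercise counts per difficulty, copied from the jq output in this thread.
tracks = {
    "rust": {1: 18, 4: 47, 7: 6, 10: 8},
    "python": {1: 74, 2: 2, 3: 7, 4: 7, 5: 6, 6: 3, 7: 3, 8: 2, 9: 1},
    "go": {1: 10, 2: 14, 3: 28, 4: 23, 5: 20, 6: 3, 7: 4, 8: 2, 9: 2},
    "javascript": {1: 11, 2: 11, 3: 9, 4: 14, 5: 17, 6: 11, 7: 10, 8: 8, 9: 1},
}


def share_at_or_above(hist, threshold):
    """Fraction of a track's exercises with difficulty >= threshold."""
    total = sum(hist.values())
    return sum(n for d, n in hist.items() if d >= threshold) / total


for name, hist in tracks.items():
    print(f"{name:10s} {share_at_or_above(hist, 7):.1%} at difficulty >= 7")
```

By this measure Rust (about 17.7%) sits close to JavaScript (about 20.7%) and well above Go (about 7.5%) and Python (about 5.7%).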

workingjubilee commented 5 years ago

Does difficulty affect when exercises unlock?

The Rust track is a little slower moving due to the mentor-to-student ratio, and I understand that (I will probably try to find time to mentor it if I can... but I would like to run through the main course first to see how it goes before trying :sweat_drops:), but I would also be interested if more exercises unlocked at every step. Is difficulty related to that at all?

coriolinus commented 5 years ago

Each core exercise unlocks some number of leaf exercises. The data is all in config.json, but it can be hard to visualize. Here's one way to look at it:

Tue Sep  3 10:46:00 CEST 2019
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> with open('config.json') as fp:
...     config = json.load(fp)
...
>>> exercises = config['exercises']
>>> cores = [e for e in exercises if e['core']]
>>> def exrepr(e):
...     return f"{e['slug']} ({e['difficulty']})"
...
>>> for core in cores:
...     print(exrepr(core))
...     for e in exercises:
...         if e['unlocked_by'] == core['slug']:
...             print("  ", exrepr(e))
...
hello-world (1)
   leap (1)
   raindrops (1)
   nth-prime (1)
   beer-song (1)
   proverb (1)
   difference-of-squares (1)
   sum-of-multiples (1)
   grains (1)
   prime-factors (1)
   armstrong-numbers (1)
reverse-string (1)
gigasecond (1)
bob (1)
   matching-brackets (1)
clock (4)
   dot-dsl (4)
   simple-linked-list (4)
   pascals-triangle (4)
   paasio (4)
   nucleotide-count (4)
   etl (4)
   acronym (4)
   sieve (4)
   rna-transcription (4)
   triangle (4)
   grade-school (4)
   binary-search (4)
   robot-simulator (7)
   queen-attack (4)
   bowling (4)
   tournament (4)
   alphametics (4)
   two-bucket (4)
   spiral-matrix (4)
   palindrome-products (4)
   saddle-points (4)
   isogram (4)
   say (4)
   run-length-encoding (4)
   isbn-verifier (4)
   perfect-numbers (4)
   hamming (4)
   scrabble-score (4)
   pangram (4)
   all-your-base (4)
   allergies (4)
   variable-length-quantity (4)
   pig-latin (4)
atbash-cipher (4)
   crypto-square (4)
   rotational-cipher (4)
   simple-cipher (4)
   rail-fence-cipher (4)
anagram (4)
   protein-translation (7)
   robot-name (4)
   ocr-numbers (10)
   react (10)
space-age (7)
   wordy (4)
sublist (7)
   custom-set (4)
minesweeper (7)
   rectangles (10)
   circular-buffer (10)
luhn (7)
   luhn-from (4)
   luhn-trait (4)
   largest-series-product (4)
   word-count (4)
   phone-number (4)
   diamond (4)
   accumulate (4)
   fizzy (7)
   roman-numerals (4)
   pythagorean-triplet (7)
   series (1)
   collatz-conjecture (1)
   diffie-hellman (1)
parallel-letter-frequency (10)
macros (10)
poker (10)
   grep (7)
   scale-generator (7)
   decimal (7)
   book-store (7)
   dominoes (10)
forth (10)
   doubly-linked-list (10)
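
The same data can also be summarized as a count of leaves per core exercise, which speaks more directly to the question of how much unlocks at each step. This is a sketch along the lines of the session above, assuming the same "core" and "unlocked_by" fields; the function name is made up, and the sample entries are a small excerpt of the listing.

```python
from collections import defaultdict


def leaves_by_core(config):
    """Map each core exercise slug to the slugs of the leaves it unlocks."""
    unlocked = defaultdict(list)
    for e in config["exercises"]:
        parent = e.get("unlocked_by")
        if parent is not None:
            unlocked[parent].append(e["slug"])
    return {
        e["slug"]: unlocked.get(e["slug"], [])
        for e in config["exercises"]
        if e.get("core")
    }


# Small excerpt of the listing above, not the full config.json.
sample = {
    "exercises": [
        {"slug": "hello-world", "core": True, "unlocked_by": None},
        {"slug": "leap", "core": False, "unlocked_by": "hello-world"},
        {"slug": "reverse-string", "core": True, "unlocked_by": None},
        {"slug": "bob", "core": True, "unlocked_by": None},
        {"slug": "matching-brackets", "core": False, "unlocked_by": "bob"},
    ]
}

tree = leaves_by_core(sample)
for core, leaves in tree.items():
    print(f"{core}: unlocks {len(leaves)}")
```

On the full listing this makes the unevenness easy to see: some core exercises unlock nothing, while clock unlocks several dozen.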

coriolinus commented 4 years ago

Given v3, these all become practice exercises unlocked topologically by concept. Given that, Question 0 now has a definitive answer: no, this isn't a problem.
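
As a rough illustration of unlocking by concept: an exercise becomes available once every concept it lists as a prerequisite has been learned. The "prerequisites" field name, the function name, the exercise slugs and the concepts below are assumptions made for this sketch, not taken from the actual v3 configuration.

```python
def unlocked_exercises(exercises, learned_concepts):
    """Return slugs whose prerequisite concepts have all been learned."""
    learned = set(learned_concepts)
    return [
        e["slug"]
        for e in exercises
        # An exercise with no prerequisites is always available.
        if set(e.get("prerequisites", [])) <= learned
    ]


# Made-up exercises and concepts, for illustration only.
practice = [
    {"slug": "exercise-a", "prerequisites": []},
    {"slug": "exercise-b", "prerequisites": ["strings"]},
    {"slug": "exercise-c", "prerequisites": ["strings", "traits"]},
]

available = unlocked_exercises(practice, {"strings"})
print(available)
```

With this rule the number of easy exercises no longer gates progress, since nothing forces a student through them in sequence.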
