exercism / discussions

For discussing things like future features, roadmap, priorities, and other things that are not directly action-oriented (yet).

Problem ordering, topics covered, and understanding of concepts required #60

Closed kytrinyx closed 7 years ago

kytrinyx commented 8 years ago

For the past three years, the ordering of exercises has been done based on gut feelings and wild guesses.

Over time this has proven to work OK-ish, but it's not great. There are easy exercises that get placed too far back, exercises that are too similar to one another, and exercises that are too difficult that end up early in the track, and we mostly don't notice until someone says out loud that they're struggling with something.

@IanWhitney did a thorough analysis of the Rust track in https://github.com/exercism/xrust/issues/127, which resulted in reordering the exercises. He wrote about the experience in an essay called "Exercism Shouldn't Make You Cry".

We're also talking about similar issues in F# (https://github.com/exercism/xfsharp/issues/133 - @ErikSchierboom) and Elixir (https://github.com/exercism/xelixir/issues/190 - @devonestes), and I'm sure the topic has been mentioned elsewhere and I've missed it.

Back in the day, Peter Minten talked about how we might classify exercises more systematically, in https://github.com/exercism/x-common/issues/63 and https://github.com/exercism/x-common/issues/72.

I think that if we have language-specific classifications and topics, we should do it in the language-specific repository (keeping all the language-specific stuff together).

What if we did this in config.json? We could have a new key, exercises, which contained an array of objects with the problem slug and topics. x-api could be changed to look preferentially at the new key, and fall back to the old one if it's missing. Then we could migrate all the tracks without having to do everything all at once.
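
A minimal sketch of that fallback, in Python rather than the actual x-api code. It assumes the old `problems` key holds a plain list of slugs; the function name `exercise_entries` is made up for illustration.

```python
import json


def exercise_entries(config):
    """Return exercise entries, preferring the new key over the old one."""
    if "exercises" in config:
        return config["exercises"]
    # Assumed old format: a plain list of slugs. Wrap each slug so that
    # callers always see the same shape.
    return [{"slug": slug, "topics": []} for slug in config.get("problems", [])]


old_style = json.loads('{"problems": ["hello-world", "anagram"]}')
new_style = json.loads('{"exercises": [{"slug": "hello-world", "topics": ["strings"]}]}')

print(exercise_entries(old_style))
print(exercise_entries(new_style))
```

Because the lookup prefers `exercises` and only then falls back, tracks could migrate one at a time without breaking readers of the config.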

The topics would be optional, but having it in the actual codebase would probably help crowdsource this data.

@exercism/track-maintainers Thoughts?

markijbema commented 8 years ago

I think it's a great idea. I think it would be worthwhile to think about a way to make it non-optional though. If you don't, you run the risk of making it like comments in code. If you expose it to users you have to get it right; you could even do this like saying your skill level for a certain thing is at a certain level. Of course, the flipside there is that you then have to handle perception of users.

Conceptually I think it's an interesting question as well whether you should actually enforce a path, or let the user choose to follow a path of their own, with possible other routes (like for instance DuoLingo does). I think the first few exercises you want to let people adhere to a specific order, but after that, it matters less I think.

But all this, if at all interesting, does not need to be the first version of course :) (also: disclaimer, I haven't been on exercism for quite a long time)

ErikSchierboom commented 8 years ago

I for one think this is a great idea! As a track maintainer, I often struggle with the relative ordering of exercises. Having a consistent way to classify and categorize exercises would greatly help with that. I would suggest that for each exercise, we add the following information:

  1. Topics: a list of the topics covered by the exercise. This includes things like control flow, recursion, parsing, etc. Maybe we can come up with a list of default topics that cover 90% of the topics, and let the rest be language-specific.
  2. Difficulty: although the topics can help in judging the difficulty of an exercise (parsing is harder than basic control flow, more topics are harder than fewer topics), I think we should also be able to explicitly specify the difficulty. The main reason for this is that although exercises might have similar topics, they can be of vastly different difficulty.

As @choiaa suggested in this issue, we might even be able to automatically update the order of the exercises based on the topics and difficulty.

In the discussion started by @pminten, he also suggests categorizing each exercise as one of two kinds:

Although I see what he's trying to achieve, I think the same can be accomplished using the suggested difficulty classification.

He also suggested that we specify the topics in the exercise data in the x-common repository, but I think that would not be very convenient and the same exercises might have completely different topics in different languages.

I am also in favor of @markijbema's suggestion that this information should be required in the new format. His point about having a default path or being able to choose a path is also a good one, but I think that's a different discussion.

Overall, I think it's a great idea and I'll gladly help.

devonestes commented 8 years ago

I think it's a great idea, too, but I don't yet know how we'd solve the issue with different exercises being different levels of difficulty in different languages.

For example, the anagram exercise is on the easier side in Elixir or Ruby, but apparently in Rust it's very difficult. At the very least, maybe in the config.json when we list the exercises, we could also list a "relative difficulty" for that exercise as it's implemented in a given language? That way when new exercises are added then the person adding that exercise will have at least a general idea about where to slide it in. Maybe something like this?:

"problems": [
  {
    "_difficulty": 1,
    "name": "hello-world"
  },
  {
    "_difficulty": 3,
    "name": "anagram"
  },
  {
    "_difficulty": 10,
    "name": "forth"
  }
]

Maybe to get better measures of difficulty we could get input from users after they submit an iteration on how hard they felt the problem was in a given language, and that data could be exposed somehow for maintainers to use? That way they'd have some more concrete information on which to base the ordering of exercises.

masters3d commented 8 years ago

I don't have time to dive deep but I am envisioning a solution that is similar to operator precedence. Perhaps we can group each problem and then just worry about the grouping or "concept" ordering and not specifically about each problem order.

kytrinyx commented 8 years ago

> I think it's a great idea, too, but I don't yet know how we'd solve the issue with different exercises being different levels of difficulty in different languages.

With the suggested approach, each language track's classification would be independent of other languages. So anagram in Ruby could be classified as easy, and in Rust it could be difficult.

ErikSchierboom commented 8 years ago

And I think that is the only way it will ever make sense. I'm really excited about this possible change!

kotp commented 8 years ago

Is there a list of (generic) programming concepts that could be used and weighted? The list being shared (mostly common) between tracks, but the weights given to the concept specific to those tracks?

ErikSchierboom commented 7 years ago

@kotp I don't know if there is such a list, but we could also try to gather it manually. Here is a quick attempt, in which I have tried to group related concepts:

Basic concepts

Data types

Problem areas

This is by no means a definitive list, it's just something I compiled myself. Maybe we can use this as a starting point for our discussion?

kotp commented 7 years ago

Definitely like the list. I had started one yesterday, but did not get far.

rbasso commented 7 years ago

The general programming concepts are never so general as they appear. :smile:

I think it's nice to have a list of suggested categories, as long as the tracks aren't forced to fit into them.

Maybe it would be more interesting to leave the topics completely open and see what would emerge from the tracks...

markijbema commented 7 years ago

I think the topics will need to differ per track as well, at least, you don't want to restrict them for all languages.

ErikSchierboom commented 7 years ago

@rbasso I don't think we should leave the topics completely open, that will probably lead to a lot of unwanted duplication where similar concepts are named differently in different tracks. We would then have to do cleanup later.

@markijbema The topics will definitely differ per track, but as I said, this list can be something of a starting list. It is not meant to be comprehensive, just helpful :)

rbasso commented 7 years ago

You convinced me, @ErikSchierboom. :smile:

> I don't think we should leave the topics completely open, that will probably lead to a lot of unwanted duplication where similar concepts are named differently in different tracks. We would then have to do cleanup later.

ErikSchierboom commented 7 years ago

What scale should we use to grade the difficulty? 1 to 10? 1 to 3? Or text: easy, average, hard?

kotp commented 7 years ago

I think it is easier to consider difficulties in grades of 10 or even 0 to 9. Something that is built in and primitive in the language may be 0 or 1, depending on the conceptual difficulty. It also makes it simple to compute a weight for an individual exercise by adding those values.

The report could be easy, average or hard, derived from the numbers. There could be an additional weight that comes not from the categories directly, but from the feedback users report, which can help to fine-tune each exercise to where it is perceived to belong, over time.
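
As a rough illustration of this idea, here is a Python sketch. The concept weights and the cut-offs for easy/average/hard are invented for the example; nothing in this thread fixes them.

```python
# Hypothetical per-track weights (0 to 9) for shared concepts.
CONCEPT_WEIGHTS = {"strings": 1, "filtering": 2, "parsing": 5, "stacks": 3}


def exercise_weight(topics, weights=CONCEPT_WEIGHTS):
    """Sum the weights of an exercise's topics; unknown topics count as 0."""
    return sum(weights.get(topic, 0) for topic in topics)


def label(weight, easy_max=3, average_max=7):
    """Map a numeric weight to a coarse label; the cut-offs are arbitrary."""
    if weight <= easy_max:
        return "easy"
    if weight <= average_max:
        return "average"
    return "hard"


for topics in (["strings", "filtering"], ["parsing", "stacks"]):
    weight = exercise_weight(topics)
    print(topics, weight, label(weight))
```

A feedback-based adjustment could be added as one more term in `exercise_weight`, but how to scale it is an open question here.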

ErikSchierboom commented 7 years ago

@kytrinyx What is the next step? I think we should decide upon a new format in which the exercises can be described in the config.json file. Maybe something like this:

"problems": [
  {
    "name": "hello-world" ,
    "difficulty": 1,
    "topics": [
        "control-flow (if-statements)",
        "optional values",
        "text formatting"
    ]
  },
  {
    "difficulty": 3,
    "name": "anagram",
    "topics": [
        "strings",
        "filtering"
    ]
  },
  {
    "difficulty": 10,
    "name": "forth",
    "topics": [
        "parsing",
        "transforming",
        "stacks"
    ]
  }
]

kytrinyx commented 7 years ago

Yepp, I think that's the next step.

I'd like to use a new JSON key so that we can leave the old format in place during the migration period (we don't want to leave it in place forever, but for a few weeks to give people time to make the change).

I'm thinking "exercises" makes sense.

Also, how about slug instead of name? In the problems API we distinguish between the slug (the identifier of the exercise) and the name, which is an Englishified version of the slug (separate words, with the first letter of each part capitalized).
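
To make the distinction concrete, a small Python sketch of deriving a name from a slug as described. The helper name `name_from_slug` is hypothetical and this is not the problems API's actual code.

```python
def name_from_slug(slug):
    """Turn a slug like 'hello-world' into a display name like 'Hello World'."""
    return " ".join(part.capitalize() for part in slug.split("-"))


print(name_from_slug("hello-world"))  # Hello World
print(name_from_slug("anagram"))      # Anagram
```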

I like the difficulty 1-10, which means that we can fine-tune things over time.

ErikSchierboom commented 7 years ago

Let's change it to slug then. I'll start working on mapping topics to exercises for the F# track.

devonestes commented 7 years ago

This all turned out great! I'll work a bit on handling this for Elixir, too.

kytrinyx commented 7 years ago

Excellent! We'll need to write up something that can be submitted to all the tracks. If nobody writes up a suggestion here, I'll tackle that this weekend.

kytrinyx commented 7 years ago

Here's the issue text that I intend to submit to all the tracks. Would you please review this for clarity and correctness?

Subject: Update config.json to match new specification

For the past three years, the ordering of exercises has been done based
on gut feelings and wild guesses. As a result, the progression of the
exercises has been somewhat haphazard.

In the past few months maintainers of several tracks have invested a
great deal of time in analyzing what concepts various exercises require,
and then reordering the tracks as a result of that analysis.

It would be useful to bake this data into the track configuration so
that we can adjust it over time as we learn more about each exercise.

To this end, we've decided to add a new key _exercises_ in the
config.json file, and deprecate the _problems_ key.

See exercism/discussions#60 for details about this decision.

Note that we will **not** be removing the _problems_ key at this time,
as this would break the website and a number of tools.

The process for deprecating the old _problems_ array will be:

* Update all of the track configs to contain the new _exercises_ key,
  with whatever data we have.
* Simultaneously change the website and tools to support both formats.
* Once all of the tracks have added the _exercises_ key, remove support
  for the old key in the site and tools.
* Remove the old key from all of the track configs.

In the new format, each exercise is a JSON object with three properties:

* _slug_: the identifier of the exercise
* _difficulty_: a number from 1 to 10 where 1 is the easiest and 10 is
  the most difficult
* _topics_: an array of strings describing topics relevant to the
  exercise. We maintain a list of common topics at
  https://github.com/exercism/x-common/blob/master/TOPICS.txt.
  Do not feel like you need to restrict yourself to this list; it's only
  there so that we don't end up with 20 variations on the same topic.
  Each language is different, and there will likely be topics specific
  to each language that will not make it onto the list.

The _difficulty_ rating can be a very rough estimate.

The _topics_ array can be empty if this analysis has not yet been done.

Example:

    "exercises": [
      {
        "slug": "hello-world" ,
        "difficulty": 1,
        "topics": [
            "control-flow (if-statements)",
            "optional values",
            "text formatting"
        ]
      },
      {
        "difficulty": 3,
        "slug": "anagram",
        "topics": [
            "strings",
            "filtering"
        ]
      },
      {
        "difficulty": 10,
        "slug": "forth",
        "topics": [
            "parsing",
            "transforming",
            "stacks"
        ]
      }
    ]

It may be worth making the change in several passes:

1. Add the _exercises_ key with the array of objects, where _difficulty_
   is 1 and _topics_ is empty.
2. Update the difficulty settings to reflect a more accurate guess.
3. Add topics (perhaps one-by-one, in separate pull requests, in order
   to have useful discussions about each exercise).
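
The first pass could be scripted. Below is a Python sketch under the assumption that the existing _problems_ key is a plain list of slugs; the helper name `add_exercises_key` is made up, and the old key is deliberately left in place.

```python
import json


def add_exercises_key(config):
    """Return a copy of config with an 'exercises' array derived from 'problems'."""
    updated = dict(config)
    updated["exercises"] = [
        # Pass 1 placeholders: difficulty 1 and no topics yet.
        {"slug": slug, "difficulty": 1, "topics": []}
        for slug in config.get("problems", [])
    ]
    return updated


config = {"problems": ["hello-world", "anagram", "forth"]}
print(json.dumps(add_exercises_key(config), indent=2))
```

The exercise order is preserved, so the later passes only need to touch _difficulty_ and _topics_.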

Note: Edited for readability inline here, line lengths. - KOTP

ErikSchierboom commented 7 years ago

Excellent write-up. 👍

chezwicker commented 7 years ago

I'm just slightly confused by topics simply being "an array of strings that describe topics that the exercise covers". Shouldn't this rather be a reference to centrally maintained topics (in order to avoid calling the same thing different names?). The reference of course can still be textual, but it would be nice to have it checked ;-)

ErikSchierboom commented 7 years ago

@chezwicker makes a good point. Maybe we should put the list of topics I gathered in one of the previous posts into a separate file in the x-common repository, named topics.json or something like that? Then we could also iteratively improve the list of topics through discussions and PR's.

IanWhitney commented 7 years ago

@chezwicker: I think topics will vary by language. Rust, for example, will have Lifetimes and other languages won't. Or several languages will have variations of Maybe/Some but under different names.

I think centralization of these terms might be a premature abstraction.

chezwicker commented 7 years ago

@IanWhitney you're of course right that some topics will vary, but I believe more topics will be similar across languages. And using different words for the same concept has a lot of potential of leading to confusion. I would assume many people with knowledge of one language would be using the platform to learn others - for those, I think it might be helpful to recognize concepts.

Of course you're also right that some languages will call the same concept different names, so maybe aliases would be helpful. Perhaps a central topics.js and optionally one per track "renaming" concepts by defining aliases?
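
A sketch of what such aliasing might look like, in Python for brevity. The topic names, the alias table and both helper functions are hypothetical; no file format is implied.

```python
# Hypothetical central topic names and one track's aliases for them.
CENTRAL_TOPICS = {"Optional values", "Strings", "Parsing"}
TRACK_ALIASES = {"Maybe": "Optional values"}  # track-specific name -> central name


def canonical_topic(topic, aliases=TRACK_ALIASES):
    """Translate a track-specific topic name to its central name, if aliased."""
    return aliases.get(topic, topic)


def is_shared(topic, central=CENTRAL_TOPICS):
    """True when the topic, after alias resolution, is on the central list."""
    return canonical_topic(topic) in central


print(is_shared("Maybe"))      # True: alias of "Optional values"
print(is_shared("Lifetimes"))  # False: language-specific topic
```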

Or maybe that's just overengineering now. I'm merely pointing out that it could be nice having some consistency across tracks.

ErikSchierboom commented 7 years ago

@IanWhitney @chezwicker Yes, topics will vary by language, so IMHO you should feel free to replace Option with Maybe when that applies to your language track. There will also be subjects that are exclusive to a language (e.g. Active Patterns in F#), which should thus not exist in the "master" list of topics. However, many topics will also be the same. I think it thus makes sense to use one "default" set of topics to choose from, which can then be modified or added to in the specific language tracks.

kytrinyx commented 7 years ago

It sounds like we're aiming for:

In the suggested text, I wrote:

* _topics_: an array of strings that describe topics that the exercise
  covers

How about changing this to the following?

* _topics_: an array of strings describing topics relevant to the
  exercise. We maintain a list of common topics at $URL. Do not feel
  like you need to restrict yourself to this list; it's only there so
  that we don't end up with 20 variations on the same topic. Each
  language is different, and there will likely be topics specific to
  each language that will not make it onto the list.

ErikSchierboom commented 7 years ago

Works for me!

kytrinyx commented 7 years ago

OK, I've updated the text.

I'll also add an empty topics list in x-common. Shall we make it plain text, with one topic per line? It feels like json is a bit overkill since this is just going to be for human consumption.
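
Even as plain text, such a file stays easy to check against. A Python sketch, assuming one topic per line; the sample contents and function names are invented for illustration.

```python
def load_topics(text):
    """Parse a plain-text topics list: one topic per line, blank lines ignored."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def unlisted_topics(exercises, known):
    """Return topics used by exercises that are not on the common list."""
    used = {topic for ex in exercises for topic in ex.get("topics", [])}
    return sorted(used - known)


topics_txt = "Strings\nFiltering\nParsing\n"  # stand-in for the real file
exercises = [
    {"slug": "anagram", "topics": ["Strings", "Filtering"]},
    {"slug": "forth", "topics": ["Parsing", "Stacks", "Lifetimes"]},
]

print(unlisted_topics(exercises, load_topics(topics_txt)))  # ['Lifetimes', 'Stacks']
```

Since the list is not meant to be restrictive, an unlisted topic is a prompt for discussion rather than an error.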

ErikSchierboom commented 7 years ago

@kytrinyx Ah yes, that's far easier :)

chezwicker commented 7 years ago

Sounds good!

kytrinyx commented 7 years ago

Would someone mind taking a look at the suggested starter file? https://github.com/exercism/x-common/pull/337

I used @ErikSchierboom's list farther up in this thread.

ErikSchierboom commented 7 years ago

I'm sure it can be improved upon, but as a starting point it looks fine, I think.

kytrinyx commented 7 years ago

Yeah, that's what I was thinking. My first thought was to make it empty, but then I remembered that you'd started a list.

ErikSchierboom commented 7 years ago

@kytrinyx One small question: how would the order in the new situation work? Is it still the order in which the exercises are listed, or is that list first sorted by difficulty? E.g., consider the following data:

"exercises": [
      {
        "slug": "hello-world" ,
        "difficulty": 1,
        "topics": [
            "control-flow (if-statements)",
            "optional values",
            "text formatting"
        ]
      },
      {
        "difficulty": 3,
        "slug": "anagram",
        "topics": [
            "strings",
            "filtering"
        ]
      },
      {
        "difficulty": 1,
        "slug": "binary",
        "topics": [
            "parsing",
            "transforming"
        ]
      }
    ]

Is the exercise order either:

  1. hello-world
  2. anagram
  3. binary

or

  1. hello-world
  2. binary
  3. anagram

I would opt for the second choice.
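
The two candidate orderings can be reproduced with a few lines of Python. This only illustrates the question; it says nothing about how x-api actually orders exercises.

```python
exercises = [
    {"slug": "hello-world", "difficulty": 1},
    {"slug": "anagram", "difficulty": 3},
    {"slug": "binary", "difficulty": 1},
]

# Option 1: keep the order in which the exercises are listed.
listed_order = [e["slug"] for e in exercises]

# Option 2: sort by difficulty. sorted() is stable, so exercises with the
# same difficulty keep their listed order relative to each other.
by_difficulty = [e["slug"] for e in sorted(exercises, key=lambda e: e["difficulty"])]

print(listed_order)   # ['hello-world', 'anagram', 'binary']
print(by_difficulty)  # ['hello-world', 'binary', 'anagram']
```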

NobbZ commented 7 years ago

I am definitely against an implicit ordering by difficulty. There might be some educational intention behind handing out the more difficult task before the easier one.

ErikSchierboom commented 7 years ago

@NobbZ good point. So we would just use the order in which the exercises are listed as the order of them being fetched?

ErikSchierboom commented 7 years ago

In @kytrinyx' PR, the topics are copied as-is from my list. Those topics have "normal" casing, such as "Optional values". Should we leave it as it is or use lower-casing?

kytrinyx commented 7 years ago

> So we would just use the order in which the exercises are listed as the order of them being fetched?

I think so.

> Should we leave it as it is or use lower-casing?

I think normal casing is fine, unless it's easier to be consistent with lower case.

ErikSchierboom commented 7 years ago

Let's keep it normal casing then. That way, we could also display the information on the website if needed.

kytrinyx commented 7 years ago

I hit rate limits. Investigating.

https://developer.github.com/v3#abuse-rate-limits

Apparently I need to make the script wait for a bit between each call.
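
The shape of that fix, sketched in Python. The actual script is not shown in this thread, so `submit_all`, the `submit` callback and the two-second default delay are all assumptions.

```python
import time


def submit_all(tracks, submit, delay_seconds=2.0, sleep=time.sleep):
    """Call submit(track) for each track, pausing between consecutive calls."""
    results = []
    for index, track in enumerate(tracks):
        if index > 0:
            sleep(delay_seconds)  # wait between calls, not before the first one
        results.append(submit(track))
    return results


# Usage with a stand-in for the real API call and no actual waiting.
print(submit_all(["xrust", "xfsharp"], submit=lambda t: f"opened issue on {t}", delay_seconds=0))
```

Passing `sleep` as a parameter keeps the pause easy to replace in a dry run.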

petertseng commented 7 years ago

> We could have a new key, exercises, which contained an array of objects with the problem slug and topics. x-api could be changed to look preferentially at the new key, and fall back to the old one if it's missing.

> Simultaneously change the website and tools to support both formats.

Has this been done yet? And if not, I think it would be wise to say here when it has been done. (I looked at x-api and I didn't see it, but I may have missed it.)

And why do I care? Simply because I want to know whether "Add the exercises key with the array of objects, where difficulty is 1 and topics is empty." can be done simultaneously with "Remove the problems key". Otherwise there will be two problem orderings (one defined by problems, one defined by exercises) and any changes to problem ordering would have to be done in both places.
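
While both keys coexist, a track could guard against the two orderings drifting apart with a check along these lines. This is a Python sketch; it assumes `problems` is a list of slugs, and the function name is made up.

```python
def orderings_match(config):
    """True when 'problems' and 'exercises' list the same slugs in the same order."""
    old = config.get("problems", [])
    new = [exercise["slug"] for exercise in config.get("exercises", [])]
    return old == new


config = {
    "problems": ["hello-world", "anagram"],
    "exercises": [
        {"slug": "hello-world", "difficulty": 1, "topics": []},
        {"slug": "anagram", "difficulty": 1, "topics": []},
    ],
}
print(orderings_match(config))  # True
```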

kytrinyx commented 7 years ago

> Simultaneously change the website and tools to support both formats. Has this been done yet?

No not yet. I've opened this issue: https://github.com/exercism/x-api/issues/134

> I want to know whether "Add the exercises key with the array of objects, where difficulty is 1 and topics is empty."

Yeah, that's a good point.

Also, we will need to update configlet: https://github.com/exercism/configlet/issues/7

verdammelt commented 7 years ago

Is there / should there be any effect on 'foregone', or 'deprecated' sections of the config file? Will those remain just a list of exercises?

petertseng commented 7 years ago

> I want to know whether "Add the exercises key with the array of objects, where difficulty is 1 and topics is empty." can be done simultaneously with "Remove the problems key". Otherwise there will be two problem orderings (one defined by problems, one defined by exercises) and any changes to problem ordering would have to be done in both places.

That's done by https://github.com/exercism/x-api/pull/137, I am glad that we can now deduplicate.

kytrinyx commented 7 years ago

> Is there / should there be any effect on 'foregone', or 'deprecated' sections of the config file?

@verdammelt no, I think it's legit to be able to deprecate / forego individual exercises on a track level.

stkent commented 7 years ago

Quick question: is the difficulty value required to be integer, or can we use decimals e.g. 4.7?

[For context, the Java track had several maintainers chime in with difficulty estimates, then took the average to compute an overall difficulty curve. This naturally led to some non-integer scores. Rounding is no problem if necessary, but it feels silly to throw away the "extra fidelity" if we don't have to!]
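
For illustration, the averaging and an optional rounding step might look like this in Python. The estimates are invented, and clamping to 1..10 follows the range given in the issue text; this is not the Java track's actual process.

```python
from statistics import mean


def averaged_difficulty(estimates):
    """Average several maintainers' estimates and round to an integer in 1..10."""
    # Note: Python's round() rounds halves to the nearest even integer,
    # so an average of exactly 4.5 becomes 4.
    rounded = round(mean(estimates))
    return max(1, min(10, rounded))


estimates = [4, 5, 5]  # made-up estimates from three maintainers
print(mean(estimates))                 # a non-integer average
print(averaged_difficulty(estimates))  # 5
```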

petertseng commented 7 years ago

Ah, well, I could give the pragmatic answer of "nothing on the website or API shows the difficulty value. Therefore, it can be whatever you want". The Rust track got lazy and we thus only use 1, 4, 7, and 10.

But I guess that's no substitute for the real answer, which will depend on the eventual intended use.

stkent commented 7 years ago

@petertseng ah, good to know!

From manual inspection of the tracks that have both switched structure and assigned non-trivial difficulties (i.e. not all 1s throughout), integer-only appears to be preferred and would seem to be the safer bet at this time, so I'll go with that!

Examples: