exercism / discussions

For discussing things like future features, roadmap, priorities, and other things that are not directly action-oriented (yet).

Keeping test suites in sync with canonical data #184

Closed kytrinyx closed 6 years ago

kytrinyx commented 7 years ago

One of the things that has been difficult to manage is keeping exercise test suites up to date as the canonical data changes within the problem-specifications repository.

The canonical-data.json file has a version associated with it. This means that we could keep track of which canonical-data.json version a test suite was implemented against, and then have some tooling that runs periodically to let maintainers know about changes.

As a first implementation, this could be added to configlet and run manually by maintainers. Later we could make this a bot that opens an issue in the track repository when it finds something that is outdated.
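A rough sketch of the version check such tooling might run, assuming versions are dotted numeric strings like 1.1.0. The function names and the two slug-to-version dicts are hypothetical, not anything configlet provides today:

```python
def parse_version(text):
    """Turn a dotted version string such as "1.1.0" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_outdated(implemented, canonical):
    """Return True when the canonical data is newer than what the track implemented."""
    return parse_version(implemented) < parse_version(canonical)


def outdated_exercises(implemented_versions, canonical_versions):
    """List exercise slugs whose test suite trails the canonical data.

    Both arguments are hypothetical dicts mapping exercise slug to version string.
    Exercises missing from canonical_versions are skipped.
    """
    return sorted(
        slug
        for slug, implemented in implemented_versions.items()
        if slug in canonical_versions
        and is_outdated(implemented, canonical_versions[slug])
    )
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9.0" would sort after "1.10.0".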

We allow each exercise to have a .meta directory, and everything within that directory is ignored. This means that we could have a file in that directory that the tool can use to decide what to do.

There are several concerns here. Note: updated list to reflect discussion below

  1. A track might want to entirely opt out of following canonical data from problem specifications.
  2. A track might selectively want to opt out for a single exercise.
  3. A track might want to lock an exercise to a particular version.
  4. There might not be canonical data for an exercise.
  5. ... ?

Having thought about this a bit, I think that if we do add something, it should probably be a JSON file so we can have multiple fields to help manage all of the potential concerns.
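To make that concrete, here is a minimal sketch of how a tool might read such a file and handle the concerns listed above. The field names (follow_canonical_data, locked_version, implemented_version) are made up for illustration; nothing here is an agreed format:

```python
import json


def decide_action(meta_json, canonical_version):
    """Decide what a sync tool should do for one exercise.

    meta_json: contents of a hypothetical per-exercise JSON file in .meta.
    canonical_version: latest version string, or None when the exercise
    has no canonical data in problem-specifications.
    """
    meta = json.loads(meta_json)
    if canonical_version is None:
        return "skip: no canonical data"
    # Hypothetical field; a missing value means the exercise follows canonical data.
    if not meta.get("follow_canonical_data", True):
        return "skip: opted out"
    # Hypothetical field; a locked exercise is never reported as outdated.
    if meta.get("locked_version") is not None:
        return "skip: locked to " + meta["locked_version"]
    if meta.get("implemented_version") != canonical_version:
        return "notify: canonical data is at " + canonical_version
    return "ok: up to date"
```

A track-wide opt-out would presumably live in track-level configuration rather than in each exercise's file, so it is left out of this sketch.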

/cc @stkent and @m-a-ge who kicked off this discussion in https://github.com/exercism/discussions/issues/106#issuecomment-323558227

stkent commented 7 years ago

Could you please point me to where the per-exercise .meta directory is documented? Thanks :)

petertseng commented 7 years ago

The Haskell track puts the canonical data version (plus an extra monotonically increasing number) inside package.yaml, a file used by Stack, a Haskell build tool. This is done so the version may be shown to the student when running stack test. https://github.com/exercism/haskell/issues/522 https://github.com/exercism/haskell/issues/523

The Rust track puts the canonical data version inside Cargo.toml, a file used by Cargo, a Rust build tool. This is done so the version may be shown to the student when running cargo test. https://github.com/exercism/rust/issues/281

The Go track's generator places the version in a comment in the generated test file. https://github.com/exercism/go/pull/606

I think (but do not know) it is helpful to see the version when running the test command, so it seems good to put the version in the place appropriate for the build system for the language. But then, I feel reluctant to additionally put the version information in a JSON file as it's duplication.

If it is too much burden for configlet to learn to read versions from language-dependent places, don't worry about it, we'll just keep using https://github.com/petertseng/exercism-problem-specifications/tree/up-to-date/up-to-date and not configlet.

Later we could make this a bot that opens an issue in the track repository when it finds something that is outdated.

If the state of the world has not changed since https://github.com/exercism/problem-specifications/issues/524, some tracks wish to opt out (I assume that's easy enough to allow; just a reminder that the issue exists).

stkent commented 7 years ago

That's interesting; I'd have thought the folks least likely to care about test versions would be students.

Regarding:

If it is too much burden for configlet to learn to read versions from language-dependent places

it sounds like the .meta directory is a language-independent place to host, at least.

ilya-khadykin commented 7 years ago

Storing canonical data version info in a language-specific build tool might be a problem, since some tracks do not have one (bash and powershell, for example). A dedicated JSON file in the .meta directory seems like a good idea, and it's a universal solution. I'm not sure about its contents, though.

A track might want to lock an exercise to a particular version.

Why does someone want to do that?

kytrinyx commented 7 years ago

Could you please point me to where the per-exercise .meta directory is documented?

@stkent here you go: https://github.com/exercism/docs/tree/master/language-tracks/exercises#files

some tracks wish to opt out

@petertseng Yes, that's what I meant when I said "A track might not want to follow the canonical data for an exercise", but now I realize that there are two options here:

  1. A track might want to entirely opt out of following canonical data from problem specifications
  2. A track might selectively want to opt out for a single exercise

Updating the original post to include these two options.

I'd have thought the folks least likely to care about test versions would be students.

Me too.

In the new prototype, we have a lot more control over what someone gets delivered when they download an exercise, and what someone sees. We always show the mentor the test suite that you downloaded. My goal is to remove all maintainer-specific knowledge from what the students see. They shouldn't have to know about versioning (whether specification versioning or test versioning in the track).

A track might want to lock an exercise to a particular version.

Why does someone want to do that?

I don't know if they actually want to, but my thinking is that sometimes edge cases get added to the canonical data, and a track might want to stay with a simpler one.

stkent commented 7 years ago

Thanks! Gonna have to think about how to handle reference implementations living there for Java/Kotlin...

kytrinyx commented 7 years ago

@stkent the Ruby track has put the reference implementation inside .meta. If it's named with [Ee]xample in the path or something, then Configlet won't complain. If it's not, then update the solution_pattern in your config.json (here's Ruby's: https://github.com/exercism/ruby/blob/master/config.json#L6)
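For illustration, a small sketch of what matching a path against such a pattern could look like. It assumes the pattern is treated as a regular expression searched anywhere in the path; the helper name and the custom pattern in the usage note are hypothetical, and Configlet's actual matching logic may differ:

```python
import re

# The "[Ee]xample anywhere in the path" convention mentioned above.
DEFAULT_PATTERN = r"[Ee]xample"


def is_reference_solution(path, solution_pattern=DEFAULT_PATTERN):
    """Return True when the path matches the track's solution pattern."""
    return re.search(solution_pattern, path) is not None
```

With the default, is_reference_solution(".meta/solutions/Example.rb") is true and is_reference_solution("pangram_test.rb") is false. A track with a stricter path convention could pass its own pattern instead, for example r"\.meta/src/reference/".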

That said, the tricky thing with Java/Kotlin seems to be the convention around paths, so I don't know how this is going to play out.

stkent commented 7 years ago

Yes, it was that convention I had in mind. Thanks for the Ruby example!

ErikSchierboom commented 7 years ago

At the moment, the C# track stores it in the test file as a comment (for the exercises that use automatic generation).

stevejb71 commented 7 years ago

OCaml also puts the version number in the test file, when automatically generated.

Insti commented 7 years ago

Ruby:

We (currently manually) re-generate all the test files and see which have changed, using our test generation script: bin/generate --all

We can tell which need updating because the newer generated test file differs from the existing one.
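The same "regenerate and compare" idea, sketched outside of the Ruby tooling. This is only an illustration using Python's difflib; diff_test_file is a hypothetical helper, not part of bin/generate:

```python
import difflib


def diff_test_file(existing, regenerated, name="test file"):
    """Return unified-diff lines between the committed and regenerated test file.

    An empty list means the committed test file is already up to date.
    """
    return list(
        difflib.unified_diff(
            existing.splitlines(),
            regenerated.splitlines(),
            fromfile=name + " (existing)",
            tofile=name + " (regenerated)",
            lineterm="",
        )
    )
```

Any non-empty result flags the exercise as needing an update, and the diff itself shows the maintainer what changed.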

We also store the version number and short-sha1 of the canonical-data.json in the test file. For example: pangram_test.rb contains the line:

# Common test data version: 1.1.0 fba1aef

(although this is not currently used for anything other than informational purposes.)
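If a tool did want to use that line, reading it back could look roughly like this. The regular expression is an assumption based only on the single example above (a dotted version followed by a short hex SHA), so other tracks' comment formats would need their own patterns:

```python
import re

# Matches a line such as "# Common test data version: 1.1.0 fba1aef".
VERSION_COMMENT = re.compile(
    r"^#\s*Common test data version:\s*(\d+(?:\.\d+)*)\s+([0-9a-f]+)\s*$",
    re.MULTILINE,
)


def read_version_comment(test_source):
    """Return (version, short_sha) from a test file's text, or None if absent."""
    match = VERSION_COMMENT.search(test_source)
    if match is None:
        return None
    return match.group(1), match.group(2)
```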

Exercises without generators just get out of date until someone notices and builds a generator for it.

stkent commented 6 years ago

I just opened a PR that moves Java's version information from test files into a .version file directly inside .meta. This is a step closer to 'not user facing' for us (yay, nextercism). Just putting this out there in case anyone is still weighing global options for alert or issue-creation tooling. I might take a stab at track-local tooling if nothing global surfaces.
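As a sketch of what track-local tooling might start from, here is one way to read that file. It assumes the file is literally named .version and holds nothing but the version string; both are assumptions, and read_meta_version is a hypothetical helper:

```python
from pathlib import Path


def read_meta_version(exercise_dir, filename=".version"):
    """Return the version recorded in <exercise_dir>/.meta/<filename>, or None.

    The file name and its plain-text contents are assumptions for illustration.
    """
    version_file = Path(exercise_dir) / ".meta" / filename
    if not version_file.is_file():
        return None
    return version_file.read_text(encoding="utf-8").strip()
```

Returning None for a missing file lets a caller treat "no recorded version" separately from "outdated version".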

Stargator commented 6 years ago

Currently, the Dart track stores the solution as "example.dart" in the same directory as the blank file.

So as far as the solution goes, we will not be able to find it if it's in .meta. Dot folders are not proper directories for Dart files, which prevents the files in them from being used by the test suite or any other Dart file.

So maybe have the location of the solution file be configurable?

Additionally, via configlet generate we take the contents of description.md and make it the exercise's README.md.

kytrinyx commented 6 years ago

So maybe have the location of the solution file be configurable?

As far as I can recall, this is currently the case.

https://github.com/exercism/docs/blob/f4f74213e215d14a22e1ce724105528f713f76f6/language-tracks/exercises/anatomy/reference-solution.md

kytrinyx commented 6 years ago

I wrote up a proposal for a fairly light-weight bot that would not make any changes itself, but would notify relevant tracks of changes so that maintainers can evaluate whether or not they'd like to implement them. See https://github.com/exercism/meta/issues/99 for details.