Closed zenspider closed 7 years ago
Looks like there are a lot of different structures involved. Please provide hints as to the correct syntax so I can parse this stuff.
Looks like there are a lot of different structures involved.
Yeah, this sort of happened a bit at a time, and we weren't sure what the various needs of this data were going to be.
We now have enough data to decide on a file format, but I don't think anyone has gone through and figured out what the syntax should be yet.
@zenspider sounds like you're writing a parser, perhaps you can look through the existing data and tell us what structure we should be using to make parsing convenient. Then we can document it and create issues to update the old data.
@devonestes This is the issue we were talking about on twitter.
I'm just gonna collect my thoughts from #376 here, because I think this needs fleshing out.
I believe we can simultaneously make the JSON easier for humans and programs to read, but the way it is now makes it very hard to make a generalising program.
@petertseng linked to examples of code in various tracks using canonical-data.json
to generate exercises, and I feel they all share a common problem: because each exercise has a different structure, each exercise needs its own separate, different test generator program.
My goal with exercism.autogen-exercises is to generate all the tests for all the exercises at once, which should be trivially possible. I don't want a different ${exercisename}-testgen.factor for each different JSON structure.
As it is right now, I could theoretically write code to map x-common's JSON keys to my own internal structure, but this requires a duplication across programs that read this data. Also, it's not scalable, and as such it would be genuinely beneficial to everyone to standardise the keys and their meanings.
I am personally willing to manually rewrite all the JSON in this repository to fit a predictable format, but I won't until we have a consensus.
I'd fully support a more generic structure which would make it unnecessary to have a generator for each exercise.
But I have to admit, I have no idea what it could look like. Since you already said you would change them, do you have an idea about the structure already, @catb0t?
Also since it seems to be the right time, I want to request a feature for this generic format:
I had a sleepless night thinking about how I should handle changes in the canonical data, as I wanted to have some versioning test. First I thought I could just use the date of the last change, but this would mean that, because of whitespace changes, all earlier submissions would get "invalidated". Therefore I think it would be a good idea to version the canonical data as well.
I'm thinking something like:
- For exercises with one input translating to one output: `description`, `input` and `output`.
- For exercises with multiple inputs / multiple outputs: `description`, `input_N`, `output_N`.

Note that it would be disadvantageous to use an array for multiple inputs / outputs where an array is not part of the exercise, because it would be hard or impossible to tell the difference between multiple inputs and an actual array. We could have keys like `input_multi`, which is an array of inputs, I suppose?
For exercises with multiple inputs / multiple outputs, `description`, `input_N`, `output_N`.

[ ... ] Can we simultaneously make it easy for a human to read as well? [ ... ] in e.g. all-your-base's JSON, [ ... ] many tracks will pass in three inputs: `input_base`, `input_digits`, `output_base`, and then check that the output digits are as specified in `output_digits`. If the data then simply looked like `"input_1": 2, "input_2": [1], "input_3": 10, "output": [1]`, I think it might not be clear what the difference between `input_1` and `input_3` is to a human, and I consider this important for being able to understand PRs that propose to change the test cases.
@petertseng makes a good point that `input_N`, etc., might harm readability, especially since there are no comments in JSON, and I'm not really sure what to do about that.
I don't have a firm idea of what keys would fix Peter's point, which is a reason I haven't started rewriting it all myself yet.
Using descriptive English names makes it hard to access them programmatically, but using numbered keys makes it hard for people (not me, but other maintainers) to read. What strikes a balance?
This might be a little bit wild, so bear with me: what if we add a top-level key `metadata`, and it has this structure:
"cases": { "cases data..." }
"metadata": {
"input_keys": [ "input_key1", "input_key2", "input_key3" ],
"output_keys": [ "output_keyN" ]
}
That moves the mapping of human-readable keys from each track's generation code to the JSON itself. Then autogeneration code can read `metadata` to get the list of keys that are used in this `cases` structure.
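To make the idea concrete, here is a rough sketch (in Python, with a made-up all-your-base-style fragment) of how a generic generator could use `metadata` to pull ordered inputs and outputs out of a case without knowing anything about the exercise. The helper name `extract` is invented for the example:

```python
import json

# Hypothetical fragment in the proposed shape; the exercise data is made up.
raw = """
{
  "metadata": {
    "input_keys": ["input_base", "input_digits", "output_base"],
    "output_keys": ["output_digits"]
  },
  "cases": [
    {
      "description": "single bit to one decimal digit",
      "input_base": 2,
      "input_digits": [1],
      "output_base": 10,
      "output_digits": [1]
    }
  ]
}
"""

def extract(case, metadata):
    """Split one case into ordered inputs and outputs, driven only by metadata."""
    inputs = [case[key] for key in metadata["input_keys"]]
    outputs = [case[key] for key in metadata["output_keys"]]
    return case["description"], inputs, outputs

data = json.loads(raw)
description, inputs, outputs = extract(data["cases"][0], data["metadata"])
```

The generator never hard-codes `input_base` or `output_digits`; only the JSON file knows them.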
[ ... ] How should I handle changes in the canonical data, as I wanted to have some versioning test? [ ... ] I could just use the date of the last change, but this would mean that, because of whitespace changes, all earlier submissions would get "invalidated". Therefore I think it would be a good idea to version the canonical data as well.
We could perhaps end up with:
"#": "..."
"cases": { "cases data..." }
"metadata": { "..." }
"version": {
"version_hash": "shasum of minified version of this file",
"version_time": "seconds since 1 Jan 1970 here"
}
And you can read the `version` key. Or perhaps I'm misunderstanding your point.
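For illustration, a minimal sketch of the "shasum of the minified version" idea; the function name and the normalization choices (SHA-1, sorted keys) are just assumptions:

```python
import hashlib
import json

def version_hash(document: dict) -> str:
    """Hash a canonical, minified serialization, so whitespace-only and
    key-order-only edits to the file don't change the version."""
    minified = json.dumps(document, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(minified.encode("utf-8")).hexdigest()

# Two formattings of the same data produce the same hash:
compact = json.loads('{"cases":[{"input":"hi","expected":"Hello"}]}')
pretty = json.loads('{\n  "cases": [\n    { "expected": "Hello", "input": "hi" }\n  ]\n}')
same = version_hash(compact) == version_hash(pretty)
```

This addresses the "invalidated by whitespace changes" worry: only changes to the data itself would bump the hash.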
I do not understand the `input_N` stuff, but something came to mind.
{
"exercise": "repeat",
"examples": [
{
"function": "repeat",
"description": "tests valid stuff",
"input_count": 5,
"input_string": "foo",
"expected": "foofoofoofoofoo"
},
{
"function": "repeat",
"description": "tests failure",
"input_count": -5,
"input_string": "foo",
"expected": { "error": "no negatives allowed" }
}
]
}
Perhaps we can use this as a base, or throw it away instantly?
@NobbZ:
{
"function": "repeat",
"description": "tests failure",
"input_count": -5,
"input_string": "foo",
"expected": { "error": "no negatives allowed" }
}
and what ensures the order of the args? There's no metadata in place to declare argument names.
@catb0t:
My goal with exercism.autogen-exercises is to generate all the tests for all the exercises at once which should be trivially possible [emphasis mine]. I don't want a different ${exercisename}-testgen.factor for each different JSON structure.
I don't. I think you can get a good start on it for most languages, but that idea doesn't take into consideration language call semantic differences (factor/forth vs assembly vs algol-based languages vs keyword arguments (smalltalk, ruby) as an example). Nor is it realistic about the level of finality. I think you can easily generate a rough draft for every exercise for a language, but it still needs to be reviewed, finalized, and styled by a human to be a good example to learn from.
I do not see any sense in specifying the order of arguments in the canonical test data. There are different idioms and necessities in the various tracks.
Let's assume we have some data type and we write functions around it. Let's call it list. In object-oriented languages it will be the object we call a method on, so it will be completely out of the order of arguments. In Elixir we like to have this object-like argument in the first position to be able to pipe it around, while in Haskell it is preferred to have it last to be able to use point-free style and partial application.
So as you can see, the order of arguments has to be specified by the track's maintainer anyway.
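A sketch of what that could look like in practice: the canonical data only names the inputs, and each track keeps its own argument-order table. `ARG_ORDER` and the helper are hypothetical, not part of any proposal here:

```python
# Hypothetical per-track configuration: the canonical data names the inputs,
# and each track maintainer declares the order they want.
ARG_ORDER = {
    # Elixir-style: the "subject" argument first, for piping.
    "elixir": {"repeat": ["input_string", "input_count"]},
    # Haskell-style: the "subject" argument last, for partial application.
    "haskell": {"repeat": ["input_count", "input_string"]},
}

def ordered_args(track, function, case):
    """Pull the named inputs out of a case in the track's preferred order."""
    return [case[name] for name in ARG_ORDER[track][function]]

case = {"input_count": 5, "input_string": "foo", "expected": "foofoofoofoofoo"}
```

So the shared JSON stays order-free, and the ordering decision lives where it belongs, in the track.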
Ryan Davis notifications@github.com schrieb am Mi., 21. Sep. 2016 23:47:
{ "function": "repeat", "description": "tests failure", "input_count": -5, "input_string": "foo", "expected": { "error": "no negatives allowed" } }
and what ensures the order of the args? There's no metadata in place to declare argument names.
Maybe I'm a little late and out of topic, but I'll try anyway...
I know that it makes sense in some languages to think about automatically generating tests, but I believe that this is not a goal shared between all tracks.
I think it is impossible, in the general case, to auto-magically generate the test suite, unless we collapse all the types into the ones representable in JSON. I know that, at least in Haskell, that would be bad and wrong! :smile:
That said, it is certainly possible to have a generator to automatically update a specific exercise, if the JSON structure is not changed.
Is it worth it?
That depends on how frequently the data and the structure are updated, but mostly on how fun the process of writing and maintaining it is. So I think it is not unreasonable. :+1:
Alternatively - if the desire is really to have auto-magic test suites - it would be more compatible if the exercises were specified as stdin-stdout mappings. That would be similar to how online judge systems work, but I don't think it is exercism's destiny to follow that path.
Considering that it is generally impossible to automatically generate test suites, I think it doesn't make sense to sacrifice human-readability too much, forging a JSON that is convenient for software but inconvenient for humans.
That doesn't mean we shouldn't standardize the files. We should, but remembering that the files are meant to be read first by humans, and then by software.
Maybe I'm the only one that doesn't get what is going on here, but I think that, until it is clear what our goal here is, we should avoid getting into the details of the specification.
Edit: Ok, I think I got it! :smile:
What about something like this:
{
"exercise": "cipher",
"version": "0.1.0 or an object with more detailed information",
"comments": [
"Anything you can think of",
"as a list of strings"
],
"tests": [
{
"name": "encode",
"description": "Encodes plaintext",
"cases": [
{
"description": "Encodes simple text",
"plaintext": "Secret message",
"key": "asdf1234",
"expected": "qwertygh"
},
{
"description": "Encodes empty string",
"plaintext": "",
"key": "test1234",
"expected": ""
}
]
},
{
"name": "decode",
"description": "Decodes plaintext",
"cases": [
{
"description": "Decodes simple text",
"ciphertext": "qwertygh",
"key": "asdf1234",
"expected": "Secret message"
},
{
"description": "Decodes empty string",
"ciphertext": "",
"key": "test1234",
"expected": ""
}
]
}
]
}
Every test case has a `description` and an `expected` value. The descriptions could be mandatory or optional.
It would be possible to use multilevel grouping of tests, but I don't think that is used frequently.
Keeping the `description`, the inputs, and the `expected` output together, we have a structure that is more human-friendly, but not so convenient for processing.
@zenspider and @catb0t, would it be too difficult to separate `description` and `expected` from the other keys? Would it be reasonable for you to use an implicit alphabetic ordering for the remaining keys, instead of adding metadata?
I've been thinking about this a bit recently, and I think the most generalized version of this we can get might be the best for as many different needs as possible. What we're really doing in most of these exercises is basically testing functions. There's input, and there's output. By trying to use keys in our JSON objects that are things like "plaintext" and "key", that's creating a need for knowledge about the exercise to accurately understand how those parts interact.
I think if we can generalize on that concept of a function that we're testing, that might be helpful both for human readability, and also for machine readability so we can possibly use this data for automatic tests.
So, here's my example:
{
"exercise": "cipher",
"version": "0.1.0 or an object with more detailed information",
"comments": [
"Anything you can think of",
"as a list of strings"
],
"tests": [
{
"description": "encodes simple text",
"function": "encode",
"input": ["Secret message", "asdf1234"],
"output": "qwertygh"
},
{
"description": "encodes empty string",
"function": "encode",
"input": ["", "test1234"],
"output": ""
},
{
"description": "decodes simple string",
"function": "decode",
"input": ["qwertygh", "asdf1234"],
"output": "Secret message"
}
]
}
I don't think there are any exercises that require anything other than input and output, but I haven't done too deep of an analysis on that. I'd love any feedback if there are edge cases that would need to be taken care of here. I know that based on the structure above I can think of reasonable ways to parse that and automatically create some skeletons for tests in Ruby, Elixir, Go, JavaScript and Python, but that's really all I can reasonably speak to since those are the only languages I have a decent amount of experience with.
Also, I sort of like the stripped-down way of looking at this - when I look at that data I don't need to know the context of the exercise to know what's going on. I just know there's a thing called `encode`, and that takes some input and returns some output, and there's a text description of what's going on.
I'm not really 100% sure that this would give us everything we want, but I wanted to at least throw this idea out there to get feedback and see if it might be a starting point for an actually good idea!
What we're really doing in most of these exercises is basically testing functions. There's input, and there's output.
I think that the general case would be to test assertions...
"name": "reversibility",
"description": "Decoding a text encoded with the same key should give the original plaintext",
"cases": [
{
"description": "Only letters",
"plaintext": "ThisIsASecretMessage",
"key": "test1234",
},
... that can be general - like properties, in QuickCheck - or specific, like our common tests.
But I agree that most - if not all - tests are in the form: `function inputs == output`.
Also, I sort of like the stripped down way of looking at this - when I look at that data I don't need to know the context of the exercise to know what's going on.
This is probably where I disagree...
Maybe we don't need to know the context, but sometimes we want to.
The ability to group tests is so pervasive that I cannot find a single test framework in Haskell that doesn't allow it:
I just know there's a thing called encode, and that takes some input and returns some output, and there's a text description of what's going on.
Exactly! Substituting the keys by a list of arguments, the only thing we know is that there is something that takes inputs and gives an output. We don't know the meaning of those things anymore!
I understand that your proposal makes automatic generation of tests easier while keeping reasonable readability, @devonestes, but that still comes at a price!
Seems to me that the question that we have to answer is:
@rbasso I see your points, and I actually think we can get a little more of the benefit that you mention. How about something like this:
{
"exercise": "cipher",
"version": "0.1.0 or an object with more detailed information",
"comments": [
"Anything you can think of",
"as a list of strings"
],
"tests": [
{
"description": "encodes simple text",
"function": "encode",
"input": {
"plaintext": "Secret message",
"key": "asdf1234"
},
"output": "qwertygh"
}
]
}
For the interest of programmatically generating tests, we know what our inputs are (and we can easily ignore the human-specific context in the keys in that object and just look at the values), but for the purpose of assigning some meaning to this data, we can give some context-specific information by adding those keys to the `input` object.
I think with the above structure we still don't need to understand the context to figure out what's going on, but if we want context it's there for us. I actually think this is a much better version than the original one!
I guess if I were to generalize the structure of a `test` object in that JSON, it would be this:
{
"description": "description of what is being tested in this test",
"function": "name of function (or method) being tested",
"input": {
"description of input": "actual input (can be string, int, bool, hash/map, array/list, whatevs)"
},
"output": "output of function being tested with above inputs"
}
So, I actually kind of like that. What does everyone else think?
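One nice property of this shape: JSON objects keep their file order when parsed with Python's `json.loads`, so a generator can read the `input` values positionally while a human still sees the meaningful key names. A small sketch (the data is the example above):

```python
import json

raw = """
{
  "description": "encodes simple text",
  "function": "encode",
  "input": {
    "plaintext": "Secret message",
    "key": "asdf1234"
  },
  "output": "qwertygh"
}
"""

test = json.loads(raw)
# Treat the values positionally for tracks with positional arguments...
positional_args = list(test["input"].values())
# ...or keep the names for tracks with keyword arguments.
named_args = test["input"]
```

Whether relying on object key order is acceptable is, of course, part of the argument-order debate above.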
I especially like the idea of adding the `version` and the `function` keys. I'm currently working on adding test data versioning (which Ruby and Go already have) and test generation to the Python track, so it would be great if we could agree on a standard format.
The reason I stopped commenting despite the fact that I'm the one who re-kindled this thread is that these replies really disheartened me:
I think you can get a good start on it for most languages, but that idea doesn't take into consideration language call semantic differences (factor/forth vs assembly vs algol-based languages vs keyword arguments (smalltalk, ruby) as an example). ... I think you can easily generate a rough draft for every exercise for a language, but it still needs to be reviewed, finalized, and styled by a human to be a good example to learn from.
...I know that it makes sense in some languages to think about automatically generating tests, but I believe that this is not a goal shared between all tracks. I think it is impossible, in the general case, to auto-magically generate the test suite...
Then what is the goal of this discussion about JSON format at all, if you're not interested in programmatically processing the JSON data to generate the unit tests?
Moreover, I don't see why language-specific differences matter here -- my point was that totally disregarding ALGOL syntax and Ruby keyword arguments and Haskell data types, if everything is just a string you can write a generator to write out tests files (and example files too), and since there are already exercise-specific test generators, why not save yourselves the work and write a generic one with better-designed data? (Yes, you should still read and comment the output of the generator for good measure.)
I'm sorry you found my comments disheartening. I just think that your notion: "to generate all the tests for all the exercises at once which should be trivially possible" ignores the fact that you're mechanically generating tests for consumption across a bunch of languages with widely different styles and semantics.
That is going to wind up with "least common denominator" tests. All I was suggesting is that mechanically generated tests will be a good rough draft, but that they should be worked on by humans so that they are good pedagogical examples for each language. To skip out on that is to kinda miss the point of exercism in the first place.
For example, I have found a world of difference in the quality of tests and their ability to help teach me the language and assist me in understanding in rust's tests. Some of them are night and day in difference, and the worst ones were the ones that did a bare minimum "least common denominator" approach.
I'm the author of one of the disheartening comments, @catb0t, so I think I owe some explanations.
First of all, I believe that it is good to standardize the structure of a JSON. I just disagree a little in the goals.
Then what is the goal of this discussion about JSON format at all, if you're not interested in programmatically processing the JSON data to generate the unit tests?
I believe that the JSON data has two complementary goals:
I still disagree about oversimplifying the format to make it easy to automatically generate the tests. This may be extremely valuable in an online judge, because it needs to automatically generate identical tests for a bunch of languages, but it would probably make the exercises less interesting in some languages, as @zenspider already said.
Moreover, I don't see why language-specific differences matter here -- my point was that totally disregarding ALGOL syntax and Ruby keyword arguments and Haskell data types, if everything is just a string you can write a generator to write out tests files (and example files too)...
You are right, if everything is just strings!
But I'm not sure if people here like the idea of having all the exercises as stdin-stdout filters.
Ok, it seems to me like we've all sort of agreed (in our own ways) that this is a rather difficult problem to solve - so how about we try to make this into a couple smaller problems and tackle them individually? 😉
From what I see, we have two distinct goals we're trying to achieve here:
1) Consistency in format allows for easier human readability of the files, which means an easier time understanding and maintaining them.
2) It's possible that if things are consistent enough and we come up with a good enough abstraction, we could programmatically generate the beginnings of test files for some types of language tracks.
Both are indeed noble goals with clear value, and I totally think we should strive to achieve them both - just maybe not at the same time?
Since goal number 2 is clearly really hard, how about we try and get something that's at least solving goal number 1, and then once that's done we can try and refine it further to accomplish goal number 2? I think limiting the scope of what we're trying to accomplish (with an eye towards the future of course) will be really helpful in actually getting something shipped here.
It seems that this issue has been dead for a while...
Let's try to push a little further the idea proposed by @devonestes!
Since goal number 2 is clearly really hard, how about we try and get something that's at least solving goal number 1, and then once that's done we can try and refine it further to accomplish goal number 2? I think limiting the scope of what we're trying to accomplish (with an eye towards the future of course) will be really helpful in actually getting something shipped here.
I have been playing with the JSON files this week and I have some ideas on how we can extract most of the current test structure without sacrificing readability or enforcing too much.
This will be a really long post, so grab your coffee mug and try not to sleep because I need some feedback here! 😄
Some test suites have tests grouped with labels:
acronym
{
"abbreviate":{
"description":"Abbreviate a phrase",
"cases":[
{
"description":"basic",
"phrase":"Portable Network Graphics",
"expected":"PNG"
}
]
}
}
Grouping tests adds readability to both the JSON file and the generated tests, so I believe that we should keep this feature somehow.
In the example above, the custom name `abbreviate` was used to group and also identify the type of the tests to be performed. This is an easy solution but is also a little too restrictive. It would be useful to group distinct types of tests:
{
"group":{
"description":"Qwerty",
"cases":[
{
"encode":{
"description":"Qwerty encoding",
"plaintext":"Sample plaintext",
"ciphertext":"adsdfsjqwreiugi"
}
},
{
"decode":{
"description":"Qwerty decoding",
"ciphertext":"adsdfsjqwreiugi",
"plaintext":"sampleplaintext"
}
}
]
}
}
We could also have encoded the test types in other ways, but what matters here is that, by moving the test-type specification near the test data, we gained the ability to create heterogeneous test groups!
Decoupling the grouping logic from the test types, we could even nest test groups with varying depths:
{
"group":{
"description":"mathematics",
"tests":[
{
"group":{
"description":"basic math",
"tests":[
{
"addition":{
"description":"simple addition",
"left":1,
"right":2,
"expected":3
}
},
{
"subtraction":{
"description":"simple subtraction",
"left":3,
"right":2,
"expected":1
}
}
]
}
},
{
"division":{
"description":"awesome division by zero",
"left":1,
"right":0,
"expected":"Only Chuck Norris can divide by zero!"
}
}
]
}
}
That may seem unneeded and a little too complex, but it comes almost for free! Also, it is good to have some flexibility for the more complex test suites we may want to create.
A generator could simply ignore all the test grouping and just recursively scan for the tests - flattening the structure - or it could use the grouping information to construct a completely labeled test tree, if the test framework allows it.
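A sketch of that recursive scan in Python, using the nested structure above; the flattened triple (group path, test type, body) is just one possible shape for the result:

```python
def flatten(node, path=()):
    """Recursively collect (group path, test type, test body) triples from
    the nested group structure; groups may nest to any depth."""
    if "group" in node:
        group = node["group"]
        new_path = path + (group["description"],)
        for child in group["tests"]:
            yield from flatten(child, new_path)
    else:
        # Any non-group node is a test; its single key is the test type.
        (test_type, body), = node.items()
        yield path, test_type, body

tree = {
    "group": {
        "description": "mathematics",
        "tests": [
            {"group": {
                "description": "basic math",
                "tests": [
                    {"addition": {"description": "simple addition",
                                  "left": 1, "right": 2, "expected": 3}},
                ],
            }},
            {"division": {"description": "awesome division by zero",
                          "left": 1, "right": 0,
                          "expected": "Only Chuck Norris can divide by zero!"}},
        ],
    }
}

flat = list(flatten(tree))
```

A generator for a flat test framework uses `flat` directly; one for a framework with groups uses the path tuples to rebuild the tree.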
The challenge here is to enforce a minimal structure on all tests, without losing any readability or flexibility.
Previous discussions indicate that there is no consensus about encoding input and output, so we should avoid discussing that now and focus on things that will not start a language war.
To allow easy, semi-automatic generation of tests, I think it would be convenient to have at least the following information about a test:
- `description` - With it, the test generators have a textual description to display in case of success/failure. Also, it allows users and maintainers to refer to a specific test case in a language-independent way. Tests without descriptions would leave the users in a situation where they cannot easily identify where they failed, so it makes sense to enforce their presence.
- `type` - At least implicitly, any test case has a type that identifies a property being tested, most of the time the name of a test function. What matters here is that we need a unique identifier for each kind of test in a test suite, so that we don't end up in a situation where it is impossible to automatically identify the type of each test case.

{
"group":{
"description":"Qwerty",
"cases":[
{
"test":{
"description":"Qwerty encoding",
"plaintext":"Sample plaintext",
"ciphertext":"adsdfsjqwreiugi"
}
},
{
"test":{
"description":"Qwerty decoding",
"ciphertext":"adsdfsjqwreiugi",
"plaintext":"sampleplaintext"
}
}
]
}
}
I see ~~two~~ three options to signal the test type:
This is readable and easy enough to parse, but it doesn't expose the fact that all the test cases have a description.
{
"decode":{
"description":"Qwerty decoding",
"ciphertext":"adsdfsjqwreiugi",
"plaintext":"sampleplaintext"
}
}
`test` key option
This captures more structure but is not so nice to the eyes.
{
"test":{
"description":"Qwerty decoding",
"decode": {
"ciphertext":"adsdfsjqwreiugi",
"plaintext":"sampleplaintext"
}
}
}
Edit: Key-value pair option
This is a little less readable than the first option, but may be interesting for parsing.
{
"test":{
"type":"decode",
"description":"Qwerty decoding",
"ciphertext":"adsdfsjqwreiugi",
"plaintext":"sampleplaintext"
}
}
~~The first option is more pleasant to the eyes and is similar to what we already use, so it makes sense to stick with it unless we find a reason to avoid it.~~
It would be nice to have some arguments in favor of or against each of these three alternatives.
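One small observation in their favor: all three shapes can be normalized to the same (type, description, data) triple with very little code, so the choice is mostly about readability. A purely illustrative Python sketch:

```python
def normalize(case):
    """Normalize a test case from any of the three proposed shapes into
    (type, description, data); a rough sketch for comparison, not a spec."""
    (key, body), = case.items()
    if key != "test":
        # Custom key option: {"decode": {"description": ..., fields...}}
        data = dict(body)
        return key, data.pop("description"), data
    if "type" in body:
        # Key-value pair option: {"test": {"type": "decode", "description": ..., fields...}}
        data = dict(body)
        return data.pop("type"), data.pop("description"), data
    # Generic `test` key option: {"test": {"description": ..., "decode": {fields...}}}
    data = dict(body)
    description = data.pop("description")
    (test_type, fields), = data.items()
    return test_type, description, fields

a = {"decode": {"description": "Qwerty decoding",
                "ciphertext": "adsdfsjqwreiugi", "plaintext": "sampleplaintext"}}
b = {"test": {"description": "Qwerty decoding",
              "decode": {"ciphertext": "adsdfsjqwreiugi",
                         "plaintext": "sampleplaintext"}}}
c = {"test": {"type": "decode", "description": "Qwerty decoding",
              "ciphertext": "adsdfsjqwreiugi", "plaintext": "sampleplaintext"}}
```

All three inputs normalize to the same triple, so parsers are not a strong reason to prefer one over another.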
I'm still trying to write a schema to allow automatic validation of the `canonical-data.json` files, but I decided that it was already time to discuss the idea publicly, so that we could improve it together.
Edit: Remember about `exercise`, `version` and `comments`.
Following these ideas, I rewrote `exercises/bob/canonical-data.json` to test the concept in a simple case:
{
"group":{
"description":"bob",
"tests":[
{
"response":{
"description":"stating something",
"input":"Tom-ay-to, tom-aaaah-to.",
"expected":"Whatever."
}
},
{
"response":{
"description":"shouting",
"input":"WATCH OUT!",
"expected":"Whoa, chill out!"
}
},
{
"response":{
"description":"shouting gibberish",
"input":"FCECDFCAAB",
"expected":"Whoa, chill out!"
}
},
{
"response":{
"description":"asking a question",
"input":"Does this cryogenic chamber make me look fat?",
"expected":"Sure."
}
},
{
"response":{
"description":"asking a numeric question",
"input":"You are, what, like 15?",
"expected":"Sure."
}
},
{
"response":{
"description":"asking gibberish",
"input":"fffbbcbeab?",
"expected":"Sure."
}
},
{
"response":{
"description":"talking forcefully",
"input":"Let's go make out behind the gym!",
"expected":"Whatever."
}
},
{
"response":{
"description":"using acronyms in regular speech",
"input":"It's OK if you don't want to go to the DMV.",
"expected":"Whatever."
}
},
{
"response":{
"description":"forceful question",
"input":"WHAT THE HELL WERE YOU THINKING?",
"expected":"Whoa, chill out!"
}
},
{
"response":{
"description":"shouting numbers",
"input":"1, 2, 3 GO!",
"expected":"Whoa, chill out!"
}
},
{
"response":{
"description":"only numbers",
"input":"1, 2, 3",
"expected":"Whatever."
}
},
{
"response":{
"description":"question with only numbers",
"input":"4?",
"expected":"Sure."
}
},
{
"response":{
"description":"shouting with special characters",
"input":"ZOMG THE %^*@#$(*^ ZOMBIES ARE COMING!!11!!1!",
"expected":"Whoa, chill out!"
}
},
{
"response":{
"description":"shouting with no exclamation mark",
"input":"I HATE YOU",
"expected":"Whoa, chill out!"
}
},
{
"response":{
"description":"statement containing question mark",
"input":"Ending with ? means a question.",
"expected":"Whatever."
}
},
{
"response":{
"description":"non-letters with question",
"input":":) ?",
"expected":"Sure."
}
},
{
"response":{
"description":"prattling on",
"input":"Wait! Hang on. Are you going to be OK?",
"expected":"Sure."
}
},
{
"response":{
"description":"silence",
"input":"",
"expected":"Fine. Be that way!"
}
},
{
"response":{
"description":"prolonged silence",
"input":" ",
"expected":"Fine. Be that way!"
}
},
{
"response":{
"description":"alternate silence",
"input":"\t\t\t\t\t\t\t\t\t\t",
"expected":"Fine. Be that way!"
}
},
{
"response":{
"description":"multiple line question",
"input":"\nDoes this cryogenic chamber make me look fat?\nno",
"expected":"Whatever."
}
},
{
"response":{
"description":"starting with whitespace",
"input":" hmmmmmmm...",
"expected":"Whatever."
}
},
{
"response":{
"description":"ending with whitespace",
"input":"Okay if like my spacebar quite a bit? ",
"expected":"Sure."
}
},
{
"response":{
"description":"other whitespace",
"input":"\n\r \t",
"expected":"Fine. Be that way!"
}
},
{
"response":{
"description":"non-question ending with whitespace",
"input":"This is a statement ending with whitespace ",
"expected":"Whatever."
}
}
]
}
}
To check how hard it could be to parse the file, I rewrote the test suite to run the tests directly from the JSON file.
{-# LANGUAGE OverloadedStrings #-}
-- Basic imports
import Control.Applicative ((<|>), liftA2)
import Control.Monad ((>=>))
-- To construct the tests.
import Test.Hspec (Spec, describe, it)
import Test.Hspec.Runner (configFastFail, defaultConfig, hspecWith)
import Test.HUnit (assertEqual)
-- To parse the JSON file.
import Data.Aeson ((.:), eitherDecodeStrict', withArray, withObject)
import Data.Aeson.Types (Parser, Value, parseEither)
import GHC.Exts (toList)
-- To read the JSON file.
import Data.ByteString (readFile)
import Prelude hiding (readFile)
-- The module to be tested.
import Bob (responseFor)
-- Read, decode and run the tests.
main :: IO ()
main = readJSON >>= parseOrError parseJSON >>= runTests
where
readJSON = readFile "test/canonical-data.json"
parseOrError p = either error pure . p
parseJSON = eitherDecodeStrict' >=> parseEither (parseTests parsers)
runTests = hspecWith defaultConfig {configFastFail = True}
-- List of exercise-specific parsers
parsers = [ parseResponse ]
-- | Exercise-specific parser for "response" tests.
parseResponse :: Value -> Parser Spec
parseResponse = withObject "response" $ \o -> do
test <- o .: "response"
description <- test .: "description"
input <- test .: "input"
expected <- test .: "expected"
return $ it description $
assertEqual ("responseFor " ++ show input)
expected
(responseFor input)
-- | Exercise-independent JSON parser.
parseTests :: [Value -> Parser Spec] -> Value -> Parser Spec
parseTests ps = foldr (liftA2 (<|>)) mempty (parseGroup : ps)
where
parseGroup = withObject "group" $ \o -> do
group <- o .: "group"
description <- group .: "description"
tests <- group .: "tests"
specs <- withArray "tests" (traverse (parseTests ps) . toList) tests
return . describe description . sequence_ $ specs
This is still experimental code, so don't take it seriously, but note that only 12 lines of code are exercise-specific. All the other lines are exercise-independent!
I avoided any tricks to make this easier in Haskell, so the parsing is verbose and feels a little clumsy. Changing the JSON file would make parsing way easier, but that would favor the Haskell track to the detriment of other languages and human-readability.
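For comparison, the same generic-walker idea can be sketched outside Haskell. This is a hypothetical Python version (all names here are mine, not an agreed spec): only the handler registry is exercise-specific, and the walker over nested groups is exercise-independent.

```python
# Hypothetical sketch of a generic walker over the proposed "group" structure.
# Only HANDLERS is exercise-specific; run() is exercise-independent.

def response_for(text):
    # Stand-in for the solution under test.
    return "Whatever."

def run_response(test, path):
    # Exercise-specific handler for the "response" test type.
    expected = test["expected"]
    actual = response_for(test["input"])
    status = "ok" if actual == expected else "FAIL"
    return f"{status}: {' / '.join(path + [test['description']])}"

HANDLERS = {"response": run_response}  # the exercise-specific part

def run(node, path=()):
    path = list(path)
    if "group" in node:  # a (possibly labeled) group of tests
        desc = node.get("description", "")
        results = []
        for child in node["group"]:
            results.extend(run(child, path + [desc] if desc else path))
        return results
    for test_type, handler in HANDLERS.items():  # a single test
        if test_type in node:
            return [handler(node[test_type], path)]
    raise ValueError(f"unknown test item: {node}")

data = {
    "group": [
        {"response": {"description": "stating something",
                      "input": "Tom-ay-to, tom-aaaah-to.",
                      "expected": "Whatever."}}
    ]
}
for line in run(data):
    print(line)  # ok: stating something
```

The point is the same as in the Haskell version: with a predictable structure, adding an exercise means adding one handler, not one parser.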
Well, this is all I got for now...
I think that, if we decide to follow this path, in the short term we can expect to validate the canonical-data.json files in Travis-CI.
I deliberately avoided specifying inputs and outputs from the tests for a few reasons.
Anyone think it is a useful endeavor to standardize just that for now?
@rbasso Of course this is a useful endeavor! :) I am also working on some test generator for Scala. So let me just add my two cents:
- One parser for all canonical-data.json files.
- One generator that takes the parse result and generates the exercise's test suite.
Now the next question could be: Must all of this be 100% language-specific, or how much can be shared and how?
As you can see in the discussion of my PR, it seems preferable for some to have the test suite in a separate file instead of immediately using the parse results like you did.
I agree! I just used the tests as a parsing example, to see if the format would be too inconvenient.
Must all of this be 100% language-specific, or how much can be shared and how?
Tell me if you find out the answer. 😄
What other test types might there be?
Examples of test types that are not a single function would be the following properties:
I agree that in most cases we are testing the return value of a function implemented by the user, but it would be nice to use a more general name.
I'll try to rewrite with the following changes:
- tests -> group
- testType -> test
P.S.: I deleted that post because I was rewriting it with major modifications. Sorry.
I hope it is better now!
{
"exercise":"bob",
"version":"1.0.0",
"comments":[
"I am a comment"
],
"group":[
{
"description":"foo",
"group":[
{
"test":"response",
"description":"stating something",
"input":"Tom-ay-to, tom-aaaah-to.",
"expected":"Whatever."
},
{
"test":"response",
"description":"stating the same thing again",
"input":"Tom-ay-to, tom-aaaah-to.",
"expected":"Whatever."
}
]
},
{
"description":"bar",
"group":[
{
"test":"response",
"description":"shouting",
"input":"WATCH OUT!",
"expected":"Whoa, chill out!"
}
]
}
]
}
And here is my first JSON Schema. If anyone has any experience with it, I would love suggestions on how to improve it.
{
"$schema":"http://json-schema.org/draft-04/schema#",
"$ref":"#/definitions/top",
"definitions":{
"comments":{
"type":"array",
"items":{
"type":"string"
},
"minItems":1
},
"description":{
"type":"string"
},
"exercise":{
"type":"string"
},
"group":{
"type":"array",
"items":{
"$ref":"#/definitions/testOrLabeledGroup"
},
"minItems":1
},
"labeledGroup":{
"type":"object",
"required":[
"description",
"group"
],
"properties":{
"description":{
"$ref":"#/definitions/description"
},
"group":{
"$ref":"#/definitions/group"
}
},
"additionalProperties":false
},
"test":{
"type":"object",
"required":[
"test",
"description"
],
"properties":{
"test":{
"$ref":"#/definitions/testType"
},
"description":{
"$ref":"#/definitions/description"
}
}
},
"testOrLabeledGroup":{
"oneOf":[
{
"$ref":"#/definitions/test"
},
{
"$ref":"#/definitions/labeledGroup"
}
]
},
"testType":{
"type":"string"
},
"top":{
"type":"object",
"required":[
"exercise",
"version",
"group"
],
"additionalProperties":false,
"properties":{
"exercise":{
"$ref":"#/definitions/exercise"
},
"version":{
"$ref":"#/definitions/version"
},
"comments":{
"$ref":"#/definitions/comments"
},
"group":{
"$ref":"#/definitions/group"
}
}
},
"version":{
"type":"string"
}
}
}
Finally, after fighting the JSON Schema language for a while, I think I got a proposal that can serve as a starting schema for discussion. I expect it to be:
Here is a sample test file:
{
"exercise":"foobar",
"version":"0.1.0",
"comments":[
"We are",
"comments!"
],
"group":[
{
"foo":{
"description":"foo the void",
"input":"",
"expected":"foo"
}
},
{
"bar":{
"description":"bar the void",
"input":"",
"expected":"bar"
}
},
{
"description":"snafu",
"group":[
{
"foobar":{
"description":"foo and bar",
"input":"...wait for it...",
"expected":"foo...wait for it...bar"
}
}
]
}
]
}
And here is the JSON Schema, formatted in a very unusual way for easier understanding (at least for me):
{
"$schema": "http://json-schema.org/draft-04/schema#",
"$ref" : "#/definitions/canonicalData",
"definitions":{
"canonicalData":
{ "type" : "object"
, "required" : ["exercise" , "version" , "group"]
, "properties":
{ "exercise": { "$ref": "#/definitions/exercise" }
, "version" : { "$ref": "#/definitions/version" }
, "comments": { "$ref": "#/definitions/comments" }
, "group" : { "$ref": "#/definitions/group" }
}
, "additionalProperties": false
},
"exercise": { "type": "string" },
"version" : { "type": "string" },
"comments":
{ "type" : "array"
, "items" : { "type": "string" }
, "minItems": 1
},
"group":
{ "type" : "array"
, "items" : { "$ref": "#/definitions/testItem" }
, "minItems": 1
},
"testItem":
{ "oneOf":
[ { "$ref": "#/definitions/singleTest" }
, { "$ref": "#/definitions/labeledGroup" }
]
},
"singleTest":
{ "type" : "object"
, "minProperties" : 1
, "maxProperties" : 1
, "additionalProperties" : { "$ref": "#/definitions/testData" }
},
"testData":
{ "type" : "object"
, "required" : ["description"]
, "properties":
{ "description": { "$ref": "#/definitions/description" }
}
},
"description": { "type":"string" },
"labeledGroup":
{ "type" : "object"
, "required" : ["description", "group"]
, "properties":
{ "description": { "$ref": "#/definitions/description" }
, "group" : { "$ref": "#/definitions/group" }
}
, "additionalProperties": false
}
}
}
I know this is far from perfect, and some people were expecting a more rigid test schema to allow fully automated test suite generation. But I believe this is better than nothing.
Also, it is ready to use and seems to work as expected in my preliminary tests:
Here is a foobar test run:
foobar-0.1.0
foo the void
bar the void
snafu
foo and bar
Finished in 0.0001 seconds
3 examples, 0 failures
Does anyone have anything to say about it?
Edit: There is also a ported bowling/canonical-data.json here as an example.
Hello, in case anyone wonders why description is the only required key... In particular, if anyone wonders why expected is not a required key: expected might not work so well with some exercises.
I give you:
About the schema:
Consider a JSON file following this schema. How easy is it for a parser to determine the difference between a singleTest and a labeledGroup? Given an object appearing in a group array, how will I be able to know which of the two it is? It was not immediately obvious to me, but maybe it is.
Consider:
"foo":{
"description":"foo the void",
"input":"",
"expected":"foo"
}
That "foo" key: what will it be used for? Is it just the description? Does that make description unnecessary?
Given an object appearing in a group array, how will I be able to know which of the two it is? It was not immediately obvious to me, but maybe it is.
Both the labeledGroup and the singleTest are objects, but we can easily know which is which:
- If its single property value is a testData, it is a singleTest.
- If it has a description as a string and a group as an array of testItem, it is a labeledGroup.
We could solve this problem by adding some verbosity to the specification, but I'll discuss that in another message that I'll probably finish writing in a few hours.
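This disambiguation rule can be made concrete with a tiny sketch (hypothetical Python; the key names follow the proposed schema, nothing here is an agreed format):

```python
# Classify an item from a "group" array as a labeledGroup or a singleTest,
# using the two rules described above.

def classify(item):
    if "description" in item and "group" in item:
        return "labeledGroup"   # description string + group array
    if len(item) == 1:
        return "singleTest"     # exactly one property, whose value is the testData
    raise ValueError("neither a singleTest nor a labeledGroup")

print(classify({"description": "snafu", "group": []}))          # labeledGroup
print(classify({"foo": {"description": "foo the void"}}))       # singleTest
```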
That "foo": key: what will it be used for? Is it just the description? does that make description unnecessary?
That foo, which I informally call "the test type", is fundamental to tell apart different types of tests that could have exactly the same properties, as in this example:
{
"test":{
"description":"Qwerty encoding",
"plaintext":"Sample plaintext",
"ciphertext":"adsdfsjqwreiugi"
}
},
{
"test":{
"description":"Qwerty decoding",
"ciphertext":"adsdfsjqwreiugi",
"plaintext":"sampleplaintext"
}
}
There is no easy way to say which one is an encoding test. I'll write more about the options to solve this in my next message.
Let's say we have an exercise in which the user has to implement two functions:
- foo, which receives a string and appends "foo" to it.
- bar, which receives a string and appends "bar" to it.
The test suite would normally consist of multiple tests for foo and bar, maybe mixed in a list, so we need a way to distinguish these two test types:
[ { "description": "How is the codebase?"
, "input" : "fu"
, "expected" : "fubar"
}
, { "description": "A martial art."
, "input" : "Kung-"
, "expected" : "Kung-foo"
}
, { "description": "Where do you live?"
, "input" : ""
, "expected" : "bar"
}
]
Humans can easily see that the first and third tests appear to simply call the function bar with input, while the second tests the function foo. Let's give names to these test types: justFooIt and justBarIt.
In my last proposal, we would avoid ambiguity like this:
[ { "justBarIt": { "description": "How is the codebase?"
, "input" : "fu"
, "expected" : "fubar"
}
}
, { "justFooIt": { "description": "A martial art."
, "input" : "Kung-"
, "expected" : "Kung-foo"
}
}
, { "justBarIt": { "description": "Where do you live?"
, "input" : ""
, "expected" : "bar"
}
}
]
This would allow the parser to easily identify each kind of test.
After pondering about it for a while, I think it would be probably better to change to this structure:
[ { "description": "How is the codebase?"
, "justBarIt" : { "input" : "fu"
, "expected": "fubar"
}
}
, { "description": "A martial art."
, "justFooIt" : { "input" : "Kung-"
, "expected": "Kung-foo"
}
}
, { "description": "Where do you live?"
, "justBarIt" : { "input" : ""
, "expected": "bar"
}
}
]
Let's write the full canonical-data.json for it, so that we can see how it looks:
{ "exercise": "foobar"
, "version" : "0.1.0"
, "comments":
[ "This is just"
, "an example"
]
, "group":
[ { "description": "How is the codebase?"
, "justBarIt" : { "input" : "fu"
, "expected": "fubar"
}
}
, { "description": "A martial art."
, "justFooIt" : { "input" : "Kung-"
, "expected": "Kung-foo"
}
}
, { "description": "Where do you live?"
, "justBarIt" : { "input" : ""
, "expected": "bar"
}
}
]
}
With this new structure, I think the JSON Schema would be simpler and at the same time we would be capturing more structure.
I'll try to rewrite the schema with this change as soon as possible.
One question: Should we rename group to tests?
I have the feeling that if I had read the example bowling file I would have understood what the foo
(test type) is for, but now it is clear. No objections here. And in fact, doing it this way may fit well with how https://github.com/exercism/x-common/blob/master/exercises/react/canonical-data.json and https://github.com/exercism/x-common/blob/master/exercises/circular-buffer/canonical-data.json operate! Very interesting.
Should we rename group to tests?
I could also ask about using the existing name of cases. However, either cases or tests has the following advantage over group: they answer the question "group of what?" that someone might ask if they just see group.
I also prefer cases or tests over group! 👍
The only reason I didn't consider cases before was because I thought it could be misleading when used with groups of tests. tests sounded more neutral regarding what is inside, while cases suggests that what is inside are individual test cases.
But that is just a feeling I had. What do you think about it?
Using cases:
{ "exercise": "foobar"
, "version" : "0.1.0"
, "comments":
[ "This is just"
, "a comment."
]
, "cases":
[ { "description": "Appending to non-empty strings"
, "cases":
[ { "description": "How is the codebase?"
, "justBarIt": { "input" : "fu"
, "expected": "fubar"
}
}
, { "description": "A martial art"
, "justFooIt": { "input" : "Kung-"
, "expected": "Kung-foo"
}
}
]
}
, { "description": "Appending to empty strings"
, "cases":
[ { "description": "Where do you live?"
, "justBarIt": { "input" : ""
, "expected": "bar"
}
}
, { "description": "Undescriptive variable name"
, "justFooIt": { "input" : ""
, "expected": "foo"
}
}
]
}
]
}
Using tests:
{ "exercise": "foobar"
, "version" : "0.1.0"
, "comments":
[ "This is just"
, "a comment."
]
, "tests":
[ { "description": "Appending to non-empty strings"
, "tests":
[ { "description": "How is the codebase?"
, "justBarIt": { "input" : "fu"
, "expected": "fubar"
}
}
, { "description": "A martial art"
, "justFooIt": { "input" : "Kung-"
, "expected": "Kung-foo"
}
}
]
}
, { "description": "Appending to empty strings"
, "tests":
[ { "description": "Where do you live?"
, "justBarIt": { "input" : ""
, "expected": "bar"
}
}
, { "description": "Undescriptive variable name"
, "justFooIt": { "input" : ""
, "expected": "foo"
}
}
]
}
]
}
Which one is better?
New JSON Schema draft for canonical-data.json files.
Changes in this new draft:
- The version property now enforces the format major.minor.patch.
- The exercise property now enforces that it is in kebab-case.
Questions:
- Should we rename tests to cases, as suggested by @petertseng here?
- Should we allow comments in test groups or in tests? Are top-level comments enough?
- Should we use special names for description and tests? This would allow us to restrict the test types with a regex, but would sacrifice readability.
? This would allow us to restrict the test types with a regex, but would sacrifice readability.{
"comments":
[ " This is a JSON Schema for 'canonical-data.json' files. "
, " "
, " It enforces just a general structure for all exercises, "
, " without specifying how the test data should be organized "
, " for each type of test. "
, " "
, " There is also no restriction on how to name the 'testData' "
, " objects in 'labeledTestItem' yet, but it is advisable to "
, " follow a reasonable convention: "
, " - 'fooBar' -- lowerCamelCase (used by Google) "
, " - 'FooBar' -- UpperCamelCase "
, " - 'foo-bar' -- kebab-case "
, " - 'foo_bar' -- snake_case "
, " "
, " Because we cannot use negative lookahead in JSON Schema's "
, " regular expressions, it seems very impractical to use a "
, " regex in 'patternProperties' to match a test type name in "
, " 'labeledTestItem' without also matching the strings "
, " 'description' and 'tests'. This prevents us from enforcing "
, " good naming practices automatically. "
],
"$schema": "http://json-schema.org/draft-04/schema#",
"$ref" : "#/definitions/canonicalData",
"definitions":{
"canonicalData":
{ "description": "This is the top-level file structure"
, "type" : "object"
, "required" : ["exercise" , "version" , "tests"]
, "properties" :
{ "exercise": { "$ref": "#/definitions/exercise" }
, "version" : { "$ref": "#/definitions/version" }
, "comments": { "$ref": "#/definitions/comments" }
, "tests" : { "$ref": "#/definitions/testGroup" }
}
, "additionalProperties": false
},
"exercise": { "description": "Exercise's slug (kebab-case)"
, "type" : "string"
, "pattern" : "^[a-z]+(-[a-z]+)*$"
},
"version" :
{ "description" : "Semantic versioning: MAJOR.MINOR.PATCH"
, "type" : "string"
, "pattern" : "^(0|[1-9][0-9]*)(\\.(0|[1-9][0-9]*)){2}$"
},
"comments":
{ "description": "An array of string to fake multi-line comments"
, "type" : "array"
, "items" : { "type": "string" }
, "minItems" : 1
},
"testGroup":
{ "description": "An array of labeled test items"
, "type" : "array"
, "items" : { "$ref": "#/definitions/labeledTestItem" }
, "minItems" : 1
},
"labeledTestItem":
{ "description": "A single test or group of tests with a description"
, "type" : "object"
, "required" : ["description"]
, "properties" :
{ "description": { "$ref": "#/definitions/description" }
, "tests" : { "$ref": "#/definitions/testGroup" }
}
, "additionalProperties": { "$ref": "#/definitions/testData" }
, "minProperties" : 2
, "maxProperties" : 2
},
"description" :
{ "description": "A short, clear, one-line description"
, "type" : "string"
},
"testData":
{ "description": "A free-form object with data for a single test"
, "type" : "object"
}
}
}
Edit: Just wrote an "improved" version that uses negative-lookahead to restrict the test type to camelCase here. I guess it is OK to use some non-mandatory features from JSON Schema.
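As a sanity check on the two patterns in the schema (for version and exercise), here is how they behave when exercised directly. The regexes are copied verbatim from the schema above; the surrounding Python is just scaffolding:

```python
import re

# Patterns copied from the draft schema: semantic version and kebab-case slug.
VERSION = re.compile(r"^(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*)){2}$")
SLUG = re.compile(r"^[a-z]+(-[a-z]+)*$")

print(bool(VERSION.match("1.0.0")))       # True
print(bool(VERSION.match("01.0.0")))      # False - no leading zeros
print(bool(SLUG.match("all-your-base")))  # True
print(bool(SLUG.match("AllYourBase")))    # False - must be kebab-case
```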
I tried to revive this issue last week but, except for a few comments from @abo64, @rpottsoh and @petertseng, it appears that this issue is still not getting much attention since 2016-11-22.
This is mostly my fault, because I cluttered it with huge posts, making it really hard for anyone to catch up with the history. Also, the subject is really technical, and most of the people who were interested in automatically generating tests seem to have given up on the discussion, which is unfortunate.
There is little hope of standardizing something as important as the canonical-data.json files without widespread support, as this change would greatly affect all the tracks - hopefully in a positive way - especially the ones using generators.
We have to decide how to proceed here to increase the chances of getting something done. Some ideas:
I'll mention @kytrinyx here because this standardization seems kind of central to x-common's organization. Also, assuming that she is reading this, let me ask: would it be useful to incorporate the metadata.yml data in the canonical-data.json schema?
Whatever the outcome, I'd like to note that the current proposal appears at first glance a lot more complex than any individual canonical data set I've used to build an exercise. It's pretty intimidating as I think about tackling some of those new "add canonical data for this exercise" issues.
I do appreciate the examples when those are provided alongside the specifications though - that makes understanding the spec a lot easier!
Thanks for the feedback, @stkent.
... I'd like to note that the current proposal appears at first glance a lot more complex than any individual canonical data set I've used to build an exercise.
I agree, but most of the complexity comes from what we already have in x-common, and making the specification simpler would remove some features and make some test suites significantly less documented.
I do appreciate the examples when those are provided alongside the specifications though - that makes understanding the spec a lot easier!
I'm glad you said that. Here is a simpler example of a schema-compliant test suite:
{
"exercise":"foobar",
"version":"0.0.0",
"tests":[
{
"description":"How is the codebase?",
"bar":{
"input" : "fu",
"expected": "fubar"
}
},
{
"description": "A martial art",
"foo":{
"input" : "Kung-",
"expected": "Kung-foo"
}
},
{
"description": "Where do you live?",
"bar":{
"input" : "",
"expected": "bar"
}
},
{
"description": "Undescriptive variable name",
"foo":{
"input" : "",
"expected": "foo"
}
}
]
}
I tried to design the schema to make simple test suites easy to write, while making complex test suites still possible. Of course, there is a significant sacrifice in readability to make the JSON reasonably "parseable" and the schema minimally rational.
I'm afraid this is as simple as it gets without losing the flexibility needed to capture our current tests. 😔
I'm afraid this is as simple as it gets without losing the flexibility needed to capture our current tests. 😔
It would be possible to collapse the test data with the description, adding a new key to specify the test type. That would remove one nesting level, possibly making it simpler.
I would love to hear what people, specially the ones using test generators, think about it.
{
"description":"How is the codebase?",
"bar":{
"input" : "fu",
"expected": "fubar"
}
}
{
"description":"How is the codebase?",
"type" : "bar",
"input" : "fu",
"expected" : "fubar"
}
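A small sketch (hypothetical Python; field names are taken from the two examples above) of converting the nested form into the flatter "type in a property value" form:

```python
# Flatten a nested case: the one non-"description" key is the test type,
# which becomes the value of a new "type" property at the top level.

def flatten(case):
    (test_type, data), = [(k, v) for k, v in case.items() if k != "description"]
    flat = {"description": case["description"], "type": test_type}
    flat.update(data)
    return flat

nested = {"description": "How is the codebase?",
          "bar": {"input": "fu", "expected": "fubar"}}
print(flatten(nested))
# {'description': 'How is the codebase?', 'type': 'bar', 'input': 'fu', 'expected': 'fubar'}
```

The conversion is lossless as long as the test data never itself contains a key named type, which is one cost of collapsing the nesting.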
I personally like the second one better, as there is only ever one test type, right? Then why have any nesting? Secondly, I also like having the description, type, input and expected values on the same level, as I think a case could be made for them to all be top-level properties (they are equally important).
Thanks for the feedback, @ErikSchierboom.
While I was waiting for comments, I prepared a new proposal using the flatter structure. I think that you and @stkent will prefer this new version (I think I prefer it too).
Changes:
- Renamed tests to cases, as suggested by @petertseng, here.
- The test type is now a property value, type.
- Incorporated the metadata.yml content.
Changes in x-common:
- Removed cases from the list of mandatory properties.
- Added blurb to the list of mandatory properties.
- This allows a canonical-data.json without test cases.
With these changes, the canonical-data.json file supersedes metadata.yml, except for a few atypical files - being discussed in #597 - and the source_url property, which was renamed sourceUrl to keep naming consistency.
I'm blindly changing some things here that appear to make sense until I receive more feedback, but this looks like a great opportunity for us to add the properties of metadata.yml to canonical-data.json.
Is there any reason for not doing it?
I think combining could make sense, though in that case I'd almost prefer swapping cases back to tests since the scope of the file is now larger than "just" tests. I'd obviously defer to @kytrinyx on the combo though, since it will ripple out to other areas of the project.
I think combining could make sense, though in that case I'd almost prefer swapping cases back to tests since the scope of the file is now larger than "just" tests.
Makes perfect sense!
I'd obviously defer to @kytrinyx on the combo though, since it will ripple out to other areas of the project
So let's decide the cases vs. tests question after the decision about incorporating the metadata.yml properties.
Would it be useful to incorporate metadata.yml data in the canonical-data.json schema?
The purpose of the two files is different. One is used to be able to talk about the exercise, the other is used to be able to produce an implementation. I would hesitate to conflate the two, but am open to discussing it if any of you have strong feelings about it.
@rbasso this is a response to your post regarding "Test type in the property key" or "Test type in a property value". I am split on this particular issue. My initial perception of the first example is that "thing" bar is being tested, and it is clear to me what is to be its input and what I should expect to get back from it.
In the second example, my initial thought when I see "type" is that somewhere there is a list of canned types and this instance is for "bar", whatever that is supposed to mean. Why not, instead of "type", call it "testof"?
I know there has been more discussion on this subject since you made this particular post. Some of these discussions move pretty quickly. 🏃
It appears that all-your-base.json is malformed. Where allergies.json has the structure of:
all-your-base.json has:
The cases should be wrapped in a function name, yes?
It appears that bin/jsonlint only checks that the JSON parses, not that it has good structure.
At the very least, I think this should be patched up and the README expanded to actually show the desired structure. Happy to do a PR for that, assuming I understand it already. 😀
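A structural check of the kind zenspider is asking for is easy to prototype. A hedged Python sketch (the key names follow the draft schema in this thread and are assumptions, not an agreed format):

```python
import json

# Beyond "does it parse": verify the expected top-level keys are present
# and that every case carries a description.

def lint(text):
    data = json.loads(text)
    errors = []
    for key in ("exercise", "version", "cases"):
        if key not in data:
            errors.append(f"missing top-level key: {key}")
    for i, case in enumerate(data.get("cases", [])):
        if "description" not in case:
            errors.append(f"case {i}: missing description")
    return errors

good = '{"exercise": "bob", "version": "1.0.0", "cases": [{"description": "x", "response": {}}]}'
bad = '{"exercise": "bob"}'
print(lint(good))  # []
print(lint(bad))   # ['missing top-level key: version', 'missing top-level key: cases']
```

Once a schema is agreed on, a real JSON Schema validator could replace this hand-rolled check in CI.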