Wording and naming of kata snippets

hobovsky commented 3 years ago

Asked by @DonaldKellett here: https://github.com/codewars/docs/pull/153#pullrequestreview-522156673

[...] may we rename the following terms?

"Full test suite" --> "Submit tests"

"Proposed solution" --> "Complete solution" / "Reference solution"

I agree that better wording could be used, but I have doubts:

"Submit tests" sounds off to me because it sounds like an activity (to submit). But that's probably because my English sucks, so I will rely on you here :)

I do not like "complete solution" because... what would other solutions be? Incomplete?

I do not like "reference solution" because I do not know how to distinguish this term from the solution used by tests to generate expected answer.

Any hints? Maybe I should check how the snippets are actually named in the kata editor, and stick to that?

kazk commented 3 years ago

I think we call them "submission test(s)" and "reference solution" at Qualified.

I do not know how to distinguish this term from the solution used by tests to generate expected answer.

Isn't it the same?

hobovsky commented 3 years ago

AFAIK the reference solution snippet is used only when a kata or translation is being published, to verify that it works, and is not accessible when a user solution is submitted. Is it? Can random tests use this snippet to generate the expected answer? Am I missing something, or do I need to get some sleep? :)

kazk commented 3 years ago

I see what you mean now. The one used in random tests is usually just copied from the author's solution because there's no other way. I'd call them both reference solution.

JohanWiltink commented 3 years ago

I have to absolutely disagree with @Kazk here. ( Why does that feel like I'm sticking my neck in a noose? :yum: )

The author's solution and the one used in random tests serve different purposes, can be different, and probably should be different in many kata.

The author's solution, what I would call "example solution", is what goes in the "Complete solution" box in the kata editor. After solving, it's visible as the author's solution in the list of valid solutions. As such, it should exemplify any point the kata has, and it should first and foremost be readable.

The solution used to generate correct results in random tests I would call "reference solution". It should be fast, for maximum responsiveness of the kata's testing, and it might sacrifice readability for speed. It should still be as maintainable as can be achieved without sacrificing noticeable speed.

Advantages of having different implementations for example and reference solutions include

both can achieve their own intended purpose better
they can be tested against each other, minimising risk of incorrectness
they might find different edge cases when running them against random inputs
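The last two advantages could be sketched roughly as follows. This is only an illustration, not Codewars test-framework code: `example_solution`, `reference_solution`, and `cross_check` are hypothetical names, and the task (sum of squares) is a made-up placeholder.

```python
import random

def example_solution(xs):
    """Readable version, as it might go in the kata editor's solution box (hypothetical task)."""
    return sum(x * x for x in xs)

def reference_solution(xs):
    """Independent implementation, as it might live inside the test suite."""
    total = 0
    for x in xs:
        total += x ** 2
    return total

def cross_check(trials=200, seed=0):
    """Run both implementations on the same random inputs and collect any disagreements."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        if example_solution(xs) != reference_solution(xs):
            mismatches.append(xs)
    return mismatches
```

An empty list from `cross_check()` means the two implementations agreed on every generated input; any entry is an input worth inspecting, since at least one of the two solutions is wrong on it.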

kazk commented 3 years ago

Why does that feel like I'm sticking my neck in a noose?

I don't know why either, but you shouldn't feel that way. There's no point of working together if no one disagrees. Please always feel free to suggest improvements or explain anything we're missing.

I'm used to the term "reference solution" and haven't had issues because it's usually possible to know which one is meant from context. But I can see it's confusing, so I'd welcome clearer terms.

JohanWiltink commented 3 years ago

The terms mostly come up when discussing designing a kata, often with inexperienced authors (!). There is a definite need for separate terms. I am accustomed to using "example solution" and "reference solution" as explained above, and I find they work well.

Also, consistency is good. Inexperienced kata authors often are overwhelmed with new and unfamiliar concepts already. If they're not confused by documentation, that can only help.

Blind4Basics commented 3 years ago

My 2 cents:

Through Codewars itself, I'm used to hearing/seeing "reference solution" as the one being used in the test suite for the random tests (the wording seems pretty appropriate for this one, since it's the one enforcing the specifications). The "other one", I'd just call "the author's/translator's solution", since it's actually just "a solution of one user", like any other user (almost... x) )

kazk commented 3 years ago

@ggorlen Any suggestions for better terms to distinguish the author's solution and the solution used to test against? I think at Qualified, we almost always have the same code for both. But the Codewars community raised some good points on the benefits of having different implementations. I think I've seen "working solution" somewhere for the author's solution, but I don't remember where.

hobovsky commented 3 years ago

Set up a pilot article on the new blog and let's run a survey among users :)

ggorlen commented 3 years ago

At Qualified, the reference solution (yep, we call it that) and the function used to validate random tests (has no name) are nearly always one and the same. It should be readable and fast enough to solve the task. I've never needed separate "readability" and "performance" versions, but a lot of migrated CW kata that we've used at Qualified often have reference solutions that are pretty much code golf, so I de-obfuscate these so our customers can understand them.

At CW, there's additionally the author's solution that, like B4B says, is pretty much just another user's solution. We don't have such a thing at Qualified and I'm not sure what to call it as long as it's clear that if a user wants to golf it as their personal submission, that's fine, but they should keep the "reference" solutions and anything that winds up in the testing code readable (I know we've stated this multiple times in the docs so far).

kazk commented 3 years ago

We should first understand (and agree on) the purposes of the two. Starting to get confused. I agree that we need some way to clearly describe these two, but not sure if it's necessary to maintain two versions.

I haven't had the time to solve/review kata on Codewars recently, so I'd appreciate it if you guys could help me understand your points better.

The author's solution, what I would call "example solution", is what goes in the "Complete solution" box in the kata editor. After solving, it's visible as the author's solution in the list of valid solutions. As such, it should exemplify any point the kata has, and it should first and foremost be readable.

I can see this is valuable in early beta to understand the author's intention. But it's not much once there are more solutions and the author's solution no longer stands out. Or do you mean it's beneficial to have this "example" in the kata editor?

There are other ways to achieve the purpose of showing the author's intention without increasing the cost of maintenance. For example, the author can submit another version or explain in a spoiler comment. This also prevents their "example" from being overwritten when the "example" solution needs to be changed in the future.

The solution used to generate correct results in random tests I would call "reference solution". It should be fast, for maximum responsiveness of the kata's testing, and it might sacrifice readability for speed. It should still be as maintainable as can be achieved without sacrificing noticeable speed.

I agree that the "reference" solution should be as fast as possible to minimize the impact on the test duration.

Still, the "reference" solution must be maintainable because it must be maintained over time. So, the authors should clearly document it if it's doing something difficult to understand for speed.

Advantages of having different implementations for example and reference solutions include

both can achieve their own intended purpose better

How often do you need a "reference" solution that must be different from "example" solution?

One thing that I had missed was that having a fast "example" solution can raise the bar unintentionally (I think it was @hobovsky that mentioned this somewhere).

they can be tested against each other, minimising risk of incorrectness

they might find different edge cases when running them against random inputs

Yes, but is it really worth requiring future maintainers to maintain two versions over time? Also, this overlaps with beta testing. I can see how it's useful during initial development (write the straightforward version first, then optimize it while ensuring the same outputs), but is it worth maintaining both forever?

At Qualified, we haven't had the need to have two versions, so it'll be helpful if we can see some examples.

hobovsky commented 3 years ago

My point of view:

I like the names "example solution" for the kata snippet, and "reference solution" for the one that is part of the tests.

example solution

In the beginning, I wanted it to be meaningful. I wanted it to serve as an example of the kata requirements and as a kind of "minimal acceptable user solution". I saw it as potentially valuable to translators and maintainers, so they would have yet another place presenting a baseline for requirements, difficulty, etc. Then FArekkusu came in with #177 and disliked the idea, saying that the example solution should be meaningless and that all its documentation value is redundant, because all requirements should be expressed by the description and tests. You can read #177, my answer to him here: https://github.com/codewars/docs/issues/177#issuecomment-731619986 and the history of the file (for example here) to see what my initial intent for this snippet was.

Currently I have no opinion whether the example solution should be meaningful or meaningless. I would like to know what you think.

reference solution

I am a big fan of not having this one at all whenever possible, because it's often not necessary and causes problems. I like to have controlled generators which create randomly generated test cases with the answer known upfront, without running the solution. But it's not always possible.
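A controlled generator of this kind might look like the sketch below. The kata ("find the missing number") and the names `make_missing_number_case` and `user_solution` are hypothetical, chosen only to illustrate picking the answer first and building the input around it.

```python
import random

def make_missing_number_case(rng, max_n=50):
    """Build one random test case for a hypothetical "find the missing number" kata.

    The answer is chosen first, so no reference solution has to be run.
    """
    n = rng.randint(2, max_n)
    missing = rng.randint(1, n)  # the expected answer, known up front
    numbers = [k for k in range(1, n + 1) if k != missing]
    rng.shuffle(numbers)
    return numbers, missing

def user_solution(numbers):
    """Stand-in for a submitted solution (an assumption, for demonstration only)."""
    n = len(numbers) + 1
    return n * (n + 1) // 2 - sum(numbers)
```

In a test loop, each generated pair would be used as `(input, expected)` directly, so a slow or incorrect reference solution cannot affect the result.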

When it's present, the reasons to have it different from the example solution are:

It can take advantage of the fact that you know where it's used and leave out parts of the solution not needed there. For example, it can leave out validation if you know you will feed it only valid inputs.
It has to be fast, so that it does not eat up too much of the time quota available to the user solution. If the example solution were "the slowest allowed", the tests could not use the same approach.

Example: generate primes up to N, easy version.

The example solution, just like a user solution, would be allowed to iterate over 1...n and check every candidate with simple O(sqrt n) trial division. It would be allowed to run for 10 seconds to get the answer for all tests.

However, the reference solution in the tests, to get the reference answer faster, would use, for example, memoisation and some better primality test, idk, maybe the 6n +/- 1 one.
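A rough sketch of that contrast, assuming the kata asks for a list of all primes up to n. All function names here are hypothetical, and `functools.lru_cache` is just one way to get memoisation:

```python
from functools import lru_cache

def is_prime_trial(n):
    """Simple O(sqrt n) trial division: try every candidate divisor."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def example_primes_up_to(n):
    """Example solution: the plain approach a user would be allowed to submit."""
    return [k for k in range(1, n + 1) if is_prime_trial(k)]

@lru_cache(maxsize=None)
def is_prime_6k(n):
    """Memoised test that only tries divisors of the form 6k - 1 and 6k + 1."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3
    if n % 2 == 0 or n % 3 == 0:
        return False
    d = 5
    while d * d <= n:
        if n % d == 0 or n % (d + 2) == 0:
            return False
        d += 6
    return True

def reference_primes_up_to(n):
    """Reference solution for the tests: same results, fewer divisions, cached across calls."""
    return [k for k in range(1, n + 1) if is_prime_6k(k)]
```

Because the cache persists between calls, repeated random tests with overlapping ranges reuse earlier results, which is where the memoisation pays off.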

maintainability

I do not like to trade maintainability for anything; working with other people's code for years has taught me to appreciate maintainability :) Definitely no golfing, and full consideration of code quality in both parts, are IMO a must.

JohanWiltink commented 3 years ago

not sure if it's necessary to maintain two versions

How else will you ensure an author can solve their own kata ( which, incidentally, ensures a kata is solvable at all ) ?

How often do you need a "reference" solution that must be different from "example" solution?

Not always, maybe not even often. But when a kata has performance restrictions, a reference solution should be as fast as possible, whereas an example solution can show what is enough. When a kata has other restrictions, e.g. "no native sort - implement it yourself", an author could use native sort as the reference solution. ( I generally don't like those kata, but that's neither here nor there. )
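The "no native sort" case could look something like this sketch. The names `example_sort` and `reference_sort` are hypothetical, and merge sort is just one possible hand-written algorithm:

```python
def example_sort(xs):
    """Example solution: a hand-written merge sort that obeys the "no native sort" rule."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = example_sort(xs[:mid]), example_sort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One of the two tails is empty; the other is already sorted.
    return merged + left[i:] + right[j:]

def reference_sort(xs):
    """Reference solution inside the tests: not bound by the restriction, so it uses the native sort."""
    return sorted(xs)
```

The restriction applies to what solvers submit, so the tests remain free to compute expected values with the built-in.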

My Longest Common Subsequence kata, specifically the Haskell version, is a good example of different example and reference solutions, with different readability and even maintainability. It is a performance kata - a naive implementation is intended to time out - but you don't need Mach 5 to pass. You don't need memoisation or dynamic programming; algorithmic optimisations are enough, if you do it right. Memoisation in Haskell is hard, and dynamic programming is just generally hard, at least conceptually. The reference solution uses memoisation ( which I would have trouble maintaining in Haskell ); the example solution showcases how you can pass the specified tests with an optimised algorithm, and is much more readable, though not as efficient, as the reference solution. The example solution would shred to pieces long before reaching hypersonic speed, should that much speed actually be required.

Generally, I like to write my example solutions in a functional style. My reference solutions may exploit mutability for speed, and may even be written in an imperative style. Without overgeneralising, this may in specific instances be a trade of maintainability for speed, because the functional code may be provably correct, may have less code, may generally be better code than the imperative stuff.

JohanWiltink commented 3 years ago

Would it be possible to not require the example solution after approval? You can go through development and Beta with a separate example solution, but after approval, the example solution is not needed anymore?

I haven't thought out exactly how this would work for translations, before and after approval, but that would let you not need to maintain two solutions indefinitely.

hobovsky commented 3 years ago

Would it be possible to not require the example solution after approval? You can go through development and Beta with a separate example solution, but after approval, the example solution is not needed anymore?

I haven't thought out exactly how this would work for translations, before and after approval, but that would let you not need to maintain two solutions indefinitely.

Remember that the example solution is needed not only on approval, but every time a kata/translation is republished. It's not run just once; it's run every time you use the kata editor.

JohanWiltink commented 3 years ago

I have a kata where the reference solution can never be the example solution.

This is a Haskell kata. Haskell has typeclasses. The user solution can define a new instance for a certain typeclass that is used in the tests, but the reference solution cannot do that - defining instances can only be done once. So the reference solution just has to take another path to reach the same goal.

Even if the tests define it only if the user doesn't, the user solution would have access to it and could use it to solve the kata without itself doing the work.

It's a very specific example, dependent on quite specific language behaviour, but it's real. You cannot have a reference solution do certain things the example solution can do: Moving all zeroes to the end.
