This issue will be used to collect and discuss kata qualities often argued during the beta process. I'll also create a wiki page later to summarize and to allow the community to help organize the information that they can refer to.

The goal is to have a set of fundamental quality requirements that the Codewars community can refer to when evaluating beta kata, like a checklist. Note that this won't be a standard that every kata must follow, but more like something you can use to back up an argument. We currently don't have a way to enforce a standard and I also believe we shouldn't go that route. See https://github.com/Codewars/codewars.com/issues/1626#issuecomment-438400435 for my long term plan.

Please help collect information by posting one quality factor per comment, especially those often argued, along with any arguments for it. Links to examples from past discussions on Codewars might also be useful. Keep the comment mostly neutral and add your thoughts after that.

~~Ideally, a quality should be measurable/identifiable, but feel free to post even if it's not so we can discuss and possibly find something from it. If we fail to come to an agreement on something and if it's very important, I'll decide and document it.~~

Let's see how this goes. This should at least give me a better idea how the community views kata quality.

Remember to keep the discussion constructive. I also recommend participating by using reactions (:+1::-1::heart:) even if you don't have anything new to post or are just reading this.

Please don't post too many at once, to make it easier for others to take part. I'm also open to suggestions to make this easier for everyone.

We do have this guide which more or less sums up the key points considered these days during Beta:

description is good
there're sample tests
there're fixed tests
there're random tests
the tests correspond to the task
there're no critical issues (reference solution is wrong, or accessible to the user)
the kata is not a duplicate
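To make the "fixed tests" and "random tests" points above a bit more concrete, here is a minimal Python sketch. It uses plain assertions rather than the actual Codewars test framework, and the task (`find_min`), the reference solution, and the input ranges are all hypothetical names and values chosen only for illustration.

```python
import random

def reference_find_min(values):
    """Hypothetical reference solution, kept private to the test suite."""
    return min(values)

def user_find_min(values):
    """Stand-in for a solver's submission."""
    smallest = values[0]
    for v in values[1:]:
        if v < smallest:
            smallest = v
    return smallest

def run_fixed_tests(solution):
    # Fixed tests: hand-picked inputs, including edge cases.
    cases = [([1], 1), ([3, 1, 2], 1), ([-5, -5, 0], -5)]
    for values, expected in cases:
        assert solution(list(values)) == expected, f"failed on {values}"

def run_random_tests(solution, rounds=100, seed=None):
    # Random tests: generated inputs checked against the reference solution,
    # so that hardcoding the fixed answers is not enough to pass.
    rng = random.Random(seed)
    for _ in range(rounds):
        values = [rng.randint(-1000, 1000) for _ in range(rng.randint(1, 50))]
        expected = reference_find_min(list(values))
        assert solution(list(values)) == expected, f"failed on {values}"

run_fixed_tests(user_find_min)
run_random_tests(user_find_min)
```

Each call passes a copy of the input to the solution, so a submission that mutates its argument cannot corrupt the expected value.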

The first real problem is that different users have different views about these points, or it is unknown to which extent everything has to be bad to become a proper issue which needs a fix. E.g.:

If the description is "correct" but needlessly long, full of grammar mistakes, and overuses italics and bold text, making it tedious to read, should an issue be raised, would a suggestion to rewrite the description suffice, or should we leave it alone?
At which point is a kata considered a duplicate? Which of these options describes a "duplicate issue" more correctly: "the task is identical" (numerous factorial katas), "the task is not an exact copy, but the difference is practically non-existent" (this kata), "the kata is not an exact copy, but the approach is practically the same" (this kata)? In fact, should we even care about duplicates at all? Voile is strongly against them, while CliffStamp says that duplicates should be approved as "repetition is important in learning", and "if the author is active, and addresses the issues, we should approve it if it's an exact duplicate of another existing kata".

The second real problem is that while some users are unhappy about the curation process (and while CliffStamp is actively shitposting about PUs being the root of the problem because "they're elitist, downvote/raise issues just for fun, and never fix anything"), nobody seems to care about what the kata authors should do (as was pointed out by Voile). There's no point in creating guidelines for curators if the authors of curated katas can freely block users (SteffenVogel - permanently, KenKamau - temporarily), and nothing stops them from closing issues because their opinion is superior to yours (GiacomoSorbi).

Language neutrality

As a keen translator, I wanted to point out some problems with features which are not relevant that much when someone is solving a kata, but they become a problem when someone wants to translate it. Violation of some guidelines might not seem that relevant in the beginning, when only one language version exists, but the degrading effect becomes visible when someone creates a translation into a language which is perfectly able to solve the problem in question, but is not able to express interface or params/return value requirements of initial version. Translators try to work around these in peculiar ways, introducing inconsistencies or just bugs.

Please keep in mind that the points below apply mostly to algorithmic katas; do not apply them to language-specific ones.

The best example, I think, are katas which rely on the dynamic typing of languages like Python, JavaScript, Ruby, etc. There are quite a few where the description says something like (note: just an example, Python is just an example, and dynamic typing is just an example):

Kata title: Find minimum
Description: You are given an array. Find the minimum value in it. Ignore array elements which are not numbers.

Kata title: Solve equation
Description: Parse the equation string. Solve the equation. Return an array of solutions when there are any, or the string "No solutions" when there are none.

And now we have three stances, and I've encountered supporters of all of them:

The kata is authored well, the author loves Python, so they may freely use all features of the language, and if these requirements comply with the author's vision, then they are allowed to compose them in such a way. If the kata happens to be compatible with some other language in all/most important aspects, it can be translated. If some language cannot express the requirements, although being perfectly capable of solving the problem itself (finding a minimum, finding solutions of equations), it should not be translated. "You want to solve some quadratic equations? But you are not able to return either an array or a string? LoL man, get your java-ish ass outta here, grown-ups do Python here now!" In my opinion this reasoning is not the way to go, because a kata should be available for as many languages which are able to solve the problem as possible.
Others say that kata might be better in this regard, but meh. Translators will handle problems somehow, and it's their problem. It's clear, it's solvable, and as long as people can go from kata editor right into browser dev tools and start hacking their JS solution, they are happy. Screw others, who's using C# today, after all? This approach leads to situations where translators introduce inconsistencies between language versions, versions differ sometimes significantly although all of them try to solve the same problem, and other, side problems appear: descriptions like "return error string in JS/Python/Ruby or null in C#/Java or empty array in C/C++ or...", description merge conflicts, etc.
Another group (which I support ;) ) says that task should be created as language neutral as possible from the beginning, and such inconsistencies should be avoided and pointed out to the author. Unfortunately, sometimes it's not possible to create a kata in a way which could be universal for all languages - but still author should make at least some effort to make it as universal as possible. And most common counter-arguments are: -- "Why should a Python kata worry about others? Python is love, Python is life, you want to solve my kata so either convert or GTFO" - and my question is, "why should only Python people be able to solve equations?" -- "but come on, why a Python author should know that this and that is not possible in C and should change [some aspect]?" - well, they do not have to know in the beginning, but since someone points it out in beta, then they know, right? Also, authoring a kata IMO is (should be?) a responsibility and maybe it's just time for author to get out of Python closet and see the world in all its (even brownish) colors? ;)
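As a rough illustration of the third stance, here is a Python sketch contrasting a "list or string" return value with a single, uniform return type that statically typed languages can also express. It is an assumption-laden simplification of the "Solve equation" example: the function names are hypothetical, the input is given as coefficients rather than a parsed string, and `a` is assumed to be non-zero.

```python
import math
from typing import List

def solve_quadratic_mixed(a, b, c):
    """Dynamic-typing style: returns a list OR a string (hard to translate)."""
    d = b * b - 4 * a * c
    if d < 0:
        return "No solutions"
    r = math.sqrt(d)
    return sorted({(-b - r) / (2 * a), (-b + r) / (2 * a)})

def solve_quadratic_neutral(a: float, b: float, c: float) -> List[float]:
    """Language-neutral style: always returns a list of real solutions.

    An empty list means "no solutions", so the same contract can be
    expressed as a vector/array/list in most languages.
    """
    d = b * b - 4 * a * c
    if d < 0:
        return []
    if d == 0:
        return [-b / (2 * a)]
    r = math.sqrt(d)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])

print(solve_quadratic_mixed(1, 0, 1))    # "No solutions"
print(solve_quadratic_neutral(1, 0, 1))  # []
print(solve_quadratic_neutral(1, -3, 2)) # [1.0, 2.0]
```

Whether an empty list is the right "no result" signal is itself a design choice; the point is only that one return type is easier to carry across languages than a union of two.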

My bottom line is: a kata (especially an algorithmic kata) should present a problem to be solved and should be made available for as many languages which are able to solve it as possible. Initial version should make effort to not limit them. But maybe I am focused too much on # 3, and # 2 is the way to go?

The problem is that there is neither consensus, nor guideline, which of three points of view presented above should be followed in general. Each has some pros, and some cons, but we do not know whether a kata as defined by CodeWars should be a language practice (then go # 1), or a more general statement of a problem which can be solved with a program (then # 3).

So, what are CodeWars guidelines on this?

@FArekkusu

> There's no point in creating guidelines for curators

This is what I originally thought, but I realized that I'm missing a lot of information because I'm not on Codewars as much as I used to. I'm not thinking this will be a solution. I simply want to know what's commonly argued during the beta process.

I want to know more about the first problem. I know users have different views and I've already commented multiple times about it. Those examples are what I'm asking for.

@kazk, that'd be everything I've mentioned in my comment (at least I don't think I've missed something important). The current standard among the PUs is:

Sample, fixed, random tests are all present and working correctly.
The description is well-written and corresponds to the task.

The 2 not so clear parts where PUs don't always agree with each other are:

Duplicates, because nobody knows how to determine what is a duplicate and what is not (or to be precise, at which point the kata turns from "similar" to "the same as") - in such borderline cases the issue is usually just closed because "it's not an exact copy of something".
"Boring" katas which are not duplicates, but are very dull and rely on the same two or at most three functions like map/filter/sort/join which are used in the majority of existing katas. For some users it makes no sense to upvote/approve basic stuff like sort an array, convert elements to strings, join them.

There are two problems revolving around beta katas:

Best kata practice, as pointed out by @FArekkusu. This is almost indisputable and almost nobody would argue against them, except for times when JohnaWiltink suddenly resolves all the issues about these because he suddenly thinks they are "not important" or "doesn't apply".

However, the gist of the argument comes from the second point:

What is an acceptable task to be made into a kata? What is the minimum topic quality for a beta kata?

This part is where most of the FUD lies.

Basically, since last year or so, some PUs (mostly me actually) have been advocating for enforcing a minimum topic quality for katas. The main reason is that the overwhelming number of very similar white and yellow katas is getting out of hand, and they now mostly serve as a brainless grind, along with lots of old ones that are completely unacceptable even in topic quality. (Also, katas straight out plagiarized from other sites. There are lots of them around.) The point is, kata quality is not a formal procedure - if every kata that follows point 1 can be approved, there is basically no curation. It's merely making low-quality content more edible.

There are also ideas that are just not good in a kata format, e.g. the typical "do this task without using strings". It's a lost cause of an idea, so desperately trying to make it work is kind of a waste of time and effort.

Of course, straight out calling others' beta katas not up to quality is a great way to start a fire, especially with some particularly notorious kata authors. But then there should be a way of telling the users "no, I think your kata has poor topic quality and it should be scratched". The vote system has always mixed up structural quality (aka making the kata edible) and topic quality (aka making the kata nutritious), and I think they should be separated.

Also, I'd like to point out we definitely need to address the elephant in the room: being able to create beta katas is a privilege, not a right; and just because you created a kata does not mean your effort should be appreciated. For some reason some users just can't get the simple idea that we're not supposed to hand out participation awards to everyone.

Dups are also a major source of FUD, because some kata authors get incredibly upset when it's pointed out that their kata is a duplicate (and then they try to find all kinds of reasons to invalidate the dup claim). A line has to be drawn somewhere, but it's definitely not going to be "a kata that the exact same code from another kata will not pass is not considered a dup" because this basically means that we can make a kata for a * b + 1, a * b + 2, a * b + 3, ... with disastrous consequences.

However, comments are not suitable for this because either disagreeing users resolve the issue without even addressing the point or sorting out the differences, or casual users just don't care and upvote the kata anyway. I think a separate "this is a dup" vote pool is necessary: to decouple the dup process from kata votes (where lots of users vote blindly) and comments (which do not block anything at all).

Also, again, everyone has the freedom to feel however they want about a kata and downvote the kata for any particular reason. However, there are some particular users who try to police others' downvotes and straight out passive-aggressively label downvoters as trouble-makers. This is completely uncalled for, and should be actively discouraged.

@hobovsky Good point, but I think yours need a separate issue for discussion. We have similar problems on Qualified as well, but it has better description handling and doesn't suffer from merge conflicts. Trying to be language neutral often leads to unidiomatic code and we don't want that. Using only stdin and stdout will be language neutral and that's been used for competitive programming contests and many of the alternatives. But I believe not using them is one of the strengths of Codewars.

Thanks @FArekkusu @Voileexperiments. I was definitely missing those. I'll reread them tomorrow.

It is often very difficult, if not impossible, to define quality in the way that is being done here, which is why it is rarely done that way. Instead, what can be done is to point out exemplars and then note why they are exemplars. The fact that it is so hard to learn what something is from a definition is fundamental (see Wittgenstein's family resemblance argument, for example, on why you can't even define what a chair is in a meaningful way: any definition would either allow non-chairs or exclude chairs).

If a new user could see a bunch of exemplars, ideally of a broad range of katas, then it is far more likely they would be able to pick up what quality means, plus it would be of benefit in just doing things like writing structured tests. As it is right now it isn't trivial to do a lot of random testing because there are no builtins available, so there is a LOT of roll your own, even for various trivial things.

As an aside, I would not recommend duplicate katas for repetition for learning, for that, you can just solve the same kata again, ideally in another language to reframe the problem/solution (which is also rewarded). However the critical point is what is a duplicate? This is detailed enough that likely a separate discussion could be made for it.

As an example of a quality kata

https://www.codewars.com/kata/5c06e93aa8af371a56000064

Description

terse
clearly notes the task
specifies edge / boundary cases
effectively uses markup / grammar

Task

novel
solution actually has utility
is a refinement of a brute force problem (reverse . drop . reverse)
focuses on key/critical points of the language (lists as unit, infinite lists, composition, etc.)
has a very elegant solution

Test suite

fixed and random tests
broad coverage
laid out well, easy to follow (this itself is a hornet's nest of opinions)

Author

active in discussions
has actual dialogue
examines translations

However, there are a couple of less than ideal points

the test suite can't actually cover the specifications (see this invalid solution and comments)
the solution is easily google-fu'ed
the kata is very language specific, to the point where other specific translations won't be approved (this is a side effect of the fact that all translations have the same rank and this is way harder/easier in some languages than others, it goes from a white to a purple from Scala to JS for example)

These are in general problematic but they don't have ideal solutions. If they were widespread then they would really be an issue. Imagine if there were an abundance of purple kata in every language that you could google and copy and paste solutions with no understanding of the code, or if all kata had demands/restrictions which were not actually covered in the test suite.

In comparison, this one does almost everything less than ideal

https://www.codewars.com/kata/5773e83810a0a60dca000a1b

It is an exact duplicate of

https://www.codewars.com/kata/merged-string-checker

By duplicate

the description is of the same task, there is no abstraction / application / synthesis
the same code can be utilized trivially (this is a lesser issue)

In comparing the problems, the second (approved one) is superior directly

the description of the beta has no markup, less than ideal variable names
the test suite in the beta allows trivially wrong solutions to pass
author was non-responsive, 1+ year with no action / reply to concerns

And the icing on the cake, the beta has a link to the direct code/solution.

Now again the approved one isn't ideal

there are issues with the test suite (see comments for lack of coverage)
it is trivially google-fu'ed

But the author is responsive and has indicated they are looking into it. Some edge cases are not always easy to generate at random, and when you have a host of translations and someone points out an issue, it is a massive task to rewrite them all (and the reward system actually penalizes you for doing it).

I'd like to point out that #1365 has updates again.

Also I'd like to expand on this point @FArekkusu as mentioned, because I feel like it's very, extremely, incredibly important:

> nobody seems to care about what the kata authors should do

On CW, a kata author takes as much responsibility as a Ronin: basically you can do literally anything to your own katas, be the most annoying, uncooperative and hostile kata author on CW, and still live up to a very good fortune of honor (because nobody can do anything to your dear katas). You can also post the lowest quality of katas all you want, and nobody can stop you. The only resistance here is the auto-retire function, but this assumes that the power users act quick enough.

The experience in plowing through beta and dealing with the most obnoxious katas and kata authors basically taught me a few things:

Lots of kata authors do not fix the issues in their katas, which can be left dangling for... up to 3+ years
Kata authors can behave however they want, and with the obnoxious ones it's practically a yelling contest, which bumps up the FUD and frustration by a lot
Because there are no punitive measures against kata authors who straight out ignore their responsibilities or even act against them, in the end the irresponsible kata authors always win.

The articles Creating your first kata and Kata Best Practices exist for a reason. They outline all the things a kata author should be responsible for, if they choose to become a kata author. If a kata author is hostile to discussions and issues about their katas, how do you expect them to handle inquiries from general users in a decent manner? While inquiries from general users can be frustrating too, now I find some "special" kata authors to be much more frustrating. They do not deserve to be kata authors and should be stripped of this privilege. (Or in a better word: "dethrone" them. Preferably also releasing their katas so that they're open for edit once they're dethroned.)

Frankly, at this point I believe that the kata author requirement should not even be honor-based. It should be something like an entry test: if you pass the test and show that you can handle the responsibility as a kata author, you get the privilege of creating katas. Then if you behave badly/irresponsibly and abuse this privilege, it can be revoked with enough evidence.

Kata creation is a great power that comes with great responsibility. Currently the implied responsibilities are nil. Then of course the kata authors are running around like bandits.

I do agree that the current beta auto-retire criteria are probably a bit too wide, but... because kata authors often don't take responsibility for their katas, eventually there needs to be a way to retire katas. It's also the natural result because kata authors will forget about the site some day. Then the kata basically becomes author-less.

However, so far I don't see an option to "report this beta kata for consideration of retiring it". Even the most obvious dups converge to a ~50-60% satisfaction rate and just stay there forever. There are thousands of orphaned beta katas, and quite a lot of them are in the "should just be retired" tier. The auto-retire function will never get rid of them.

People think that creating/translating a kata is easy

Some people, especially less experienced ones, think that creating a kata is just as easy as solving it. They jump into the editor, create a reference solution, and then the problems start... Underspecified description, missing edge tests, compatibility issues, and what not... I really think that the "create a kata" privilege should be much more difficult to earn, or, ideally, should not be honor based at all. Unfortunately, it's not limited to inexperienced users only, and power users suffer from it just as much as newbies. Dunning-Kruger Effect is a bitch, and users at approx. 4kyu-2kyu are at the peak of it. They know that they know, and if you think they do not know, you are wrong. That's why I think that privileges based on points are a bad idea.

Reward for authoring a kata/translation is too big

People treat translations and new katas as source of Honor Points which they are not able to earn by solving them. Either the "Completed Kata" leaderboard, or "Overall rank" ranking, should be THE board, and "Overall" leaderboard (solved katas + authored katas) should be just trashed because it reflects pretty much NOTHING.

Authors and translators are not able to recognize issues

There are some aspects, usually language specific, which are not widely recognized as issues although they are a no-go in "real world" coding and would have to be immediately fixed. It often happens when contributor uses some language/technique just in scope of CodeWars and they are not aware of broader consequences of this particular, flawed approach and how harmful it might be. They do not know how to approach these, and when one of such issues is pointed out to them, they just tell the reporter to GTFO. Often also CodeWars platform does not help, because it leaves so many aspects underspecified and it's not easy to decide whether potentially incorrect approach is valid or not. Most commonly violated practices I noticed are:

Strict floating point comparisons - because authors do not know how to compare floating point numbers
Rounding - because authors do not know how to handle calculation errors for floating point values
Input mutation - some language versions of one kata (for example, Matrix addition) treat the operation of addition as in-place addition, others as copy-and-add, and other just do not care. Aspect of mutation is not specified in task, and is inconsistent between versions - some versions require it, some forbid it, and some just meh on it.
Compilation warnings - unused variables, raw containers in Java, etc.
Missing includes - C and C++ specific, and authors are really persistent with "But header X includes Y, so I do not have to!". They are not able to recognize the problem, when they do recognize it they are not willing to acknowledge it affects their kata, and when they are finally convinced, they do not fix it anyway.
Cross-boundary de/allocations - mostly C specific, but also affects some C++ and NASM katas. The problem is that CW does not even define whether there is a memory boundary between solution and test suite. Is it allowed to malloc memory in a solution, return the allocated buffer to a test suite, and free it there? In many set-ups it means a crash right away. In CW it seems to work because of how the solution is built/linked/executed. Can this behavior be relied on?
Containers passed by value instead of const ref - also C++ specific, and really difficult to convince authors how much of a good practice violation it is, because "hey, it works, so STFU".
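For the first two items in the list above (strict floating point comparisons and rounding), a tolerance-based check is one common alternative. The sketch below uses Python's standard `math.isclose`; the helper name `approx_equal` and the tolerance values are assumptions for illustration, not a Codewars convention.

```python
import math

def approx_equal(actual, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Hypothetical test helper: compare floats within a tolerance.

    rel_tol handles large magnitudes; abs_tol handles values near zero,
    where a purely relative tolerance would be too strict.
    """
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)

total = 0.1 + 0.2

# A strict comparison fails because of binary floating point representation.
print(total == 0.3)              # False
# A tolerant comparison accepts the tiny calculation error.
print(approx_equal(total, 0.3))  # True
# Genuinely different values are still rejected.
print(approx_equal(1.0, 1.1))    # False
```

The right tolerance depends on the task, so a kata would ideally state it in the description rather than leave solvers to guess.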

And I am not talking about more general problems with kata design, missing specifications, missing test cases... There are much more common design issues which are constantly present in new and old katas.

When issue is identified, authors are not willing to fix it

After all, no one can force them, right? Many people get really defensive when something is pointed out to them, and they are really butthurt when they get a negative rating or an issue is raised. They treat issues as personal attacks and prefer to cover their ears and not acknowledge the problem rather than fix it. They treat katas as their property and do not allow any fixes or improvements. It's especially problematic when some seemingly easy (easy to solve, and easy to author) problem turns out to be more complicated and difficult than it initially seems - authors are really lost then because all they wanted to do was to multiply elements of an array, and they do not understand why for such a simple problem people raise issues like overflow, input mutation, performance, memory allocation, and others, equally abstract in their opinion. And it does not apply only to "easy" katas, because difficult ones, 3kyu+, have similar issues.

Current rating system is not balanced and it's misused

Having a kata downvoted a few times means it gets immediately retired and it cannot be corrected. Reviewers are for some reason not willing to point out issues and hold back with a vote until they are fixed, or to change the negative vote after issues are resolved. It sometimes happens that kata gets retired and author does not even get a word of feedback and is left clueless.

No easy way to improve existing, accepted katas

When an accepted kata has some identified issues, there's no common agreement how to proceed with fixing these:

There's no agreement how to decide what the common factors should be: should matrix addition be in-place addition, or should it return a copy?
It's not known whether kata behavior can be changed if it means invalidation of many solutions. Would it be OK to fix a long standing issue in Java version of "Make a spiral" kata if it would invalidate 112 solutions?
There's no way to get rid of katas which are recognized and acknowledged as low quality: they are duplicated, or have low satisfaction rating, but they are still there, polluting the site.
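On the matrix addition question in the list above, whichever behavior is chosen, a test suite can at least check it explicitly instead of leaving it unspecified. The Python sketch below shows both variants and a hypothetical `mutates_input` helper that detects whether a solution changed its arguments; all names here are illustrative assumptions.

```python
import copy

def add_matrices_copy(a, b):
    """Copy-and-add: returns a new matrix, leaves both inputs untouched."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def add_matrices_in_place(a, b):
    """In-place addition: accumulates b into a and returns a."""
    for row_a, row_b in zip(a, b):
        for i, y in enumerate(row_b):
            row_a[i] += y
    return a

def mutates_input(func, *args):
    """Hypothetical helper: True if calling func changes any argument."""
    snapshot = copy.deepcopy(args)  # deep copy, since matrices are nested lists
    func(*args)
    return args != snapshot

a = [[1, 2], [3, 4]]
b = [[10, 20], [30, 40]]

print(mutates_input(add_matrices_copy, a, b))      # False
print(mutates_input(add_matrices_in_place, a, b))  # True
```

A test suite could assert either result, depending on what the description requires; the important part is that the requirement is stated and checked consistently across language versions.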

There's no medium to resolve ambiguities, doubts and disagreements

Sometimes there is more than one direction to approach a problem, and it could be fixed in a few ways. Users often disagree whether some issue is an issue, if it's important, whether it should be fixed and how it should be fixed when more than one way is possible. When a shitstorm ensues on some kata again, there should be some user with a distinctive, red, bold username who steps in and cuts the crap with their definitive, final opinion (rewarding the most engaged sides appropriately). Currently, there's no one to ask, no one to get advice from, no one to complain to.

Kata Quality Factors #1646