codewars / codewars.com

Issue tracker for Codewars
https://www.codewars.com

Requirement for beta kata approval needs to be more transparent #997

Open Voileexperiments opened 7 years ago

Voileexperiments commented 7 years ago

I constantly get asked by people I know "when will my beta kata get approved?", and honestly, even I don't know.

I think a comprehensive list of "score required to make a kata able to be approved", alongside the table that shows all honor sources and privileges, would be helpful to pretty much everyone, especially power users. There are several things we'd really want to know:

  1. How much score (how many upvotes) is required before I can approve a kata?
  2. How do 'somewhat satisfied' and 'not satisfied' affect this process? Because I've seen katas like this which have way too many solves and upvotes and nevertheless still show "Testing & feedback needed".
  3. How do translations affect kata approval? Because again, while I appreciate GiacomoSorbi's effort to translate lots of beta katas into many languages, approving them seems to just make a kata impossible to move past beta, which makes some of us joke 'don't approve translations before your kata is approved'.
  4. To what extent does the number of already published katas affect kata approval? I've seen a table about it somewhere on CW or here (possibly by @kazk), but I couldn't find it again. Having this information handy would be really helpful.

ghost commented 7 years ago

agreed, i hated the process because it's not clear at all, i think it should be automated

kazk commented 7 years ago

@Voileexperiments

I've seen a table about it somewhere on CW or here (possibly by @kazk), but I couldn't find it again anymore. Having this information handy would be really helpful.

The following table is from https://github.com/Codewars/codewars.com/issues/866#issuecomment-288581775. Note that I never got a reply from @jhoffner on whether this is the correct interpretation, so this might be wrong.

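Required votes by number of approved kata:

| Level  | Satisfaction % | <10 | <20 | <30 | 30+ |
|--------|----------------|-----|-----|-----|-----|
| White  | 90+            | 12  | 10  | 8   | 6   |
| Yellow | 85+            | 10  | 8   | 7   | 5   |
| Blue   | 75+            | 8   | 6   | 5   | 4   |
| Purple | 70+            | 5   | 4   | 3   | 3   |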

Based on https://github.com/Codewars/codewars.com/issues/757#issuecomment-270100977 and rounding to nearest integer.
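For anyone who prefers code, the table can be read as a plain lookup. The following is a minimal Python sketch of that reading, not the actual Codewars implementation: the names `REQUIREMENTS`, `required_votes` and `meets_table` are made up for illustration, and it assumes the vote count passed in is already weighted by voting power.

```python
# Hypothetical lookup built from the table above; names are illustrative only.
# Each level maps to (minimum satisfaction %, required votes per author bucket).
# Buckets are the author's number of approved kata: <10, <20, <30, 30+.
REQUIREMENTS = {
    "white":  (90, (12, 10, 8, 6)),
    "yellow": (85, (10, 8, 7, 5)),
    "blue":   (75, (8, 6, 5, 4)),
    "purple": (70, (5, 4, 3, 3)),
}


def required_votes(level: str, approved_kata: int) -> int:
    """Return the vote count the table lists for this level and author history."""
    _, votes = REQUIREMENTS[level]
    # 0-9 -> bucket 0, 10-19 -> 1, 20-29 -> 2, 30 and above -> 3
    bucket = min(max(approved_kata, 0) // 10, 3)
    return votes[bucket]


def meets_table(level: str, approved_kata: int, satisfaction: float, votes: float) -> bool:
    """Check both table columns. Assumes `votes` is already weighted by voting power."""
    min_satisfaction, _ = REQUIREMENTS[level]
    return satisfaction >= min_satisfaction and votes >= required_votes(level, approved_kata)
```

For example, `required_votes("white", 0)` gives 12 and `required_votes("white", 30)` gives 6, matching the first and last columns of the White row.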


The starting point of the new documentation is awaiting @jhoffner's review and licensing. We can add more information about the beta process there.

ghost commented 7 years ago

@kazk this should be implemented as an algorithm

jhoffner commented 7 years ago

How much score (how many upvotes) is required before I can approve a kata?

This value is dynamic based off of overall ranking feedback and what voting privileges someone has. You can have up to 4x voting power. @kazk's chart looks to be correct.

How do 'somewhat satisfied' and 'not satisfied' affect this process? Because I've seen katas like this which have way too many solves and upvotes and nevertheless still show "Testing & feedback needed".

It only affects the satisfaction score. If a score is too low, a kata cannot be approved. The limits here are also based off of ranking feedback.

How do translations affect kata approval? Because again, while I appreciate @GiacomoSorbi's effort to translate lots of beta katas into many languages, approving them seems to just make a kata impossible to move past beta, which makes some of us joke 'don't approve translations before your kata is approved'.

If anything translations should help the process, because more languages means more users solving which means more feedback/voting. How have you seen translations hurting the process? You will note g964's kata tend to come out of beta very fast and it's because he translates his kata into so many languages.

To what extent does the number of already published katas affect kata approval? I've seen a table about it somewhere on CW or here (possibly by @kazk), but I couldn't find it again. Having this information handy would be really helpful.

@kazk's chart above shows this. The more kata you have approved, the less restrictive the system is on approvals, since you have shown (in theory) that you consistently produce quality content.

Voileexperiments commented 7 years ago

@kazk @jhoffner Thanks for the information. However, based on my actual observation the above system is not working quite well in practice:

So, I think maybe the required percentage should be reversed (75/80/85/90 instead of 90/85/80/75)? Because based on actual statistics, clearly white/yellow katas have lower overall score percentage than blue/purple katas.

kazk commented 7 years ago

@jhoffner

How have you seen translations hurting the process?

When the translations are low-quality and the translator isn't responsive. No notification, too busy, no longer active, etc. This is obvious, but the damage is catastrophic when one translator produces multiple low-quality translations and doesn't maintain them. g964's kata don't suffer from this because the author is the translator for most of the languages and maintains them well.


@Voileexperiments

In practice, white katas usually stay around 80% score even for good ones

The satisfaction % tends to drop after approval regardless of the color, and usually the easier ones drop more, but this is irrelevant when discussing the approval process.

Do you think an 8 kyu kata with a 75% satisfaction rating should be approved? Note that this number is during beta and votes are mostly from "power users".

I agree that the current system isn't ideal, but I haven't seen a proposal that's realistic to implement and works better.

Voileexperiments commented 7 years ago

Do you think an 8 kyu kata with a 75% satisfaction rating should be approved?

Ideally the one who approves the kata (i.e. a power user) should check the obvious things about the kata before approving. But it seems that some people approve beta katas without even solving said kata first :V The approval process is not as light as many of us think, and when approving a kata we really should be cautious and careful.

ghost commented 7 years ago

it's easy to fix -> have a threshold satisfaction and threshold number of upvotes -> if both thresholds are met approve -> if a kata or a translation fall beneath put into beta (if translation just put the translation in beta, if that's not possible then delete the translation)

10XL commented 7 years ago

Shouldn't the percentage that votes affect satisfaction rating scale like points? Meaning someone with 4x vote power would affect the satisfaction rating 4x.

For example, a new user's (1x vote power) negative vote would affect the satisfaction rating 25% as much as a negative vote from a user with 4x vote power. So if a kata that has a 100% satisfaction rating with 9 votes gets a negative vote from me, it will have 10 votes and a 90% satisfaction rating. If instead the kata received a negative vote from a new user, it would have 10 votes and a 97.5% satisfaction rating, since they would have a quarter of the vote power I have.
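The arithmetic in this example can be sketched in Python. This is only one reading of the suggestion, not how Codewars computes anything: it assumes every vote counts once toward the total, a positive vote scores 1, and a negative vote loses a share of its point proportional to the voter's power (power / 4). The names `satisfaction` and `MAX_POWER` are invented for the sketch.

```python
MAX_POWER = 4  # highest voting power mentioned in this thread


def satisfaction(votes: list[tuple[bool, int]]) -> float:
    """Return a satisfaction % from (is_positive, voter_power) pairs.

    Assumed model: a positive vote scores 1; a negative vote scores
    1 - power / MAX_POWER, so a 4x voter's downvote scores 0 and a
    1x voter's downvote scores 0.75.
    """
    if not votes:
        raise ValueError("no votes yet")
    total = sum(1.0 if positive else 1.0 - power / MAX_POWER for positive, power in votes)
    return 100.0 * total / len(votes)


nine_upvotes = [(True, 4)] * 9
print(satisfaction(nine_upvotes + [(False, 4)]))  # tenth vote: downvote from a 4x user
print(satisfaction(nine_upvotes + [(False, 1)]))  # tenth vote: downvote from a 1x user
```

Under these assumptions the two calls reproduce the 90% and 97.5% figures from the example. A true weighted average, where upvotes are also scaled by power, would give different numbers.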

DonaldKellett commented 7 years ago

With regards to beta Kata approval, I would like to bring this suggestion back into the light. The system as described in the proposal should minimize the chances of an irresponsible power user / moderator approving a substandard Kata that is overrated by "new" users.

DonaldKellett commented 7 years ago

it's easy to fix -> have a threshold satisfaction and threshold number of upvotes -> if both thresholds are met approve -> if a kata or a translation fall beneath put into beta (if translation just put the translation in beta, if that's not possible then delete the translation)

Automating the process seems like a viable idea. @jhoffner perhaps this should be considered in the future. Only thing to note though if the process really does become automated - the thresholds for "automatic approval" would need to be even higher, e.g. 95%/90%/85%/80% and the minimum number of upvotes would have to at least double. We don't want the system to "objectively" approve a substandard Kata that just happened to receive an influx of upvotes from inexperienced Codewars users subjectively voting on the Kata based solely on their enjoyment/satisfaction of solving it and not on the quality.

@ericswenson15 As for putting a Kata back into Beta when the satisfaction score drops below the threshold, consider the impacts it would have on the users' ranks and honor points. If a high-kyu approved Kata is put back into Beta by the system then the users who have solved it would immediately lose the huge amounts of rank and honor they gained when solving it as an approved Kata which would lead to a large number of frustrated and confused users.

ghost commented 7 years ago

true it would impact their rank and honor negatively, they would drop in honor for sure. very necessary if they write a kata that doesnt meet the standards of other users on codewars

ghost commented 7 years ago

this group would not be large, and is currently greatly outnumbered by those experiencing frustration with the approval process. also it's totally necessary, if they write a kata which used to be good enough and now no longer is, then their honor should be docked, and restored if they get it back out of beta. it's idea capitalism, survival of the fittest

ghost commented 7 years ago

if enough people enjoy the kata then that's a good reason to have it approved. it's not all about production level quality, there is also idea exploration, fun, exploration of libraries, and other things which should be approved if enough users think it's good

DonaldKellett commented 7 years ago

I was talking about the rank and honor of the users who have solved the Kata not just the author. Consider how the average solver would feel if he/she suddenly lost a lot of rank and honor just because a Kata they solved was put back into Beta (which would entirely be the author's fault, not the solver's)?

DonaldKellett commented 7 years ago

they probably won't notice

I can prove from experience that that's certainly not the case. Even though I currently have like 25k+ honor, I most certainly notice it when it drops from something like 25750 to 25748.

ghost commented 7 years ago

they will understand if they've tried to create a kata and if they havent they are relatively much less valuable to the platform

10XL commented 7 years ago

What would be the criteria to approve a kata? Can an automated process catch duplicate katas? How would it evaluate what is a duplicate? Automating the process sounds complicated.

There are already 3425 approved katas, there isn't a shortage. Though there can't be too many high quality katas. I trust the people who have completed a large amount of katas/moderators to have higher standards and more insight than an automatic approval system could. I doubt there is enough time/resources to create such system either.

I agree with @Voileexperiments. There should be a clear standard that katas should meet before approval. Randomized tests are a significant one that I've seen, but the other requirements should be outlined somewhere officially, suitably in the new documentation as suggested by @kazk.

ghost commented 7 years ago

duplicates should be handled how youtube handles copyrights, that the original can call out others for copying or being duplicates, but if he or she doesnt then they get left alone

ghost commented 7 years ago

automating is easy, have 2 if statements that checks the number of votes and the ratio of likes to dislikes and if these 2 numbers are above a certain threshold they are approved and if not they are put back into beta
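Taken literally, this amounts to something like the Python sketch below. The threshold values are placeholders, not numbers anyone in this thread proposed, the function name `auto_status` is hypothetical, and "ratio of likes to dislikes" is read here as the share of positive votes. It does nothing about duplicates or open issues, which were raised as concerns earlier in the thread.

```python
def auto_status(upvotes: int, downvotes: int,
                min_votes: int = 10, min_ratio: float = 0.9) -> str:
    """Return "approved" or "beta" using only a vote count and a like share.

    min_votes and min_ratio are placeholder thresholds for illustration.
    """
    total = upvotes + downvotes
    if total < min_votes:              # first "if": not enough votes yet
        return "beta"
    if upvotes / total < min_ratio:    # second "if": share of likes too low
        return "beta"
    return "approved"
```

With these placeholder thresholds, `auto_status(20, 1)` returns `"approved"`, while `auto_status(3, 0)` and `auto_status(8, 4)` both return `"beta"`.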

dinglemouse2250 commented 7 years ago

it's not all about production level quality

Disagree. Quality does matter. If you lower the bar then all that will happen is every man and his dog will just publish awful Kata just because they can; and the entire site deteriorates.

On the contrary, I think inactive Kata (Beta or not) with low satisfaction (below some threshold) should be put to sleep. Users who already solved them can keep their points but hibernated Kata should not be visible anymore unless the author updates and re-publishes it.

DonaldKellett commented 7 years ago

On the contrary, I think inactive Kata (Beta or not) with low satisfaction (below some threshold) should be put to sleep.

I think that is already done by occasionally retiring Kata, though I agree that it should perhaps be done a bit more often, especially at the lower levels.

10XL commented 7 years ago

[image]

Voileexperiments commented 7 years ago

There are some beta katas (and some broken JS katas, see #940 for a comprehensive list) that really should be retired or something.

However, there are also lots of katas that are actually pretty okay, but nevertheless still get stuck because of the dead weight (protip: there are lots of them). Putting them into hibernation is obviously a bad idea, but in the current system they're just going to stay at the status quo forever, and there's nothing anyone can do to offset the satisfaction rating by much.

smile67 commented 7 years ago

Thanks @Voile for your comment to my feature request some hours ago. So my question/suggestion once again, i think it's the right place here (@jhoffner): What about an entry like "Waiting for approval" at "Kata Search -> Sort by:"? For example at the moment i completed 1512 beta katas and probably some/a lot of them are ready for approval - but i don't know which. So this point would be helpful to make them visible;-)?!

jhoffner commented 7 years ago

Wow, this is a chatty thread. Kinda hard to extract the useful bits but I'll try to summarize

When the translations are low-quality and the translator isn't responsive. No notification, too busy, no longer active, etc. This is obvious, but the damage is catastrophic when one translator produces multiple low-quality translations and doesn't maintain them. g964's kata don't suffer from this because the author is the translator for most of the languages and maintains them well.

I can see that, though this just makes me think that a kata should not be able to be approved if there are pending translations. This of course would cause kata to take even longer to come out of beta, which sucks - but it would force pre-approval translations to go through a more proper QA process. Of course translations having their own beta process would be ideal here but it's a lot of work and I don't have the bandwidth yet.

Forcing translation approvals could lead to authors rejecting translations just so they can have their kata approved so IDK if this is a great idea.

but also people that put in "somewhat satisfied" votes (I guess they think the kata is too easy) and then run away forever. This dead weight easily drags down the overall score

I agree, something needs to be put in place to try to separate the idea of them not liking the kata due to issues vs them just not liking the kata. We could try to make the concepts more well defined but I'm sure frustrated users will downvote it due to issues regardless.

Perhaps one change that could be introduced is a special notification for when an issue is closed, prompting users to reconsider their vote if it was a negative one.

automating is easy, have 2 if statements that checks the number of votes and the ratio of likes to dislikes and if these 2 numbers are above a certain threshold

Standards used to be static but were changed. If the standards were static, those standards would be too high and cause higher ranked kata (purple) to take forever to come out of beta. Ideally the standard could be higher for purple/blue kata but it just isn't practical. The system became more complicated over time to counteract beta purgatory while trying to minimize low quality standards. The ability to quickly reason about how many more votes are needed was a compromise that was made for this.

As for an automated process - I don't think the actual approval stage is the major bottleneck, and adding a human element to at least give a chance of catching things that can't be determined automatically (or at least not without a ton of new work to do automatically) is better than nothing.

if it falls beneath a certain amount of upvotes it gets put back into beta

Satisfaction score and issue count are enough to indicate that a kata isn't great. However, I could see value in sending out a notification to an author that their kata was demoted due to quality issues, as that would encourage them to fix it faster. But issues are often for a specific post-approval translated language, and I don't see why solvers of other languages should be docked their honor/ranking because of this.

I think this issue is hard to iterate on without improving translations by them having their own beta process - which is ideal but a lot of work and not likely to happen any time soon.

Another feature that would be useful is tracking which language an issue was logged against. This of course would only be useful if there was a way to demote a specific language within a kata.

Shouldn't the percentage that votes affect satisfaction rating scale like points? Meaning someone with 4x vote power would affect the satisfaction rating 4x.

IDK maybe but I'm leaning towards no. Having a power user's opinion count more towards getting something out of beta makes sense to me, but having their opinion matter more than others when it comes to "general satisfaction" doesn't. I realize it might not be the most obvious distinction - but for approvals we are relying on a user's CW experience to help us determine if something reaches our standards. Experienced users will have a better understanding of this. The overall satisfaction score however is more gauging overall user enjoyment of the kata regardless of how experienced with CW they are, which makes sense since experienced or not, someone's opinion should matter the same in terms of how much they enjoyed something.
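To make the distinction concrete, here is a small Python sketch that computes the two numbers separately. It is an illustration under assumptions, not the site's code: it assumes approval progress is the sum of the voting power of positive votes, and that satisfaction is an unweighted share of positive votes. The function names are hypothetical.

```python
def weighted_upvotes(votes: list[tuple[bool, int]]) -> int:
    """Approval progress: each positive vote counts with the voter's power (1x to 4x)."""
    return sum(power for positive, power in votes if positive)


def plain_satisfaction(votes: list[tuple[bool, int]]) -> float:
    """Satisfaction %: every voter counts the same, regardless of power."""
    if not votes:
        raise ValueError("no votes yet")
    return 100.0 * sum(1 for positive, _ in votes if positive) / len(votes)


# (is_positive, voter_power) pairs: one 4x upvote, one 1x upvote, one 1x downvote
votes = [(True, 4), (True, 1), (False, 1)]
print(weighted_upvotes(votes))    # power matters here
print(plain_satisfaction(votes))  # power is ignored here
```

In this sketch the 4x user moves the approval count four times as far as the new user, while both have the same pull on the satisfaction percentage.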

ghost commented 7 years ago

I don't think the actual approval stage is the major bottleneck - is absolutely delusional

ghost commented 7 years ago

totally disagree the human element is the most annoying part, and it's easy to automate just use user satisfaction

ghost commented 7 years ago

adding a human element to at least give a chance of catching things that can't be determined automatically (or at least not without a ton of new work to do automatically) is better than nothing - this is the worst part it's subjective and slow and annoying and frustrating and everyone hates it besides the approvers

jhoffner commented 7 years ago

Unfortunately I had to block eric for obvious reasons that I'm sure everyone else understands. I hate having to block people and don't do so lightly, but I don't think any of us have time for this nonsense.

Edit Note: I have since deleted many of the comments on this thread that made it noisy and unproductive. You can see a screenshot of the original thread here.

To confirm that the rest of us are on the same page: this thread, as I understand it, is about getting to the approval phase of the beta process. The beta process up to that point is the bottleneck (obviously); the approval phase - the phase that once reached requires someone to manually approve a kata - has no need for automation, as the community has gotten pretty good at keeping on top of that. We only have 16 kata currently in this phase. This phase is purely a last line of defense, letting someone experienced in the community give the kata one last look over. Unless someone can offer strong evidence that this phase is actually a bottleneck, I think it's most productive to discuss other parts of the process and how we can keep quality high while overcoming some of the issues that cause kata to remain in beta for too long.

As for the true original purpose of this thread, which is to document how the system currently works - this should definitely be included within the new documentation site which is a WIP.

DonaldKellett commented 7 years ago

Unfortunately I had to block eric for obvious reasons that I'm sure everyone else understands.

Good riddance. However, next time, before blocking a user, consider reporting him/her directly to GitHub, which I have just done.

Voileexperiments commented 7 years ago

Speaking of which, should I open a new issue about (maybe) revising the current beta approval requirements so as to facilitate a more coherent discussion?

kazk commented 7 years ago

I think we should also collect past proposals somehow so that we can build on them and avoid people coming up with already discussed ideas.

jhoffner commented 7 years ago

So, I think maybe the required percentage should be reversed (75/80/85/90 instead of 90/85/80/75)? Because based on actual statistics, clearly white/yellow katas have lower overall score percentage than blue/purple katas.

@Voileexperiments as @kazk pointed out, would you want a 75% white kata to be approved? I understand there is the issue with dead votes that stay around after a fix is made, but that notwithstanding.

I guess what I'm getting at, maybe some kata just aren't good enough right? Is anyone consistently seeing kata lower than say, 85% that are actually really good?

should I open a new issue about (maybe) revising the current beta approval requirements so as to facilitate a more coherent discussion?

Probably a good idea, this is likely to be an ongoing brainstorm.

Voileexperiments commented 7 years ago

@jhoffner Given the usual flow for beta katas, "bad beta katas lower than 90% get approved quickly" should seldom happen:

  1. Most of the time, new beta katas are solved by regular power user residents first, and not by lots of other users. Power users rate accurately AFAIK, meanwhile the dangling votes are from other users. So a bad kata would have an incredibly bad score, while good katas can have any score between 80% and 100%, especially if the kata suffers from early issues (or the kata is 'too easy')
  2. The power users will file issues and suggestions if there is room for them. A kata can't be approved with unresolved issues, so unless somebody erroneously resolves all the issues while they're not fixed, the kata will be stuck in beta
  3. We still need a 6k moderator to approve a kata. If a moderator doesn't look at the discourse for possible problems, solve the kata first to go through it, check if random tests exist, etc... then obviously the moderator is not doing his job very well
  4. As a last resort, 6k moderators can always edit the kata and fix the stuff themselves. Maybe we can add an "un-approve" functionality there, but as long as the first 3 points are working properly, we shouldn't need to reach this point

Anyways, my original point is: for blue/purple katas only a few votes are required for approval, and nobody except power users will solve them while in beta anyway, so having a lower satisfaction rating requirement than white/yellow katas does not have any noticeable effect. Meanwhile, too many white/yellow katas that are quite decent are nevertheless stuck in beta due to the high 90% rating requirement. Having lots of dangling katas in beta that have worth but will never be approved seems like tying up everybody's resources to me.

jhoffner commented 7 years ago

I'm not really sure how much what I'm about to say plays into the issues we are discussing, but it feels like "How satisfied are you with this kata and its quality?" needs to change. "Quality" is misleading, at least in the way that it probably comes off. If you have an issue, you should file an issue. Issues can be fixed. If you think the kata is great but has a basic issue that can be fixed, then you should be marking it as "very satisfied" but also filing an issue I would think. However I don't know if anyone would think that way with how it is now. If you think the content of the kata is poor, or it's just so poorly implemented that it would have to be redone from the ground up, then yes I think the satisfaction rating should reflect this.

If you think something is a pretty awesome kata but you just found a small flaw, what would you do in that situation?

I bring this up because as we talk about adjusting the satisfaction quality bar (which was arbitrarily set in the first place), we are now incorporating real user data, but I'm feeling like that data is somewhat tainted by the overloading of the use-case (General Kata Concept quality vs Kata Implementation Qualities that can be easily fixed).

I guess to a degree I feel like lowering the quality limit might make sense, but on the other hand it might just be us sticking on a band-aid to a rating feature that needs to be improved.

A few questions I have, since you guys know the content better than I do these days:

I just kind of spit these out. They are all very subjective data to collect, so I'm not sure how much we can rely on only a few of us sharing our thoughts, but it would be nice to get a general feel for where we are at content-wise (especially with all of the newer content).

I also wonder if something like a simple boolean "Assuming all issues were fixed, would you recommend this kata to a friend?" rating would be better.

Beyond all of this, I also wonder how much of this would just be solved by adding some tweaks here and there to encourage power users to fix more kata. One idea is to make it easier to find kata waiting for issues to be fixed along with a reward for fixing the kata. The main issue there is how much of a rabbit hole would it be to reward power users for fixing a kata and do it in a way where they can't just close an issue and get points without actually doing the work.

Power users rate accurately AFAIK, meanwhile the dangling votes are from other users.

Do the rest of you agree with this observation? If that's the case then maybe it does make sense to incorporate 2x/3x/4x voting into the satisfaction score.

jhoffner commented 7 years ago

Another question for those who have been around long enough to know. Did the somewhat recently added satisfaction score actually improve the quality/process or make it worse in your opinion?

Voileexperiments commented 7 years ago

I'm not really sure how much what I'm about to say plays into the issues we are discussing, but it feels like "How satisfied are you with this kata and its quality?" needs to change. "Quality" is misleading, at least in the way that it probably comes off. If you have an issue, you should file an issue. Issues can be fixed. If you think the kata is great but has a basic issue that can be fixed, then you should be marking it as "very satisfied" but also filing an issue I would think. However I don't know if anyone would think that way with how it is now. If you think the content of the kata is poor, or it's just so poorly implemented that it would have to be redone from the ground up, then yes I think the satisfaction rating should reflect this.

Unfortunately the UI is suggesting the contrary:

[image]

kazk commented 7 years ago

@jhoffner

it feels like "How satisfied are you with this kata and its quality?" needs to change.

I agree. We need objective feedback during the beta process for QA, and satisfaction is subjective by definition. The minimum criteria for a qualified kata, to be followed by every participating user, need to be officially defined.

If you think the kata is great but has a basic issue that can be fixed, then you should be marking it as "very satisfied" but also filing an issue I would think. However I don't know if anyone would think that way with how it is now.

I do something like that.

                   ┌─────────┐
       ┌──────────▶│Beta Kata│◀──────┐
       │           └─────────┘       │
       │                │            │
       │                ▼            │
   ┌──────┐         ┌───────┐    ┌───────┐
┌─▶│Report│◀────────│Broken?│    │Suggest│◀─┐
│  └──────┘         └───────┘    └───────┘  │
│                       │                   │
│   ┌────┐    ┌─────┐   │                   │
│   │Rank│◀───│Solve│◀──┘                   │
│   └────┘    └─────┘        ┌────┐   ┌───────────┐
│      │                 ┌──▶│ +1 │──▶│Improvable?│
│      ▼                 │   └────┘   └───────────┘
│  ┌──────┐           ┌────┐                ▲
└──│Issue?│──────────▶│Rate│────────────────┘
   └──────┘           └────┘
                         │   ┌────┐
                         └──▶│ -1 │
                             └────┘

I usually don't vote "Somewhat Satisfied". Sometimes there are katas that I personally disagree with or dislike, but I don't think the author deserves the penalty.


Power users rate accurately AFAIK, meanwhile the dangling votes are from other users.

Do the rest of you agree with this observation? If that's the case then maybe it does make sense to incorporate 2x/3x/4x voting into the satisfaction score.

I agree that power users rate accurately based on their standards at the time. So whenever a power user downvotes, the kata usually deserves it. But if "dangling votes" means negative votes persisting even after the root issues are resolved, I think power users do contribute to them, because they usually move on to the next kata and don't come back.


I'll try to answer the other questions tomorrow, after thinking about them a little more, if I have something to say.

DonaldKellett commented 7 years ago

A few questions I have, since you guys know the content better than I do these days

Glad you asked, would love to answer your questions and express my opinion :smile:

How often do you think you see good kata (not just kata with no issues, but actually good/useful kata) rated way too low in your opinion? (let's say lower than 80%).

Almost never, especially below 4 kyu. The converse (i.e. bad quality Kata receiving a 90%+ satisfaction rating) happens way too often though. Just a few days ago, I seem to remember having clicked into a Beta Kata that had 100% satisfaction so far and attempted/completed it, only to find out that it was quite lacking in terms of test coverage (I think maybe around 6 fixed assertions in the entire test suite?). Needless to say, I immediately gave it a "Not Satisfied" vote and raised an Issue regarding that. In terms of concept though, I'd say it was fair but not particularly outstanding so my statement still holds true.

How often do you think poorly conceived or just plain useless kata are making it out of beta?

Quite often; I'd say at least 20% of the time. Quite recently, there was a Kata involving returning an array of the first m multiples of n that got approved but I highly doubt that that Kata was the first of its type. And every once in a while (maybe one Kata every week or two on average?) a Kata that lacks both originality and quality gets approved for some reason.

Which is a stronger feeling to you: You feel too many great kata are stuck in beta, or you feel like too many bad kata are making it out of beta?

Definitely the latter. The former still sometimes happens, though. For example, I have an entire "Reflection in PHP" series which is still stuck in Beta and is not gaining new completions lately probably because (1) PHP is not very popular on Codewars and (2) my Kata descriptions tend to be a bit long so many users may not have the patience to read the whole Description, despite the fact that I very clearly separate my "Lesson"s and "Story"s (sic) from the Kata "Task". However, I think we shouldn't worry too much about the former because there is already an abundance of approved high-quality Kata and efforts to attempt to make it easier for Kata to get out of Beta will probably do more harm than good.

Of the kata that you feel like are bad, is it more often that they seem bad due to their underlying concept/implementation being too weak or because they just have too many issues with them in general (but the concept it's trying to challenge you on is sound?)

To be honest, I've seen both types of bad Kata a lot of times and at least a third of the time, Kata that are lacking in concept/originality are also very poorly implemented. Such "doubly bad" Kata usually arise from new, inexperienced Codewars users who more often than not complete no more than a few Kata (say, fewer than 100), decide that they really enjoy this Codewars platform (which is a good thing), and then start experimenting with the Kata Editor by authoring their own Kata. The only thing is that when they experiment with the Editor, they usually either (1) copy ideas that they have seen in Kata they have completed (thus resulting in a direct duplicate) or (2) author a Kata centred around a well-known computer programming topic (such as FizzBuzz), which again almost guarantees that said Kata is a duplicate. Furthermore, since this is the first time that the new, inexperienced Codewars user has authored a Kata, chances are that he/she is completely unaware of the (poorly documented) rules of the Codewars Community and therefore ends up publishing a Kata that lacks fixed assertions, edge cases and random assertions. Then, when confronted with these Issues by experienced Codewars members, they usually respond emotionally and refuse to fix the problem. Which is understandable, seeing as a new user would think, "Hey, I just spent 20 minutes [experienced users will know that 20 minutes is far from enough for authoring a high-quality Kata] of my time authoring a Kata and contributing to the Community, why are you blaming me instead of thanking me?"

The other (proverbial) two-thirds of all bad Kata are authored by either slightly more experienced Codewars users or power users (yes, it does happen, sometimes). However, at least 95% of more experienced users are well aware of the (poorly documented) rules of the Codewars Community with respect to Kata quality so Kata that are outstanding in terms of content and concept but lacking in implementation (e.g. test coverage) are extremely rare. Usually, the case for a bad Kata is that a power user is out of ideas but still wants to author a Kata for the honor (and potential rank) gain so he/she authors a useless Kata with excellent test coverage and such.

DonaldKellett commented 7 years ago

I'm not really sure how much what I'm about to say plays into the issues we are discussing, but it feels like "How satisfied are you with this kata and its quality?" needs to change. "Quality" is misleading, at least in the way that it probably comes off. If you have an issue, you should file an issue. Issues can be fixed. If you think the kata is great but has a basic issue that can be fixed, then you should be marking it as "very satisfied" but also filing an issue I would think. However I don't know if anyone would think that way with how it is now. If you think the content of the kata is poor, or it's just so poorly implemented that it would have to be redone from the ground up, then yes I think the satisfaction rating should reflect this.

I usually interpret the "quality" of a Kata as how well it is implemented in terms of an unambiguous Kata Description, code coverage etc., and as @Voileexperiments said, that impression is more or less from the UI itself. If I see a Kata with 20 or less unique, distinct fixed assertions and no random assertions, I almost always vote "Not Satisfied" (>99%) without thinking twice and raise an Issue at the same time. Otherwise, I usually vote "Very Satisfied". Sometimes, when a Kata is well implemented but I think the concept is very crappy and I didn't have a good time solving it, I would rate it as "Somewhat Satisfied" but I almost never choose "Not Satisfied" purely due to bad content and not bad implementation.

kazk commented 7 years ago

@jhoffner

Probably a good idea, this is likely to be an ongoing brainstorm.

Can I open a new issue? I'm tired of scrolling unnecessarily :(

I wrote a small program that uses the GitHub API to retrieve all comments and filter them. It then outputs each comment prefixed with the user and a link to the original comment in this issue.

@username commented <- "commented" will be a link to the original comment

comment body

I'd like to be able to link to a specific comment, so I'm going to add a feature that posts each comment separately using POST /repos/:owner/:repo/issues/:number/comments.
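For illustration only, the approach described above could look roughly like the Python sketch below. This is not the actual program: the helper names (`filter_comments`, `format_comment`, `build_post_request`) and the filtering rule (dropping comments by a given list of logins) are assumptions made for the example. It relies only on the `user.login`, `html_url` and `body` fields that the GitHub issue comments endpoint returns, and it builds the POST request without sending it.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def filter_comments(comments, excluded_logins):
    """Drop comments written by any login in excluded_logins (hypothetical rule)."""
    excluded = set(excluded_logins)
    return [c for c in comments if c["user"]["login"] not in excluded]


def format_comment(comment):
    """Render one comment as '@user [commented](link)' followed by its body."""
    login = comment["user"]["login"]
    return f"@{login} [commented]({comment['html_url']})\n\n{comment['body']}"


def build_post_request(owner, repo, number, body, token):
    """Build (but do not send) POST /repos/:owner/:repo/issues/:number/comments."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/issues/{number}/comments"
    return urllib.request.Request(
        url,
        data=json.dumps({"body": body}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would actually post the comment; keeping construction separate from sending makes it easy to inspect each reposted comment before anything is written to the new issue.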

What do you think? People can continue discussing on this issue while I work on this.

jhoffner commented 7 years ago

Can I open a new issue? I'm tired of scrolling unnecessarily :(

We can, or I can go back and delete all of the drama that happened above. I usually come to this thread from a "view in github" email link so it scrolls for me.

This issue is off-topic from its original purpose so we can give it a shot and see if it's readable.


Thanks for the feedback, everyone. It seems like everyone is using the voting a little bit differently (which is expected), and in some cases very differently (which is not ideal). I was trying to go back over some of the internal discussions around this feature, and it turns out they were all the way back on July 1st, so almost a year ago. Back then we had a system where during beta we asked "ready, minor issue, major issue", and after beta you could simply upvote; ready votes and upvotes were part of the same system. So this change was originally meant to structure voting into a single system, with a score that was more useful than upvotes, since 100 upvotes weren't necessarily better than 20 upvotes (the kata may just have been around longer, and most upvotes came from the ready beta voting). Another problem was that if you wanted to downvote a kata, you had to pick minor/major issue, and we then forced you to specify an issue. A lot of users had the same issue as others, so we ended up with tons and tons of duplicate issues being filed.

So in retrospect I'm still happy with the satisfaction score relative to how it was before, but I think clearly it's still not good enough.

If I see a Kata with 20 or less unique, distinct fixed assertions and no random assertions, I almost always vote "Not Satisfied" (>99%) without thinking twice and raise an Issue

I can totally understand why you would do this, @DonaldKellett, and looking back at the original reasons for making this change, it doesn't seem too far off from how this feature was intended. I think over time I have forgotten those reasons, and the distinction between a "would you recommend this to a friend?" type of satisfaction and a "do you think this kata's test/language quality is good enough?" type of satisfaction has become more apparent (as in, there needs to be a bigger UX distinction between them).

Sometimes, when a Kata is well implemented but I think the concept is very crappy and I didn't have a good time solving it, I would rate it as "Somewhat Satisfied"

This is what I'm trying to get at. I personally think that the satisfaction score should reflect how many users actually enjoyed the kata and found it valuable, not necessarily just how well it is constructed. If you thought the kata sucked, personally I think you should be giving a "somewhat satisfied" at best (though we may just want to kill the 3-vote system since "somewhat" is seldom used anyway).

We need objective feedbacks during beta process for QA and satisfaction is subjective by definition

I agree, and this point also illustrates the difference between voting while in beta and voting out of beta. I would still like to find a system that works for both, to keep things simple and consistent - but it's a challenge because the contexts are not the same.

I'm leaning towards getting rid of generic voting altogether and asking a few basic questions, which may expand into follow-ups depending on the answer (so I guess a mini exit survey).

The 2nd top level question (recommend to a friend) may not be necessary if "general concept is not useful" makes sense as an issue.

kazk commented 7 years ago

@jhoffner

We can, or I can go back and delete all of the drama that happened above.

Deleting might be good enough, definitely easier. Maybe save the current state (screenshot/save as PDF etc.) for the record?

I usually come to this thread from a "view in github" email link so it scrolls for me.

Yeah, but what about users who haven't participated yet? Seeing those comments is really discouraging and uncomfortable. People might not even scroll to your comment about blocking him.

I didn't really get the recent "code of conduct" thing, but I understand it better now :( The community can get hurt and it's hard to recover. For example, I think the mood of Gitter channel changed completely and many people just left. I did because some of his off-topic comments offended me and the amount of noise became unbearable. Why discuss political/racial opinions on Codewars channel? :/ Aggressive tone doesn't help :(

I know you're busy and don't have time to monitor how users behave, so I'd recommend coming up with official rules (see also #187) before toxic behaviors kill the community. Maybe something minimum like Ruby's is enough for now.

jhoffner commented 7 years ago

Ugh, well now the thread just has a million deleted comment line items.

I'd recommend coming up with official rules (see also #187) before toxic behaviors kill the community. Maybe something minimum like Ruby's is enough for now.

Yeah, I agree. Though the docs section of CW is terrible, so we also need to prioritize getting the new docs site ready. I'm out of town again next week for a workshop, and I've still got a lot to catch up on. I want to help contribute some UI updates to the docs you built so that we can get that out ASAP.

kazk commented 7 years ago

I'll open a new issue focused on discussing kata QA. Since I haven't figured out how to post comments using the API, I'll just copy & paste ~40 comments manually.

jhoffner commented 7 years ago

It's fine, this thread has been cleaned up, we can start a new one and just reference this one in case anyone feels like getting fully caught up to speed. No need to copy and paste comments (maybe we can cherry pick a summary of the key points?)