codewars / content-issues

Higher level issue tracker for the Codewars content.

Deduplicate - Caesar cipher + ROT13 + Atbash #203

Open hobovsky opened 1 year ago

hobovsky commented 1 year ago

From wiki list

Caesar Cipher

  1. Caeser Encryption

    • 7 kyu
    • Input is lowercase letters and some special characters. Only lowercase letters are encoded. The step is variable and can be larger than the alphabet size, so solvers need to reduce it modulo the alphabet length. Characters from outside of the allowed alphabet should be passed through unmodified. Encryption only.
    • Satisfaction 81% with ~390 completions
    • 1 pending issue
    • 3 languages, no pending translations
    • Published Mar 2016, author inactive
  2. Dbftbs Djqifs

    • 6 kyu
    • Input is lowercase letters, uppercase letters, digits, and some special characters. Only lowercase and uppercase letters are encoded. The step is variable and can be larger than the alphabet size or less than 0, so solvers need to reduce it modulo the alphabet length. Characters from outside of the allowed alphabet should be passed through unmodified. Encryption only.
    • Satisfaction 91% with ~2300 completions.
    • 1 pending issue
    • 5 languages, no pending translations.
    • Published Nov 2014, author inactive.
  3. Caesar Cipher Helper

    • 5 kyu
    • Input is lowercase and uppercase letters, no characters out of alphabet. Shift is variable in range [1, 26]. Solution is a class with encryption and decryption ops.
    • Satisfaction 91% with 3600 completions.
    • 4 pending issues.
    • 5 languages, no pending translations.
    • Published Oct 2013, author inactive.
  4. Weird words

    • 7 kyu
    • Input is lowercase and uppercase letters, spaces, and some punctuation. Encodable chars are only letters. Shift is fixed to be 1. Only encryption.
    • Satisfaction 89% with ~1300 completions.
    • 2 pending issues.
    • 5 languages, no pending translations.
    • Published Aug 2016, author inactive.
  5. Move 10

    • 7 kyu
    • Input is only lowercase letters. Shift is fixed at 10. Encryption only.
    • Satisfaction 92% with 4000+ completions.
    • No pending issues.
    • 13 languages, no pending translations.
    • Published Sep 2016, author active.
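
The five Caesar kata above differ mainly in the alphabet they handle, in whether the shift is fixed or variable, and in whether characters outside the alphabet pass through. The following is a minimal Python sketch of the most general variant. The function name `caesar_shift` and the ASCII-only alphabet are assumptions made for illustration, not the signature required by any of the listed kata.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, wrapping within the alphabet.

    Hypothetical helper for illustration; not the API of any listed kata.
    """
    result = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            # Digits, spaces, punctuation: passed through unmodified.
            result.append(ch)
            continue
        # Python's % yields a non-negative result for a positive modulus,
        # so negative shifts and shifts larger than 26 both wrap correctly.
        result.append(chr(base + (ord(ch) - base + shift) % 26))
    return "".join(result)


def caesar_unshift(text: str, shift: int) -> str:
    """Decryption is encryption with the opposite shift."""
    return caesar_shift(text, -shift)
```

Under these assumptions, `caesar_shift("Caesar Cipher", 1)` returns `"Dbftbs Djqifs"`, which is the title of kata 2, and a fixed-shift kata such as "Move 10" corresponds to calling the same function with a shift of 10.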

ROT13

Like Caesar, but shift is fixed to 13, and decryption is identical to encryption.

  6. ROT13

    • 5 kyu
    • Input is lowercase, uppercase letters, digits, and punctuation. Only letters to be encoded.
    • Satisfaction 89% with 18k+ completions.
    • No pending issues.
    • 14 languages + 7 pending translations.
    • Published Aug 2013, author inactive.
  7. Rot13

    • 5 kyu
    • Input is lowercase, uppercase letters, digits, and punctuation. Only letters to be encoded.
    • Satisfaction 89% with 76k+ completions.
    • 4 pending issues.
    • 16 languages + 7 pending translations.
    • Published Feb 2014, author inactive.
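
The property that ROT13 decryption is identical to encryption can be checked with a short sketch. This one leans on the `rot_13` text codec from Python's standard library; the wrapper name `rot13` is an assumption for illustration, and neither kata necessarily expects this approach.

```python
import codecs


def rot13(text: str) -> str:
    """Apply ROT13 to ASCII letters; everything else is left untouched."""
    return codecs.encode(text, "rot_13")


message = "Hello, World 123!"
encoded = rot13(message)

# Shifting by 13 twice is a shift by 26, i.e. a full turn of the alphabet,
# so applying the same function again restores the original text.
assert encoded == "Uryyb, Jbeyq 123!"
assert rot13(encoded) == message
```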

Atbash Cipher

Atbash is not equivalent to a Caesar cipher with a shift (it reverses the alphabet instead of rotating it), but it is often seen as equally dull. Whether Caesar and Atbash are similar enough to keep only one of them, or different enough to keep both, is one of the points to be discussed in this ticket.

  8. Decoding a message

    • 7 kyu
    • Input is lowercase letters and spaces
    • Satisfaction 94% with ~2k completions
    • 1 pending issue
    • 5 languages + 1 pending translation
    • Published Nov 2015, author active.
  9. Emily's Eccentric Encoding

    • 7 kyu
    • Input is uppercase and lowercase letters, spaces, punctuation. Only lowercase letters are to be encoded.
    • Satisfaction 92% with 371 completions.
    • No pending issues.
    • 2 languages + 1 pending translation.
    • Published Dec 2016, author inactive.
  10. Atbash Cipher Helper

    • 6 kyu
    • Requested operations are encryption and decryption (even though they are identical). Alphabet is given as a parameter.
    • Satisfaction 91% with 600 completions.
    • 3 pending issues.
    • 4 languages + 1 pending translation
    • Published Feb 2014, author active
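
An Atbash cipher with the alphabet passed as a parameter can be sketched in a few lines of Python. The function name `atbash`, the default alphabet, and the assumption that the alphabet contains no repeated characters are all illustrative choices, not the interface of the "Atbash Cipher Helper" kata.

```python
def atbash(text: str, alphabet: str = "abcdefghijklmnopqrstuvwxyz") -> str:
    """Map each character of `alphabet` to its mirror image in that alphabet.

    Assumes `alphabet` has no repeated characters. Characters that are not
    in `alphabet` are passed through unchanged.
    """
    table = str.maketrans(alphabet, alphabet[::-1])
    return text.translate(table)


# Mirroring twice restores the input, so encoding and decoding coincide.
assert atbash("hello world") == "svool dliow"
assert atbash(atbash("hello world")) == "hello world"

# With a custom three-letter alphabet, "a" and "c" swap and "b" stays put.
assert atbash("abcab", alphabet="abc") == "cbacb"
```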

Other

  11. ROT13 variant cipher
    • 6 kyu
    • Combo of ROT13 followed by Atbash
    • Satisfaction 93% with 560 completions.
    • 1 pending issue
    • 3 languages
    • Published Mar 2016, author deleted.
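
To illustrate what "ROT13 followed by Atbash" amounts to, here is a hedged Python sketch restricted to lowercase ASCII letters. The restriction and all names are assumptions; the kata's actual rules for other characters are not described above.

```python
import string

LOWER = string.ascii_lowercase

# Translation tables for the two individual ciphers (lowercase only).
ROT13_TABLE = str.maketrans(LOWER, LOWER[13:] + LOWER[:13])
ATBASH_TABLE = str.maketrans(LOWER, LOWER[::-1])


def rot13_then_atbash(text: str) -> str:
    """Apply ROT13 first, then Atbash, to lowercase letters."""
    return text.translate(ROT13_TABLE).translate(ATBASH_TABLE)


# The composition is itself one fixed substitution: letter index x maps to
# (12 - x) mod 26, so the two steps can be collapsed into a single table.
COMBINED_TABLE = str.maketrans(LOWER, rot13_then_atbash(LOWER))


def combined(text: str) -> str:
    """Same result as rot13_then_atbash, using the precomputed table."""
    return text.translate(COMBINED_TABLE)


assert rot13_then_atbash("abmnz") == "mlazn"
assert combined("abmnz") == rot13_then_atbash("abmnz")
```

Because the composition collapses into a single substitution table, a solution to the combined task has the same shape as a solution to either cipher on its own.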

Conclusion

hobovsky commented 1 year ago

Quite a bunch of kata, maybe someone has an idea for a better organization or categorization. My pick is:

From the Caesar kata, keep 3. for a couple of reasons (oldest, just as many translations as the rest, most solutions). Get rid of the remaining Caesar kata (including "Move 10"). In their place, keep one of the ROT13 kata. I do not know which one, but 7. seems to have a better selection of languages.

From the Atbash kata, I think one should be enough; I don't think we need more. I like 10. the most due to the parametrized alphabet.

I don't like 11., but if you think that such a combined task is fine, we can keep it.

To summarize, I'd vote to keep 3., 7., 10., and maybe 11. and retire the rest.

ejini6969 commented 1 year ago

From Caesar cipher's group, I prefer to keep 2 and retire

From Rot 13's group, I prefer to keep 7 and retire

From Atbash Cipher's group, I prefer to keep 10 and retire

From Other's group, retire the kata because it is a combination of 7 and 10

EloiseRosen commented 1 year ago

"Caesar Cipher": vote to keep 2, retire the others

"ROT13": vote to keep 7, retire 6

"Atbash Cipher": vote to keep 10, retire the others

"Other": vote to retire 11

carafelix commented 1 year ago

From the Caesar Cipher group, I would keep

The rest seems logical to keep in ROT13: 7; and in Atbash Cipher: 10;

hobovsky commented 1 year ago

Most people seem to agree to keep 2. or 3., 7., and 10. No one voted for the other ones, so I think we can get rid of them.

KayleighWasTaken commented 10 months ago

Moved CoffeeScript translation from 6. to 7.

p-g1 commented 1 month ago

Hey, I'm the author of one of these kata. What's the issue with having multiple similar kata? A deduplication process that removes versions which are not identical seems off to me. It suggests that repetition is not useful when learning a skill. Why would that be the case?

Also, who chooses which to keep, and why? Move10 is mine. It has a high rate of satisfaction, 4,000+ completions, and 13 translations. It isn't the "worst" variant in the list, so why is it to be dropped?

carafelix commented 1 month ago

Also, who chooses which to keep, and why? Move10 is mine. It has a high rate of satisfaction, 4,000+ completions, and 13 translations. It isn't the "worst" variant in the list, so why is it to be dropped?

Most people seem to agree to keep 2. or 3., 7., and 10.

10 is being kept...

p-g1 commented 1 month ago

Mine is number 5... the questions stand regardless of whether mine is voted to stay or not.

The Codewars platform is built on user-generated content like mine, created to add to the platform and receive the rewards. These are approved kata that went through due process, so why should they be removed?

p-g1 commented 1 month ago

If anyone is interested, I just created a codewars community on X: https://x.com/i/communities/1845361584954089839

hobovsky commented 1 month ago

The goal of deduplication is to get rid of kata which are close to identical, or insignificantly different. Over the years, due to imperfect and undermanaged processes, the Codewars library has collected many tasks which are of questionable design or execution, or are very similar to one another, and deduplication attempts to address one of these aspects of content quality. Calling the beta process "due process", especially in the form it took a long time ago, is quite a stretch.

While repetition can be good for learning, having tasks which are identical, or close to identical, is not always the best way to provide it. Repetition can be achieved by solving one task in multiple ways, and not necessarily by solving many tasks in the same way. For example, kata 6. and 7. from the above list can be solved with identical code. Other kata can be solved by replacing a constant with a variable, or toupper with tolower, or by applying other changes which may or may not be considered significant. You are right that determining what is a duplicate, or what is significantly different, can be tricky, and this is why deduplication involves the community, where everyone is welcome to express their opinion. If you take a look at the deduplication issues which have been closed, you can see that some candidates were not considered similar enough to be removed, and they were kept.
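
As an illustration of solving one task in multiple ways, here is a Python sketch of two different approaches to ROT13, one arithmetic and one table-based. Both function names are hypothetical, and neither is presented as the expected solution to any kata in the list.

```python
import string


def rot13_arithmetic(text: str) -> str:
    """ROT13 by computing each letter's new position with modular arithmetic."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        else:
            out.append(ch)  # non-letters are left unchanged
    return "".join(out)


# Precomputed substitution table: each alphabet rotated by 13 places.
_LETTERS = string.ascii_lowercase + string.ascii_uppercase
_ROTATED = (
    string.ascii_lowercase[13:] + string.ascii_lowercase[:13]
    + string.ascii_uppercase[13:] + string.ascii_uppercase[:13]
)
_TABLE = str.maketrans(_LETTERS, _ROTATED)


def rot13_table(text: str) -> str:
    """ROT13 by looking every character up in a fixed translation table."""
    return text.translate(_TABLE)


sample = "Codewars Kata #203"
assert rot13_arithmetic(sample) == rot13_table(sample) == "Pbqrjnef Xngn #203"
```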

p-g1 commented 1 month ago

Thanks for the considered response.

I'd make the point that the entire platform is "gamified", such that many users operate to collect points, which is incentivised through the leaderboard. As a result, the platform does not incentivise learners to repeat a kata with different methods; it actively encourages moving on to the next problem.

I don't disagree that solving the same problem in different ways would be good for learning. I just think it would be an atypical journey for many, so enabling learning through repetition may serve the community best (technique reps without a loss of platform progress).

BerkBorak commented 1 month ago

I think the "gamification" is another point in favor of deduplication. Users shouldn't be encouraged to look for kata that are very close to ones they have already solved just to farm honor easily.

p-g1 commented 1 month ago

But is the point not to educate? To let people learn? The gamification is secondary, there to keep people interested. If they end up repeating exercises to gain kata, it's good for embedding the learning.