codewars / codewars.com

Issue tracker for Codewars
https://www.codewars.com
BSD 2-Clause "Simplified" License

Automatically generated kata translations #2323

Closed namedots closed 3 years ago

namedots commented 3 years ago

Big brain idea here. You're guaranteed to have considered it, nothing new, but I wish to talk about it.

There's this list. https://github.com/codewars/content-issues/wiki/List-of-Python-Kata-to-Update It's pretty long. The task to carry out is mechanical unless one makes additional improvements.

Most of these kata have json-like input and output. This makes them pretty much all the same. No human should be touching them. They can be automatically generated with the help of information about kata.

If something changes about a language, change the code generation, run it again. No more lists of 3800 kata to update.

The tests can be consistent across languages. Bugs get fixed in one place, not five or twenty. For performance requirements, the program that generates tests and checks answers could receive parameters to tailor the size of the tests.

MANY kata would become available in MOST languages. Or rather, after having been translated to the one language to rule them all.

So when one authors a kata, one would write this test generator, plus some metadata, some sort of schema. That would then be used to generate initial code and tests at the press of a button.
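A minimal sketch of what such a test generator might look like, assuming a made-up kata ("sum the even numbers") and a made-up case format with `input` and `expected` keys. The names `reference_solution`, `generate_cases`, and `max_len` are illustrative only and not part of any Codewars API; `max_len` stands in for the size parameter mentioned above.

```python
import json
import random


def reference_solution(numbers):
    """Hypothetical kata: return the sum of the even numbers."""
    return sum(n for n in numbers if n % 2 == 0)


def generate_cases(count, max_len, seed=0):
    """Produce `count` random input/output pairs as JSON-serializable dicts.

    `max_len` is the knob that would let slower languages ask for smaller inputs.
    """
    rng = random.Random(seed)  # seeded so every translation sees the same cases
    cases = []
    for _ in range(count):
        numbers = [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]
        cases.append({"input": [numbers], "expected": reference_solution(numbers)})
    return cases


if __name__ == "__main__":
    print(json.dumps(generate_cases(count=3, max_len=5), indent=2))
```

Each translation would then only need to turn these pairs into assertions in its own test framework.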

Obviously, this would be non-trivial. It would change the kata authoring editor UI, and the backend would have to support running two languages. Static compilation might help, at the cost of limiting which languages can serve as testers; sending the data over the network could be an option too, since it is typically small. Writing translation generators is a project all on its own. Figuring out the metadata format is somewhat of a challenge.

No clue whether you do translations over at Qualified. I suppose that would be the real selling point.

kazk commented 3 years ago

I'd like to eventually automate some mechanical fixes/changes. We have tests, so in many cases we can apply the change and make sure the kata is still valid. There are many existing tools, and we can maintain some of our own for each language. For example, there are many tools for JavaScript that transform ASTs (transpiling is common). I made a PoC a few years ago that can translate tests to Mocha, but I postponed the idea after seeing many kata using the test framework in very unexpected ways, and it was non-trivial to convert them.

But I'm strongly against automating translations. Qualified does support generating tests from data for many languages and test frameworks already, but it's not a great experience for either side (authors and candidates). If interested, there's a blog from 4 years ago. It sounds great in theory, but it's not practical and leads to bad UX because languages are different. You'll get many languages supported, but none of them will be great (e.g., awkward types, arbitrary requirements that make no sense, poor test feedback, etc.). You can only test serializable inputs and outputs too.

If something changes about a language, change the code generation, run it again. No more lists of 3800 kata to update.

What about 3800 reference solutions? If we're only generating tests, then we rarely need to update anyway.

MANY kata would become available in MOST languages. Or rather, after having been translated to the one language to rule them all.

If we valued this, we'd be doing challenges based on stdin/stdout like many others. Codewars was born because Jake didn't like that. I'd rather have one idiomatic translation than 50 half-assed generated translations. Translation is great for learning too (we do need a better review system, and to let more experienced users give feedback (mentoring)).

namedots commented 3 years ago

Codewars was born because Jake didn't like that.

Interesting! And yes, it can get soulless.

awkward types

But I'm strongly against automating translations.

A whole bunch are truly json in, json out though. Those aren't carefully crafted language-adapted translations. They are poorly maintained, owned by nobody, out of sync with their siblings. I should ask g964 (not because of maintenance quality, but because of the many json-shaped kata each with many translations) what they think of editing their kata; I suspect I would get a rather depressing answer.

It does not fit everything. I wouldn't want it to be the single way to define kata, but a way to factor out the common part that all translations use. For many kata that is the whole thing. I do believe it is desirable to keep fixed tests in sync, and the random tests should also look the same!

Translation is great for learning too

It is! I'm very new to translating and maintaining and I've had a lot of fun rewriting some tests. But when I browse the aforementioned list of python kata to update, I find myself opening half a dozen links, looking at their tests, finding nothing different, closing each in turn and opening another handful of tabs with new links, hoping to find something novel to play with. I want to translate the fun stuff. Not .. those

blog

that blog is about how great it is, not about how bad it is :(


Would it make sense to have shared definitions that translations can draw on? Fixed tests. Random input/output pair generation.

kazk commented 3 years ago

A whole bunch are truly json in, json out though. Those aren't carefully crafted language-adapted translations.

I agree there are kata like that, and I know many of g964's are. But that doesn't mean we should add more. Their style is optimized for the ease of translation, and I've always been against it. Maybe they think it's easier to maintain, but it's often unidiomatic and annoying. For example, there are so many of them returning strings when there are much better ways to describe the result.

I won't make it easier to produce translations like that.

They are poorly maintained, owned by nobody, out of sync with their siblings.

There are many factors involved.

Generating can fix the "out of sync" problem, but it'll create more problems in my opinion.

I should ask g964 (not because of maintenance quality, but because of the many json-shaped kata each with many translations) what they think of editing their kata; I suspect I would get a rather depressing answer.

Why? They chose to author. Others can help (and there are plans to make that easier). Writing kata and translations is not for yourself (your benefits are side effects). If you don't enjoy learning new things, working with and helping other users, then you should probably stop authoring.

I know he's annoyed by updates sometimes, but he doesn't have to do it all by himself. If he doesn't want to maintain anymore, he should say so, and it's totally fine.

But when I browse the aforementioned list of python kata to update, I find myself opening half a dozen links, looking at their tests, finding nothing different, closing each in turn and opening another handful of tabs with new links, hoping to find something novel to play with. I want to translate the fun stuff. Not .. those

You're looking in the wrong place. Forget about the list and go translate whatever you enjoy instead. If you're not interested in helping with content quality, then you shouldn't be there.

that blog is about how great it is, not about how bad it is :(

Obviously. It's an announcement soon after the launch. We thought it'd be great...

Would it make sense to have shared definitions that translations can draw on? Fixed tests. Random input/output pair generation.

If you'd like to generate some code, you can already do that on Codewars. It's much more flexible than creating a general generator. Just write some code that outputs code in another language. Copy the code from the output. You'll obviously need to tweak for each kata, but you should anyway.
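As a rough sketch of "code that outputs code in another language", the following Python script renders a couple of fixed tests as a Mocha/Chai test file. The kata function name `sumEvens`, the `CASES` data, and the helper `to_mocha` are all made up for illustration. It relies on the fact that JSON literals are also valid JavaScript literals, so other target languages would need their own rendering of values.

```python
import json

# Hypothetical fixed tests for a hypothetical kata function `sumEvens`.
CASES = [
    {"input": [[1, 2, 3, 4]], "expected": 6},
    {"input": [[]], "expected": 0},
]


def to_mocha(function_name, cases):
    """Render the cases as JavaScript source text for a Mocha/Chai test file."""
    lines = [
        'const { assert } = require("chai");',
        "",
        f'describe("{function_name}", () => {{',
        '  it("fixed tests", () => {',
    ]
    for case in cases:
        # JSON literals double as JavaScript literals, so json.dumps is enough here.
        args = ", ".join(json.dumps(arg) for arg in case["input"])
        expected = json.dumps(case["expected"])
        lines.append(f"    assert.deepEqual({function_name}({args}), {expected});")
    lines += ["  });", "});"]
    return "\n".join(lines)


print(to_mocha("sumEvens", CASES))
```

The printed output is meant to be copied into the target translation and then adjusted by hand for that kata.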

kazk commented 3 years ago

Closing because this won't happen.

kazk commented 3 years ago

From Codewars:

The minimal implementation would be to add another input box in the kata editor where a test generator program can be defined, to be called on by all language translations so that they can share the business logic of the kata.

@namedots Can you elaborate? Maybe some examples will help. Can you show me what a "test generator program" looks like? How can it be called by or call all language translations?

Are you thinking of something like the following?

  1. Define serializable test cases (input and output)
  2. Write a test program that can test any language based on this.
  3. Profit!

A test program that can test any language is definitely not easy.

If we decide to do this, we'll most likely use stdin/stdout to communicate with the solution. Each language requires an adapter main that takes serialized input from stdin, calls the solution with the deserialized input, and serializes the output to stdout. The test program can then compare it against the serialized expected output. This is similar to what HackerRank and LeetCode are doing, as far as I know.
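A minimal sketch of the adapter idea, assuming JSON as the serialization format and one test case per line (both are assumptions, not a Codewars design). `solution`, `adapter_main`, and `check` are hypothetical names. To keep the example self-contained, the "test program" calls the adapter in-process with in-memory streams, where a real setup would run it as a separate executable.

```python
import io
import json
import sys


def solution(numbers):
    """Stand-in for the user's solution to a hypothetical kata."""
    return sum(n for n in numbers if n % 2 == 0)


def adapter_main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON array of arguments per line, write one JSON result per line."""
    for line in stdin:
        if not line.strip():
            continue  # ignore blank lines
        args = json.loads(line)
        stdout.write(json.dumps(solution(*args)) + "\n")


def check(cases):
    """Test-program side: feed serialized inputs, compare against expected outputs."""
    stdin = io.StringIO("".join(json.dumps(c["input"]) + "\n" for c in cases))
    stdout = io.StringIO()
    adapter_main(stdin, stdout)  # a real harness would spawn a separate process here
    actual = stdout.getvalue().splitlines()
    return [json.loads(a) == c["expected"] for a, c in zip(actual, cases)]


if __name__ == "__main__":
    print(check([{"input": [[1, 2, 3, 4]], "expected": 6}]))
```

Note that `check` only ever sees the serialized values, so it knows nothing about the types the solution actually returned.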

Some problems:

  1. The de/serialization part is tricky to make idiomatic for all languages for every kata. This depends on the problem and the language.
  2. Only serializable input and output can be tested.
  3. Test failure messages will be cryptic for non-scalar outputs. The test only knows about the serialized version.
  4. Logs from solutions are gone because the adapter communicates with the test through stdout.
  5. What language should the test program be written in? Each language environment is separate (e.g., the container image for JavaScript doesn't have Rust installed).
  6. Testing takes much longer because of de/serialization and having multiple executables (assuming the rest of the tests are unit tests, which doesn't make sense to me, by the way).
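Problem 4 in the list above can be illustrated with a small sketch, again assuming a JSON-over-stdout protocol. `chatty_solution` and `run_adapter` are hypothetical names, and `contextlib.redirect_stdout` is used only to capture what the adapter process would have written to stdout.

```python
import contextlib
import io
import json


def chatty_solution(numbers):
    print("debug: got", numbers)  # a log line the user expects to see
    return sum(numbers)


def run_adapter(solution, args):
    """Capture everything the adapter process would write to stdout."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        result = solution(*args)
        print(json.dumps(result))  # the protocol line shares stdout with the logs
    return out.getvalue()


raw = run_adapter(chatty_solution, [[1, 2, 3]])
try:
    json.loads(raw)
except json.JSONDecodeError:
    # The log line and the result are mixed together, so the test program
    # must either discard the logs or fail to parse the output.
    print("cannot parse adapter output:", repr(raw))
```

An adapter could work around this by treating only the last line as the result, or by moving logs to another stream, but either way the user's logs no longer reach the test output unchanged.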

I think THAT ^ part is easy. I think there's very little excuse to duplicate it or to have differences in it.

It's not excuses, but trade-offs.

Generating scaffolding based on schema is the difficult and problematic part, but it's a separate problem. This is where the language differences come in, aside from adjusting test size for performance reasons.

Not a separate problem at all.

If this is not universally useful or if the authors don't like this, it'll just create another inconsistency within Codewars. I'm also against introducing yet another feature that's not well-supported. For example, I wish Codewars didn't have avatars or clans until they could be properly implemented. I can imagine this feature causing more complaints and I don't think it's worth maintaining. As I wrote before, you can already generate repetitive setup.


I feel like we just disagree fundamentally. I'd encourage differences between languages as long as the tests are covering the same domain.

A language that doesn't affect the way you think about programming is not worth knowing. -- Alan Perlis

I'm not as extreme, but I want Codewars to be a platform where you can easily play with different languages and learn different ways to think about the same problem.