Support for problem "variants"

niemela commented 1 year ago

There are some cases where there could be very similar "variants" of a problem. It would be nice to allow keeping them together in a problem package since they share most of the validation (and files). We already support one kind of what I'm referring to, in that translations are, in some sense, a "variant" of a problem. It's the same problem, but described in a different natural language.

These are almost by definition mostly useful for archive/online judges and similar. But OTOH, they could be defined in a way that would cause very little extra work for systems that don't want to actually use them.

These are the kinds of variants I think could be useful:

1) Problem statement

Sometimes you would want to have a variant problem statement. This would be a problem statement describing the exact same actual problem, but with a completely (?) different story. This is useful for teaching and practice, and is typically done by just making a new problem package and changing the statement.

This could be supported by the format by extending the problem.<language>.<filetype> into problem.<variant>.<language>.<filetype>. Just like for <language>, <variant> is optional and would default to "the default" variant. I imagine that a system that supports variants would compile the variant statements as well (when validating statements) and allow users to choose which variant to install (e.g. with a command line option). A system that does not want to support variants could simply ignore them.

One open question is how to make sure that a variant can never be mistaken for a language or vice versa, since they are both optional. One solution would be to require variants to consist of at least 4 characters from [A-Za-z0-9]. Another solution would be to use <variant>.problem.<language>.<filetype>. I think the former is better.
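A minimal sketch of how the first option could disambiguate the two optional parts. It assumes language codes are exactly two lowercase letters (the real rule for language codes may be broader), and the function name parse_statement_name is only illustrative:

```python
import re

# Assumption: variants are 4+ characters from [A-Za-z0-9], as proposed above.
VARIANT_RE = re.compile(r"[A-Za-z0-9]{4,}")
# Assumption: language codes are two lowercase letters (e.g. "en", "sv").
LANGUAGE_RE = re.compile(r"[a-z]{2}")


def parse_statement_name(filename):
    """Split problem[.<variant>][.<language>].<filetype> into its parts.

    Returns (variant, language, filetype); a missing optional part is None.
    Raises ValueError if the name does not fit the scheme.
    """
    parts = filename.split(".")
    if not 2 <= len(parts) <= 4 or parts[0] != "problem":
        raise ValueError(f"not a statement file: {filename}")
    filetype = parts[-1]
    middle = parts[1:-1]
    variant = language = None
    if len(middle) == 2:
        # Both present: the order is fixed, variant first.
        variant, language = middle
    elif len(middle) == 1:
        # Only one present: the length rule decides which one it is.
        if LANGUAGE_RE.fullmatch(middle[0]):
            language = middle[0]
        else:
            variant = middle[0]
    if variant is not None and not VARIANT_RE.fullmatch(variant):
        raise ValueError(f"bad variant name in {filename}")
    if language is not None and not LANGUAGE_RE.fullmatch(language):
        raise ValueError(f"bad language code in {filename}")
    return variant, language, filetype
```

Under these assumptions a two-letter part can never be a variant and a part of four or more characters can never be a language, so problem.easy.tex and problem.sv.tex are unambiguous.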

We also need to allow for the variant name in problem.yaml.

2) Problem type

Often a problem could be used as either a pass-fail or a scoring problem. This can trivially be handled by a judging system by adding a score. I.e. when using a pass-fail problem as a scoring problem, the score is the score you get if the problem passes, and when using a scoring problem as a pass-fail problem, the score is the score you need to pass. In practice this score is most often 100 (or the max score, which is most often 100).
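The two conversions could look roughly like this. The function names and the default of 100 are only illustrative, not part of any existing judging system:

```python
def as_scoring(accepted, score_if_passed=100.0):
    """Use a pass-fail verdict as a score: full score on accept, otherwise 0."""
    return score_if_passed if accepted else 0.0


def as_pass_fail(score, score_to_pass=100.0):
    """Use a score as a pass-fail verdict: pass once the required score is reached."""
    return score >= score_to_pass
```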

That said, when this is done the "Scoring" section (or lack thereof) gets a bit awkward. Automatically adding a scoring section would typically be trivial. Just add a scoring section that says "you get 100 points if you solve all the test cases".

Going in the other direction automatically is a little bit less trivial. You can't just remove the scoring information: even though the scores are now irrelevant, some of the levels that exist can often/sometimes be a hint for how to solve the problem, and you may or may not want to keep them. You can of course add a text saying "you solve the problem if you get 100 points according to the above".

If we want to do something better than what's suggested above there are 2 similar solutions:

we could allow sections in the problem statement that are used when using a pass-fail as scoring or vice versa
we could allow a special variant-name to be used for this case. I.e. add problem.scoring.tex to a pass-fail problem, and that statement will be used when using it as a scoring problem.

I think the second solution is better. It also means that systems that don't want to support it can ignore it similar to above.
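A sketch of how a system might pick the statement under the second solution. The variant name "scoring" comes from the example above; "passfail" for the opposite direction is a hypothetical name, and language codes are left out to keep it short:

```python
# Reserved variant name per usage type. "passfail" is a hypothetical choice.
SPECIAL_VARIANT = {"scoring": "scoring", "pass-fail": "passfail"}


def pick_statement(available, native_type, used_as):
    """Choose the statement file for a problem used as `used_as`.

    `available` is the set of statement file names in the package.
    Falls back to problem.tex, which is also what a system that
    ignores variants would always use.
    """
    if used_as != native_type:
        candidate = f"problem.{SPECIAL_VARIANT[used_as]}.tex"
        if candidate in available:
            return candidate
    return "problem.tex"
```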

3) Difficulty

Sometimes we have problems that are (mostly) exactly the same, except they have different levels of test data. This is very common in teaching (because many problems from contests are too hard for many teaching use cases).

A few examples of this on Kattis:

Problems that fit in this category (IMO) will differ on:

problem statements (but typically very little)
parameters to input validators
test data

But everything else should be shared.

The problem statement changes could be handled the same as above (with the same benefits as above). That said, the differences tend to be very small, so it's a bit more annoying.

The other differences could be handled similarly but with variant testdata.yaml files. This would require the code for all validators to be the same, but I think that's a good limitation.

We would probably need a way to say "actually skip this group" in testdata.yaml.

Systems that don't want to support this can ignore as above, except that they would have to understand the "actually skip this group".
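To make this concrete, here is a rough sketch of per-variant group settings. It assumes each group's testdata.yaml has already been loaded into a dict, that variant files override the base file key by key, and that skipping is expressed with a key called skip. That key name is made up here, and the input_validator_flags values are only illustrative:

```python
def effective_settings(base, variant_overrides, variant=None):
    """Merge a group's base settings with the overrides for `variant`, if any."""
    settings = dict(base)
    if variant is not None:
        settings.update(variant_overrides.get(variant, {}))
    return settings


def groups_to_run(groups, variant=None):
    """Return the names of the groups that are not skipped for `variant`.

    `groups` maps a group name to (base settings, {variant: overrides}).
    """
    return [
        name
        for name, (base, overrides) in groups.items()
        # "skip" is a hypothetical key meaning "actually skip this group".
        if not effective_settings(base, overrides, variant).get("skip", False)
    ]


groups = {
    "secret/small": ({"input_validator_flags": "--max_n 100"}, {}),
    "secret/large": (
        {"input_validator_flags": "--max_n 100000"},
        {"easy": {"skip": True}},
    ),
}
```

With this data, groups_to_run(groups) keeps both groups, while groups_to_run(groups, "easy") keeps only secret/small.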

Thoughts?

RagnarGrootKoerkamp commented 1 year ago

This sounds like it would introduce quite a bit of complexity to support properly, especially documenting which testcases belong to which variant.

Probably we could do something generic like any file <name>.<ext> can now also be present as <name>.<variant>.<ext>, and will override <name>.<ext>, but I'm not sure how much work and how ugly it would turn out.
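Roughly, the lookup could be something like the following sketch. It assumes file names with a single extension, and the function name resolve is only illustrative:

```python
from pathlib import PurePosixPath


def resolve(path, available, variant=None):
    """Return <name>.<variant>.<ext> if the package has it, else <name>.<ext>.

    `available` is the set of file paths present in the package.
    """
    if variant is not None:
        p = PurePosixPath(path)
        candidate = str(p.with_name(f"{p.stem}.{variant}{p.suffix}"))
        if candidate in available:
            return candidate
    return path
```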

niemela commented 1 year ago

> This sounds like it would introduce quite a bit of complexity to support properly, especially documenting which testcases belong to which variant.

I assume you are referring to case 3? Yes, I agree, that is the worst of the three.

> Probably we could do something generic like any file <name>.<ext> can now also be present as <name>.<variant>.<ext>, and will override <name>.<ext>, but I'm not sure how much work and how ugly it would turn out.

I don't think you would typically have alternate versions of specific files; rather, every version would exclude different sets of files. If you only look at the contents of the files, then this is of course the same, but when you also look at the <name> then it makes a difference.

I was thinking of adding something to testdata.yaml to ignore some test data for some variants, and maybe limiting it to ignoring whole groups only (simply because testdata.yaml is per group, not per test case).

niemela commented 1 year ago

A much simpler solution to this (suggested by @pehrsoderman) would be to simply have a variantof: <uuid> key in problem.yaml that points to some other problem package. A system that installs both could do nothing, simply link the problems (but duplicate all data), or do something smarter.
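The "simply link the problems" option could be as small as this sketch. It assumes each package's problem.yaml has already been loaded into a dict with a uuid key, and the function name is only illustrative:

```python
from collections import defaultdict


def group_variants(packages):
    """Map each base problem's uuid to the uuids of packages that name it in `variantof`."""
    variants = defaultdict(list)
    for meta in packages:
        base = meta.get("variantof")  # proposed key; absent for ordinary problems
        if base is not None:
            variants[base].append(meta["uuid"])
    return dict(variants)
```

A system that does not care about variants never reads the key and just installs each package on its own.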

niemela commented 6 months ago

I feel the consensus on this is "close as wontfix" for the main proposal?

What about the much smaller follow-up of adding a variantof key in problem.yaml?

@Tagl @RagnarGrootKoerkamp @mzuenni @evouga ?

RagnarGrootKoerkamp commented 6 months ago

I don't have much use for this but variantof sounds trivial to add to the spec so sure.

niemela commented 6 months ago

Closing as wontfix.

If someone feels a bit more strongly about adding variantof, create an issue for that (or even better, just write the PR 😄).

Kattis / problem-package-format

Support for problem "variants" #126