Kattis / problem-package-format

Kattis problem package format specification
https://www.kattis.com/problem-package-format
9 stars 14 forks source link

Pay More Attention To Typical Use #100

Open thorehusfeldt opened 1 year ago

thorehusfeldt commented 1 year ago

The current specification starts by esoterica about what the entry point to a prolog problem is, then explains problem metadata such as rules for source_url, licensing, and what the systems defaults for compilation_memory are. Halfway down the text(!) we finally get to problem_statement.

Apart from this (and related) mistakes for ordering a document aimed at humans, the documentation also pays undue weight to issues such as custom output validation or grading, which are used in very few problems but take up a lot of room.

Instead, there should be a way to read the documentation “breadth-first”, so that a relatively short document completely explains the relevant parts for 90% of the problems, enough for somebody who wants to make their first own problem: pass-fail, default output validation with default settings.

I’ve given this some thought, and come up with the structure laid out in https://problem-package-format-copy.readthedocs.io/en/latest/index.html.

In particular, the section Core Problem Package Format contains everything you need to make “Add two numbers”, with a running example. I think that core section is quite good by now.

How to cut the cake is somewhat arbitrary, but in my outline the core section is followed by more advanced concepts, which are

  1. Problem settings in testdata.yaml, manipulating time_limit and the default output validator.
  2. Custom output validation. This includes problems with more than one answer, interactive problems, possibly multipass problems.
  3. Scoring problems. This is mainly about testdata.yaml and grading.

There’s an attempt at adding useful examples, terminological stringency, and even a budding glossary.

The details are up in the air, but I want something like this. I need it (because I teach a course and need better documentation than what we have right now), so I’ll maintain it anyway, but for me the best outcome would be to move the documentation to such a structure that explains common things first and esoterica later.

Please discuss.

jsannemo commented 1 year ago

My thinking was to go the opposite: restructure this document into a very formal and strict specification, and move the fluffy and instructive parts into a tutorial.

Basically, a User's Guide vs an Implementer's Guide.

thorehusfeldt commented 1 year ago

My feeling is that I have now is both more strict (it actually explains what’s going on) and more accessible than the version hosted by Kattis. (It’s certainly nowhere fluffy.) Check, for instance, my version of https://problem-package-format-copy.readthedocs.io/en/latest/scoring.html#grading

jsannemo commented 1 year ago

Oh, I didn't mean to imply that your version was in any way worse than today. It is quite clearly better than the current text in both aspects. I just meant that it's not strictly necessary to have one document for both purposes, if it compromises either aspect. It's of course possible that a unified one could be good enough for both purposes (and maybe your change is, I haven't reviewed it carefully yet).

jsannemo commented 1 year ago

Some typos:

problem_statement: A directory holding the testdata that team submissions will be run on.

This is an existing issue, but I always found the formulation

Alternatively it can be a ZIP compressed archive of such a directory with identical base name and extension

hard to understand. "with identical base name" is a weird phrase.

All file names I.e., it must be of length at least 2

This change in numerus bothers me.

we use a very simple example problem that is fully specified in the directory increment

is missing a period.

If needed a hyphen

I think praxis is a comma after a dependent clause.

can be left out, the default

should be a new sentence.

tex for LaTeX files, md for Markdown, or pdf for PDF.

Why LaTeX files, but only Markdown? (especially since only a single .tex is required)

\subsection*{Input}

Typically, we typeset these with \section in statements.

working directory is /problem_statement/.

You're happy referring to that as just problem_statement/ elsewhere, why not here?

the Problem name

lowercase, probably.

If it’s not included the

Please sir, may I have another comma?

Public, i.e. non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory attachments/.

I never liked the "to be made available" formulation.

The sample data in data/sample/ and the secret data in data/secret/.

I prefer my sentences to be complete.

An test case

A test case

An test case has a unique base name such as data/secret/043-no-edges and is determined by its input file

What is determined by whats input file?

than .in may exist;

since you lead this with .in, maybe this should be a period rather than a semicolon.

Included Code

is on the wrong section level.

The files should be copied from a language directory

I think "The files are copied" is fine here.

An incomplete list possible subdirectories

An incomplete sentence here.

At least one is program in accepted is required.

At least one is is erroneous.

does not crash for any test case

Period.

Crashes for some test file

Period.

Input Validators

Input validators

Checktestdata-file

Checktestdata-files

file ending

Ending or extension, we should pick one.

increment/problem.yaml

increment/input_validators/...

An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.

This should be better specified (it's not allowed to be an executable, for example).

using the arguments specified for the test data group

Maybe cross-link to input_validator_flags here.

The input validator should be

must be

and integer representation

Is this really true? I would not expect 043 and 43 to match with the default validator!?

the time lmit

limit

(continuation in next comment so as to not lose data...)

jsannemo commented 1 year ago

the Format

Wir sprechen hier kein Deutsch!

problem_format_version

problem_format_version

https://www.kattis.com/ /problem-package-format/spec/problem_package_format/.

Broken link. Also, should be a link.

In ICPC, must be “pass-fail” In general

Missing period.

Lots of missing periods at the end of links in the license section.

attribution. See share alike; see

Why the difference?

UUID identifying the problem

Period.

scoring:

scoring:, interactive:, multi:

Missing formatting.

indicates that the submission should run multiple times with inputs generated by the validator, and

And what!? The suspense is killing me.

languages: Not ICPC

How ICPC/Not ICPC is specified seems very arbitrary now.

Contest systems

Judging systems!

relative and absolute tolerance

An example of what this means would be great for people not acquainted with these concepts.

to the Output validator specification

Bad casing.

When invoked the output

Comma

The validator should

must

feedbackdir

feedback directory or feedback_dir.

Would be nice with one of your cool inline examples for output validators.

Sensible defaults for scoring problems.

Indeed!

jsannemo commented 1 year ago

@thorehusfeldt I like it a lot! It satisfies both my need for explanations and specification-level exactness. :shipit:

thorehusfeldt commented 1 year ago

Thank you. We should focus on core.rst and format.rst (and in particular move even more stuff out of it, such as non-ICPC esoterica like attachments.) I’ve only now started on settings.rst. The rest is very fragile and doesn’t deserve your attention.

jsannemo commented 1 year ago

On that note, it's a bit weird that attachments are non-ICPC. Aren't they generally useful, and have indeed been used with interactive problems providing a testing tool? I guess they were also pushed out to people's desktops, but it feels very useful to be able to distribute them via the judging interface. @niemela @eldering

nickygerritsen commented 1 year ago

DOMjudge even distributes them already in the UI

niemela commented 1 year ago

After a first very quick read through I think this is much better. (Mostly because it's trying to be a helpful guide, the old text was only trying to be a specification). I have not verified that the rewrite didn't change the specification in some subtle way.

I agree with @jsannemo that it's conceivable that by having a formal specification and a separate tutorial you could get something better at both. But I also agree this text actually seems to do both really well, so at this point I don't see any need of another text.

Obviously (?) rewriting to this style of text and moving to readthedocs are orthogonal questions. (Rigth @thorehusfeldt? Or is there something I'm missing?)

I don't like how this text is making the "ICPC specification" seem like the normal. (e.g. "Nothing of this is part of the ICPC specification" in "Scoring Problems"). I don't know how to fix this (or, t.b.h. exactly what I mean by that).

niemela commented 1 year ago

On that note, it's a bit weird that attachments are non-ICPC.

Yes, it is.

I'm leaning more and more towards removing all mentions of the non-ICPC set from the spec. This would instead move this to the CCS specification document. (This is how we originally did it.)

thorehusfeldt commented 1 year ago

Re: ICPC, we just need a way to express this, such as saying “an ICPC-type problem.”

There a many ways of doing this, any of them would be fine.

Two ideas are to allow type: "icpc" or add a flag icpc: bool to problem.yaml. This would make many things much clearer. (And I don't understand the objections to it.)

thorehusfeldt commented 1 year ago

Re: read the docs.

There are three components:

  1. Change the structure
  2. Use RST, a more powerful markup than Markdown, for instance allowing the source code inclusion
  3. Host the result on makethedocs (rather than GitHub Pages).

They are orthogonal and here ordered by importance to me. The present issue is about (1).

jsannemo commented 1 year ago

Re: ICPC, we just need a way to express this, such as saying “an ICPC-type problem.”

There a many ways of doing this, any of them would be fine.

Two ideas are to allow type: "icpc" or add a flag icpc: bool to problem.yaml. This would make many things much clearer. (And I don't understand the objections to it.)

My objection is that I don't think this is what you actually want.

From an ICPC standpoint, they want to say: we use format X for making problems, and all judging systems used at the world finals have to support everything in it. First off, that's a bad reason to give importance to in a specification.

Also, the ICPC subset loses some of its value retroactively: once a new version is used at WF, what do the ICPC versions of the old specs really say? Furthermore, at WF, if judges would want to use some functionality that is present in the systems used at that specific WF, would they be restricted from it just because of the flag (if they know that e.g. Kattis and DOMJudge are used a certain year)?

Related to that, is the fact that DOMJudge now uses this format as its primary one. They primarily support this ICPC subset. But they don't promise to only support that, and having the ICPC subset be defined as "whatever the intersection of Kattis & DOMJudge functionality is at the time of publication" is weird. What is DOMJudge supposed to do if it receives a package that claims to "be ICPC" but isn't? Error out? Warn? If systems can determine what they support, clearly they should just do that.

So what does a flag in the metadata give here, exactly? It doesn't provide much value to me as a problem package re-user: I won't apriori know if a judge can consume my package anyway (an arbitrary system isn't required to support the ICPC subset anyway). As a judging system, I would want to detect if functionality from a version I don't support is used no matter what the ICPC flag is, and I certainly won't error out if you use too much.

What is left is for a ICPC WF judge to, when writing a problem, determine whether their problem is guaranteed to be OK at WF -- i.e. that their problem uses only that subset. I think they are competent enough to use a --icpc flag to their problem verifier to check this, rather than forcing an organization specific implementation detail into the spec.

niemela commented 1 year ago

Two ideas are to allow type: "icpc" or add a flag icpc: bool to problem.yaml. This would make many things much clearer. (And I don't understand the objections to it.)

Two main objections (which maybe are the same?):

  1. Being type: icpc is not at all like being type: scoring (or type: pass-fail). It's not a well defined idea property with at least a somewhat clear definition. It's actually just "what the ICPC WF specification has decided is required from a system that is to be used at WF" and thus (maybe) a "suitable subset to consieder for ICPC contests. There are ICPC contests that use parts of the specification outside of the ICPC subset. E.g. interactive problems where used at many contests before it was added to the ICPC subset (both as a general concept, and the specific version of that idea that's defined in the specification). It was only when the WF judges decided that they wanted to be able to have interactive problems that it was moved to the ICPC subset.

  2. It can be deduced (fairly easily) from the rest of the package whether the problem is of "type: icpc". This is just duplicating information. I think it's a good idea for tooling to have a way to say "please (strictly) verify that this problem package only uses the ICPC subset", I don't think it's a good idea for the package to duplicate this information in itself.

Also, the fact that you think about what the ICPC subset is like this, is kinda the reason why I would like to maybe remove mention of it completely from the specification. I still think it is a good idea for tooling to support the concept of "ICPC subset" though...

niemela commented 1 year ago

and having the ICPC subset be defined as "whatever the intersection of Kattis & DOMJudge functionality is at the time of publication" is weird. What is DOMJudge supposed to do if it receives a package that claims to "be ICPC" but isn't?

Just to be clear (and I know you are not actually saying this), this is not and has never been the definition of the ICPC subset. The definition of the ICPC subset is and has always been "what the ICPC standardization group has agreed to require from contest systems that are to be used at WF". This of course, at least in the past several years, has a strong correlation with "whatever the intersection of Kattis & DOMJudge functionality is at the time". (Maybe a better name would be the "ICPC WF subset"? I don't think it's better, but it is more accurate).

jsannemo commented 1 year ago

I think your point 1. is also a good indication of

I'm leaning more and more towards removing all mentions of the non-ICPC set from the spec. This would instead move this to the CCS specification document. (This is how we originally did it.)

being the way to go.

For example, assume for the sake of fantasy that version 2023-07 is so good that only editorial changes are made for 3 years, but some ICPC judge wants to introduce a nextpass problem! Is that now a new version of the spec, such that we should make 2026-07 just because that subset changed? Clearly not, that would be absurd: the ICPC subset must be able to live separately of editions of this problem package format.

niemela commented 1 year ago

Re: read the docs.

There are three components:

1. Change the structure

I think the structure changes are an improvement. (1a)

I'm not (at all) sure that the splitting into many very small pages is better. I can see that it's less scary, but having the right things first already improves a lot with respect to that. With the small pages you can't search in the document, because there are many. And our entire document is still pretty short anyway. Feel free to convince me. (1b)

2. Use RST, a more powerful markup than Markdown, 

Where is RST format documented?

for instance allowing the source code inclusion

That's nice, but we don't really use it much anyway.

3. Host the result on makethedocs (rather than GitHub Pages).

Why is that better?

They are orthogonal and here ordered by importance to me. The present issue is about (1).

Sounds good. We'll get to 2 and 3 in time.

So far it sounds like everyone likes the idea of having a structure more aimed at humans is great (1a above).

What about 1b?

jsannemo commented 1 year ago

https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html

jsannemo commented 1 year ago

I'm fine with rST.

I kind of like the separate page thing, but I'm not sure why?

thorehusfeldt commented 1 year ago

After some soul-searching, I changed https://problem-package-format-copy.readthedocs.io/en/latest/ to a single-page structure, and agree with @niemela that it is slightly better.

jsannemo commented 1 year ago

Ah, how sad. I actually stared liking the multi-page structure, for mainly two reasons:

Both of these points are much better over at https://www.kattis.com/problem-package-format/spec/2023-07-draft.html though, so maybe that's why I preferred the multi-page version on your copy.

thorehusfeldt commented 1 year ago

Changed to sphinx_rtd_theme in https://problem-package-format-copy.readthedocs.io/en/latest/, which has a fixed navigation structure. This is my best (and presumably final) attempt to reconcile the conflicting constraints from @niemela and @jsannemo about presentational issues. I will be focussing on content now.

jsannemo commented 1 year ago

Ah, much better (and an actually decent mobile styling)!

thorehusfeldt commented 1 year ago

Note also

  1. the improved markup for keys in Problem Settings (which include type),
  2. the inclusion of CUE schemas
  3. various warning, note, and hint stylings
  4. the systematic inclusion of a running example in the Basic Concepts section.

There are some recent changes to author and testdata.yaml missing, which I’ll incorporate tonight after getting test.sh working again.

I find this much superior by now. I’ll send a PR to this repo soon, even though it basically just creates a completely new repo without much useful history. If there are other suggestion for how to migrate, I’m all ears.