thorehusfeldt opened 1 year ago
My thinking was to go the opposite: restructure this document into a very formal and strict specification, and move the fluffy and instructive parts into a tutorial.
Basically, a User's Guide vs an Implementer's Guide.
My feeling is that what I have now is both more strict (it actually explains what’s going on) and more accessible than the version hosted by Kattis. (It’s certainly nowhere near fluffy.) Check, for instance, my version of https://problem-package-format-copy.readthedocs.io/en/latest/scoring.html#grading
Oh, I didn't mean to imply that your version was in any way worse than today. It is quite clearly better than the current text in both aspects. I just meant that it's not strictly necessary to have one document for both purposes, if it compromises either aspect. It's of course possible that a unified one could be good enough for both purposes (and maybe your change is, I haven't reviewed it carefully yet).
Some typos:
problem_statement: A directory holding the testdata that team submissions will be run on.
This is an existing issue, but I always found the formulation
Alternatively it can be a ZIP compressed archive of such a directory with identical base name and extension
hard to understand. "with identical base name" is a weird phrase.
All file names I.e., it must be of length at least 2
This change in grammatical number bothers me.
we use a very simple example problem that is fully specified in the directory increment
is missing a period.
If needed a hyphen
I think standard practice is a comma after a dependent clause.
can be left out, the default
should be a new sentence.
tex for LaTeX files, md for Markdown, or pdf for PDF.
Why LaTeX files, but only Markdown? (especially since only a single .tex is required)
\subsection*{Input}
Typically, we typeset these with \section
in statements.
working directory is
/problem_statement/.
You're happy referring to that as just problem_statement/
elsewhere, why not here?
the Problem name
lowercase, probably.
If it’s not included the
Please sir, may I have another comma?
Public, i.e. non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory attachments/.
I never liked the "to be made available" formulation.
The sample data in data/sample/ and the secret data in data/secret/.
I prefer my sentences to be complete.
An test case
A test case
An test case has a unique base name such as data/secret/043-no-edges and is determined by its input file
What is determined by what's input file?
than .in may exist;
since you lead this with .in, maybe this should be a period rather than a semicolon.
Included Code
is on the wrong section level.
The files should be copied from a language directory
I think "The files are copied" is fine here.
An incomplete list possible subdirectories
An incomplete sentence here.
At least one is program in accepted is required.
At least one is is erroneous.
does not crash for any test case
Period.
Crashes for some test file
Period.
Input Validators
Input validators
Checktestdata-file
Checktestdata-files
file ending
Ending or extension, we should pick one.
increment/problem.yaml
increment/input_validators/...
An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.
This should be better specified (it's not allowed to be an executable, for example).
using the arguments specified for the test data group
Maybe cross-link to input_validator_flags here.
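Since the thread touches on how an input validator is invoked, here is a minimal sketch of what one could look like. The problem ("one line with a single integer 1 ≤ n ≤ 10⁶"), its limits, and the strict format are all invented for illustration; the 42/43 exit codes follow the convention that, as I understand it, the format uses (42 = valid, anything else = invalid):

```python
import re

# A minimal input validator sketch. The problem, its limits, and the
# strict format below are invented for illustration.
def validate(data: str) -> int:
    """Return 42 if `data` is a valid input file, 43 otherwise."""
    # Be strict: exactly "<digits>\n" -- no leading zeros, no extra whitespace.
    if not re.fullmatch(r"(0|[1-9][0-9]*)\n", data):
        return 43
    if not (1 <= int(data) <= 10**6):
        return 43
    return 42

# A real validator would read stdin and exit with the verdict:
#   import sys; sys.exit(validate(sys.stdin.read()))
```

Note how strictness pays off here: `043\n` and `5 \n` are both rejected, even though a lenient parser would happily read them.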
The input validator should be
must be
and integer representation
Is this really true? I would not expect 043
and 43
to match with the default validator!?
the time lmit
limit
(continuation in next comment so as to not lose data...)
the Format
We don't speak German here!
problem_format_version
problem_format_version
https://www.kattis.com/ /problem-package-format/spec/problem_package_format/.
Broken link. Also, should be a link.
In ICPC, must be “pass-fail” In general
Missing period.
Lots of missing periods at the end of links in the license section.
attribution. See share alike; see
Why the difference?
UUID identifying the problem
Period.
scoring:
scoring:, interactive:, multi:
Missing formatting.
indicates that the submission should run multiple times with inputs generated by the validator, and
And what!? The suspense is killing me.
languages: Not ICPC
How ICPC/Not ICPC is specified seems very arbitrary now.
Contest systems
Judging systems!
relative and absolute tolerance
An example of what this means would be great for people not acquainted with these concepts.
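For readers not acquainted with these concepts, a sketch of how such a check might work. The function below is illustrative, not the spec's wording: my understanding is that a floating-point token is accepted if the error is small in either the absolute or the relative sense:

```python
# Illustrative only: how a default output validator might compare a
# floating-point token against the judge answer when both a relative
# and an absolute tolerance are configured.
def within_tolerance(judge: float, team: float,
                     abs_tol: float = 1e-6, rel_tol: float = 1e-6) -> bool:
    """Accept if the error is small in EITHER the absolute or relative sense."""
    diff = abs(team - judge)
    return diff <= abs_tol or diff <= rel_tol * abs(judge)
```

Absolute tolerance matters near zero; relative tolerance matters for large magnitudes. For example, 1000000.5 would be accepted against 1000000.0, because 0.5 ≤ 1e-6 · 1000000, even though the absolute error is huge compared to abs_tol.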
to the Output validator specification
Bad casing.
When invoked the output
Comma
The validator should
must
feedbackdir
feedback directory or feedback_dir
.
Would be nice with one of your cool inline examples for output validators.
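For what it's worth, a rough sketch of a custom output validator, under the assumptions that it is invoked as `validator input judge_answer feedbackdir < team_output`, signals its verdict via exit codes 42 (accepted) and 43 (wrong answer), and may leave feedback such as judgemessage.txt in the feedback directory; the token comparison itself is just the simplest illustrative choice:

```python
import sys
from pathlib import Path

def tokens_match(judge_answer: str, team_output: str) -> bool:
    """Whitespace-insensitive token comparison (illustrative; this mimics
    the simplest behaviour of a default validator)."""
    return judge_answer.split() == team_output.split()

def main() -> int:
    # Assumed invocation: validator input judge_answer feedbackdir [flags] < team_output
    _input_file, answer_file, feedback_dir = sys.argv[1], sys.argv[2], sys.argv[3]
    judge_answer = Path(answer_file).read_text()
    team_output = sys.stdin.read()
    if tokens_match(judge_answer, team_output):
        return 42  # accepted
    # Leave a short message for the judges in the feedback directory.
    Path(feedback_dir, "judgemessage.txt").write_text("token mismatch\n")
    return 43  # wrong answer

if __name__ == "__main__" and len(sys.argv) >= 4:
    sys.exit(main())
```

Note that with plain token comparison 043 and 43 do not match, which is exactly the integer-representation question raised above.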
Sensible defaults for scoring problems.
Indeed!
@thorehusfeldt I like it a lot! It satisfies both my need for explanations and specification-level exactness. :shipit:
Thank you. We should focus on core.rst and format.rst (and in particular move even more stuff out of it, such as non-ICPC esoterica like attachments.) I’ve only now started on settings.rst. The rest is very fragile and doesn’t deserve your attention.
On that note, it's a bit weird that attachments are non-ICPC. Aren't they generally useful, and have indeed been used with interactive problems providing a testing tool? I guess they were also pushed out to people's desktops, but it feels very useful to be able to distribute them via the judging interface. @niemela @eldering
DOMjudge even distributes them already in the UI
After a first very quick read through I think this is much better. (Mostly because it's trying to be a helpful guide; the old text was only trying to be a specification.) I have not verified that the rewrite didn't change the specification in some subtle way.
I agree with @jsannemo that it's conceivable that by having a formal specification and a separate tutorial you could get something better at both. But I also agree this text actually seems to do both really well, so at this point I don't see any need of another text.
Obviously (?) rewriting to this style of text and moving to readthedocs are orthogonal questions. (Right @thorehusfeldt? Or is there something I'm missing?)
I don't like how this text is making the "ICPC specification" seem like the norm. (e.g. "Nothing of this is part of the ICPC specification" in "Scoring Problems"). I don't know how to fix this (or, t.b.h., exactly what I mean by that).
On that note, it's a bit weird that attachments are non-ICPC.
Yes, it is.
I'm leaning more and more towards removing all mentions of the non-ICPC set from the spec. This would instead move this to the CCS specification document. (This is how we originally did it.)
Re: ICPC, we just need a way to express this, such as saying “an ICPC-type problem.” There are many ways of doing this, any of them would be fine. Two ideas are to allow type: "icpc" or add a flag icpc: bool to problem.yaml. This would make many things much clearer. (And I don't understand the objections to it.)
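Purely to illustrate the two proposals (neither key exists in the current spec, and the names are just the ones floated here), the alternatives would look something like this in problem.yaml:

```yaml
# Alternative 1: fold it into the existing `type` key
type: "icpc"

# Alternative 2: a separate boolean flag alongside the usual type
type: pass-fail
icpc: true
```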
Re: read the docs.
There are three components:

1. Change the structure.
2. Use RST, a more powerful markup than Markdown, for instance allowing source code inclusion.
3. Host the result on readthedocs (rather than GitHub Pages).

They are orthogonal and here ordered by importance to me. The present issue is about (1).
Re: ICPC, we just need a way to express this, such as saying “an ICPC-type problem.” There are many ways of doing this, any of them would be fine. Two ideas are to allow type: "icpc" or add a flag icpc: bool to problem.yaml. This would make many things much clearer. (And I don't understand the objections to it.)
My objection is that I don't think this is what you actually want.
From an ICPC standpoint, they want to say: we use format X for making problems, and all judging systems used at the world finals have to support everything in it. First off, that's a bad reason to give importance to in a specification.
Also, the ICPC subset loses some of its value retroactively: once a new version is used at WF, what do the ICPC versions of the old specs really say? Furthermore, at WF, if judges would want to use some functionality that is present in the systems used at that specific WF, would they be restricted from it just because of the flag (if they know that e.g. Kattis and DOMJudge are used a certain year)?
Related to that is the fact that DOMjudge now uses this format as its primary one. They primarily support this ICPC subset. But they don't promise to only support that, and having the ICPC subset be defined as "whatever the intersection of Kattis & DOMjudge functionality is at the time of publication" is weird. What is DOMjudge supposed to do if it receives a package that claims to "be ICPC" but isn't? Error out? Warn? If systems can determine what they support, clearly they should just do that.
So what does a flag in the metadata give here, exactly? It doesn't provide much value to me as a problem package re-user: I won't a priori know if a judge can consume my package anyway (an arbitrary system isn't required to support the ICPC subset anyway). As a judging system, I would want to detect if functionality from a version I don't support is used no matter what the ICPC flag is, and I certainly won't error out if you use too much.
What is left is for an ICPC WF judge to, when writing a problem, determine whether their problem is guaranteed to be OK at WF -- i.e. that their problem uses only that subset. I think they are competent enough to use a --icpc flag to their problem verifier to check this, rather than forcing an organization-specific implementation detail into the spec.
Two ideas are to allow type: "icpc" or add a flag icpc: bool to problem.yaml. This would make many things much clearer. (And I don't understand the objections to it.)
Two main objections (which maybe are the same?):
Being type: icpc is not at all like being type: scoring (or type: pass-fail). It's not a well-defined property with at least a somewhat clear definition. It's actually just "what the ICPC WF specification has decided is required from a system that is to be used at WF" and thus (maybe) a suitable subset to consider for ICPC contests. There are ICPC contests that use parts of the specification outside of the ICPC subset. E.g. interactive problems were used at many contests before they were added to the ICPC subset (both as a general concept, and the specific version of that idea that's defined in the specification). It was only when the WF judges decided that they wanted to be able to have interactive problems that it was moved to the ICPC subset.
It can be deduced (fairly easily) from the rest of the package whether the problem is of "type: icpc". This is just duplicating information. I think it's a good idea for tooling to have a way to say "please (strictly) verify that this problem package only uses the ICPC subset"; I don't think it's a good idea for the package to duplicate this information in itself.
Also, the fact that you think about what the ICPC subset is like this, is kinda the reason why I would like to maybe remove mention of it completely from the specification. I still think it is a good idea for tooling to support the concept of "ICPC subset" though...
and having the ICPC subset be defined as "whatever the intersection of Kattis & DOMJudge functionality is at the time of publication" is weird. What is DOMJudge supposed to do if it receives a package that claims to "be ICPC" but isn't?
Just to be clear (and I know you are not actually saying this), this is not and has never been the definition of the ICPC subset. The definition of the ICPC subset is and has always been "what the ICPC standardization group has agreed to require from contest systems that are to be used at WF". This of course, at least in the past several years, has a strong correlation with "whatever the intersection of Kattis & DOMJudge functionality is at the time". (Maybe a better name would be the "ICPC WF subset"? I don't think it's better, but it is more accurate).
I think your point 1. is also a good indication of
I'm leaning more and more towards removing all mentions of the non-ICPC set from the spec. This would instead move this to the CCS specification document. (This is how we originally did it.)
being the way to go.
For example, assume for the sake of fantasy that version 2023-07 is so good that only editorial changes are made for 3 years, but some ICPC judge wants to introduce a nextpass problem! Is that now a new version of the spec, such that we should make 2026-07 just because that subset changed? Clearly not, that would be absurd: the ICPC subset must be able to live separately of editions of this problem package format.
Re: read the docs.
There are three components:
1. Change the structure
I think the structure changes are an improvement. (1a)
I'm not (at all) sure that the splitting into many very small pages is better. I can see that it's less scary, but having the right things first already improves a lot with respect to that. With the small pages you can't search in the document, because there are many. And our entire document is still pretty short anyway. Feel free to convince me. (1b)
2. Use RST, a more powerful markup than Markdown,
Where is RST format documented?
for instance allowing the source code inclusion
That's nice, but we don't really use it much anyway.
3. Host the result on readthedocs (rather than GitHub Pages).
Why is that better?
They are orthogonal and here ordered by importance to me. The present issue is about (1).
Sounds good. We'll get to 2 and 3 in time.
So far it sounds like everyone agrees that having a structure more aimed at humans is great (1a above).
What about 1b?
I'm fine with rST.
I kind of like the separate page thing, but I'm not sure why?
After some soul-searching, I changed https://problem-package-format-copy.readthedocs.io/en/latest/ to a single-page structure, and agree with @niemela that it is slightly better.
Ah, how sad. I actually started liking the multi-page structure, for mainly two reasons:
Both of these points are much better over at https://www.kattis.com/problem-package-format/spec/2023-07-draft.html though, so maybe that's why I preferred the multi-page version on your copy.
Changed to sphinx_rtd_theme in https://problem-package-format-copy.readthedocs.io/en/latest/, which has a fixed navigation structure. This is my best (and presumably final) attempt to reconcile the conflicting constraints from @niemela and @jsannemo about presentational issues. I will be focussing on content now.
Ah, much better (and an actually decent mobile styling)!
Note also: there are some recent changes to author and testdata.yaml missing, which I’ll incorporate tonight after getting .test.sh working again.
I find this much superior by now. I’ll send a PR to this repo soon, even though it basically just creates a completely new repo without much useful history. If there are other suggestions for how to migrate, I’m all ears.
The current specification starts with esoterica about what the entry point to a prolog problem is, then explains problem metadata such as rules for source_url, licensing, and what the system defaults for compilation_memory are. Halfway down the text(!) we finally get to problem_statement. Apart from this (and related) mistakes in ordering a document aimed at humans, the documentation also gives undue weight to issues such as custom output validation or grading, which are used in very few problems but take up a lot of room.
Instead, there should be a way to read the documentation “breadth-first”, so that a relatively short document completely explains the relevant parts for 90% of the problems, enough for somebody who wants to make their first own problem: pass-fail, default output validation with default settings.
I’ve given this some thought, and come up with the structure laid out in https://problem-package-format-copy.readthedocs.io/en/latest/index.html.
In particular, the section Core Problem Package Format contains everything you need to make “Add two numbers”, with a running example. I think that core section is quite good by now.
How to cut the cake is somewhat arbitrary, but in my outline the core section is followed by more advanced concepts, which are

- testdata.yaml
- manipulating time_limit and the default output validator
- testdata.yaml and grading

There’s an attempt at adding useful examples, terminological stringency, and even a budding glossary.
The details are up in the air, but I want something like this. I need it (because I teach a course and need better documentation than what we have right now), so I’ll maintain it anyway, but for me the best outcome would be to move the documentation to such a structure that explains common things first and esoterica later.
Please discuss.