RagnarGrootKoerkamp / BAPCtools

Tools for developing ICPC-style programming contest problems.
GNU General Public License v3.0
48 stars 18 forks source link

#271 [generate] Add count #353

Closed mzuenni closed 4 months ago

mzuenni commented 4 months ago

tried to implement count... its a bit hacky. Fixes #271

RagnarGrootKoerkamp commented 4 months ago

Hmm this makes a test fail. Didn't check yet but maybe the seeding subtly changed? Would be best to avoid that if we can.

Also this will need some tests of its own and docs.

mzuenni commented 4 months ago

maybe it would be nice to be able to use count "inline" i.e. combining it with this: - name: gen {seed} instead of needing to write this:

- name:
    generator: gen {seed}
    count: 5
RagnarGrootKoerkamp commented 4 months ago

Hmm, maybe something like {seed:count=5}? But that's already abused for changing the seed value.

Or completely replace {seed} by {count=N}?

On the other hand this is already a shorthand to generate multiple cases; spelling them out a bit more verbosely to clearly signal that this creates multiple cases may not be so bad?

mzuenni commented 4 months ago

beeing explicit with count is good it is just unusual to write the generate key as its the default else :D

RagnarGrootKoerkamp commented 4 months ago

Fair point. One option that we shouldn't do is

- name: <gen>
  count: cnt

but that is super ugly and also doesn't work for non-numbered cases.

mzuenni commented 4 months ago

that also doesn't work with the yaml parser if i remember correctly

mpsijm commented 4 months ago

Can you also add this key to the generators.yaml in the documentation folder, and perhaps also in the commented-out part of the skeleton? :slightly_smiling_face:

mzuenni commented 4 months ago

can someone update the cue to include count?

thorehusfeldt commented 4 months ago

can someone update the cue to include count?

Yes; maybe this is more of a regex question than a cue question.

Here’s the current regex for command:

// A command invokes a generator, like "tree --n 5".
// The regex restricts occurrences of curly-bracketed expressions
// to things like "tree --random --seed {seed:5}"
command: !="" & (=~"^[^{}]*(\\{seed(:[0-9]+)?\\}[^{}]*)*$")

The important part is the regex. It disallows curly-brackets except for {seed} and possibly stuff like {seed:2316}.

To make the new, better, regex, I’d need some examples of what is allowed (and what is forbidden) with {count}.

While we’re here: does {name} still exist? (Current CUE regex forbids it, as you can see, so it appears nowhere in the repo yaml.) It is now always "testcase", right?

Edit: Maybe what I write above misses the mark completely, and all you did was add the key

count: int &  >= 0 // or maybe >0 ?

to #testcase? That’s even easier to fix. Just tell me what to do.

RagnarGrootKoerkamp commented 4 months ago

The syntax is as here https://github.com/RagnarGrootKoerkamp/BAPCtools/pull/353#issuecomment-1981027148 Just a new count keyword alongside generator, with a value between 1 and 100 inclusive.

thorehusfeldt commented 4 months ago

By “alongside generator” do you mean a new key for a #testcase struct/map/dict, alongside generate, such as

testcase_name:
  generate: foo --bar 2 {seed}
  count: 23

If so, is the following allowed:

testcase_name:
  in: "34 5"
  count: 23

or can count only appear in a #testcase when generate occurs?

thorehusfeldt commented 4 months ago

I added my best interpretation of the syntax, but remain confused. In doc/generators.yaml we have

      12-counted:
          - generate: graph {seed:3}
          - count: 2                       # generate two testcases at once

Is this just wrong? (My schema complains; maybe that’s a good thing.)

mzuenni commented 4 months ago

copy is only valid if there is generate: <command> entry where command contains {seed} or {seed:[0-9]+} (not sure if its possible to check the seed).

The entry looks good to me? (the indention is also right or?) never mind there should be no minus XD

thorehusfeldt commented 4 months ago

CUE actually allows

    generate?: string
    if generate != _|_ {
        if generate =~ "{seed(:[0-9]+)?}" {
          count?: int & >= 1 & <= 100
        }
    }

which is exactly the rule “make count available if generate is available and conforms to this-and-that regex”. But I decide against this; let’s have a readable but not maximally invasive schema. (CUE syntax for conditional optional fields will change on this point anyway.)

thorehusfeldt commented 4 months ago

… now we just need to get 351, 353 (which both manipulate the schema) merged… I’ll stand way back and cover my ears.

thorehusfeldt commented 4 months ago

Another TODO is to update the JSON schema at schemastore. That’s a much more painful proces. I’ll wait with that until the dust settles; you have to live with your favourite editor complaining about count fields.

RagnarGrootKoerkamp commented 4 months ago

So currently the cue allows count even if there is no generate:. Fine by me.

@mzuenni How does generate.py handle this? Does it already warn if count is set alongside in: (or anything else than generate:)?

Thinking about it, there are in fact usecases for count: alongside in:, for example when the interactor internally generates a random seed and we just need 100 empty input files. So we should either support that or warn that it's not handled.

mzuenni commented 4 months ago

We generate a parse error if count is used without a generator with seed. Yes, there are use cases for this, however, we currently forbid identical rules anyway and count is no exception to this as it just duplicates the rule.

If you want I can lift that restriction for count by including count_index in the hash for the cache entry. We then could also propose count as a solution if we find identical rules instead of just skipping those duplicates?

mzuenni commented 4 months ago

Actually my prefered way right now would be this:

RagnarGrootKoerkamp commented 4 months ago

I was gonna write that we can postpone generalizing count until we need it, but that plan sounds nice, so go ahead :)

(We could still merge this already if you feel like it.)

mzuenni commented 4 months ago

Shouldn't be too complicated :)

mzuenni commented 4 months ago

I think right now everything works as expected?

RagnarGrootKoerkamp commented 4 months ago

Can you add some count rules to the identity test problem? Also for both generate: and in: and copy: rules