dodona-edu / universal-judge

Universal judge for educational software testing
https://docs.dodona.be/en/tested
MIT License
9 stars 5 forks source link

Empty dictionary results in a succesful test #562

Open DieterPi opened 4 days ago

DieterPi commented 4 days ago

The following solution:

def aantal_uur(richting):
    return {}

is accepted in an exercise with the following yaml test suite:

- tab: Feedback
  contexts:
  - testcases:
    - expression: "aantal_uur({'6EMT': [('NED', 4), ('FRA', 4), ('NAT', 2), ('ENG',
        4), ('DUI', 3), ('ECO', 4), ('WIS', 3), ('AAR', 1), ('GES', 2), ('GOD', 2),
        ('EST', 1), ('LO', 2)], '6EWI6': [('WIS', 6), ('NED', 4), ('ECO', 4), ('ENG',
        2), ('FYS', 2), ('FRA', 3), ('BIO', 1), ('GOD', 2), ('LO', 2), ('AAR', 1),
        ('GES', 2), ('EST', 1)], '6EWI8': [('WIS', 8), ('FRA', 3), ('GOD', 2), ('GES',
        2), ('ECO', 4), ('CHE', 1), ('NED', 4), ('LO', 2), ('ENG', 2), ('GOD', 2),
        ('BIO', 1), ('FYS', 2)]})"
      return:
        NED: 12
        FRA: 10
        NAT: 2
        ENG: 8
        DUI: 3
        ECO: 12
        WIS: 17
        AAR: 2
        GES: 6
        GOD: 8
        EST: 2
        LO: 6
        FYS: 4
        BIO: 2
        CHE: 1

(this exercise: https://dodona.be/nl/courses/3770/activities/1952125562/)

pdawyndt commented 4 days ago

Issued as a bug for TESTed: https://github.com/dodona-edu/universal-judge/issues/561

pdawyndt commented 1 day ago

This does not seem like a general issue with the evaluation of maps in TESTed. TESTed behaved as expected when I tried evaluating functions returning maps in both Python and JavaScript. Option that remains open is that the issue occurs because the map was defined as a YAML-map. We already know that TESTed-DSL has two different "interpretations" for a map for a return value:

TESTed has to make a distinction between both based on a heuristic, which is where it might go wrong here. I guess we would definitely get the correct behaviour if the expected return value is specified as

return: !!expression "{'NED': 12, 'FRA': 10, ...}"

In the documentation I see we later added "typed" maps to explicitly express the meaning of YAML-maps for return values. For example, a map representing the use of an oracle to test the return value becomes:

return: !!oracle
   key: value # properties for oracle object

But is no type is given, than TESTed still relies on a heuristic. I think we should stop using the heuristic and always type maps, with a non-typed map always defaulting to a literal map as a return value. This also makes it more extensible.

A map for different programming languages has nothing explicit type for now as far as I know (also recognized heuristically), but we should also force a type in this case to avoid heuristic choices going wrong.

DieterPi commented 1 day ago

Changed the testfile to:

return: !expression "{'NED': 12, 'FRA': 10, 'NAT': 2, 'ENG': 8, 'DUI', ...}"

And the test still accepts the 'empty dictionary' solution. 🤔

pdawyndt commented 1 day ago

Did you use double exclamation before expression (should be !!expression)? Does it accept a non-empty dictionary (other than the expected one)?

DieterPi commented 1 day ago

1) I tried !!expression, then checked the docs, and according to the docs:

return

Specifieert de verwachte returnwaarde.

Standaard wordt de returnwaarde beschouwd als een YAML-waarde. Een string in het testplan zal dus ook een string worden in de testcode.

Voor geavanceerde returnwaarden zijn er twee opties:

    Een string met de tag !expression gebruikt het dezelfde Python-syntaxis te gebruiken als voor de [expressies en statements](https://docs.dodona.be/nl/references/tested/dsl/#expressies-en-statements).
    Een object met de tag !oracle is het object voor een eigen orakel (een eigen checkfunctie) (zie [hieronder](https://docs.dodona.be/nl/references/tested/dsl/#eigen-checkfuncties-orakels)).`

2) The current test file accepts both the correct solution and the empty dictionary. It does not accept any other dicts.