Reusing data functions lead to wrong results - Githubissues

ds4dm / ecole

Extensible Combinatorial Optimization Learning Environments

https://www.ecole.ai

BSD 3-Clause "New" or "Revised" License

325 stars 68 forks source link

Reusing data functions lead to wrong results #197

Open AntoinePrv opened 3 years ago

AntoinePrv commented 3 years ago

Describe the bug

Doing something like

r = ecole.reward.XXX()
ecole.environment.Branching(
    reward_function=(r + r**2),
)

Or

r = ecole.reward.XXX()
ecole.environment.Branching(
    reward_function=r,
    information_function={"r": r},
)

Might potentially give an error, or worse give wrong results, because data functions assume before_reset / extract to be called once per episode / transition.

Expected behavior

I think making the previous example work as expected is a better option than forbidding reusing function. Even if documented, the latter would still leave the door open to silent bugs.

One solution could be that extract (and before_reset) recieve an id, like transition_number, and be required to be a no-opt and return the same when a same transition_number is given again.

Additional context

Alternatively, the environment could have made cached the output of data function, but this does not work in the case of a formula because the data function are captured privately (they could be deep-copied though).

AntoinePrv commented 2 years ago

Two ways this could be handled:

1. Throw a couple of deepcopy in the parsing/operators of the data functions
1. Add a mechanism counting the transitions in the the environment and a parent method extract_xxx in data function that calls extract only once for the same transition number. This would make a copy every time (still better than recomputing).
From meeting, we gonna go with ii.