Open AntoinePrv opened 3 years ago
Two ways this could be handled:
i. deepcopy in the parsing/operators of the data functions
ii. an extract_xxx in the data function that calls extract only once for the same transition number. This would make a copy every time (still better than recomputing).
From the meeting: we are going to go with ii.
Describe the bug
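Doing something like reusing the same data function object in more than one place, or using it several times within one formula, might give an error or, worse, silently give wrong results, because data functions assume before_reset/extract to be called once per episode / transition.
Expected behavior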
I think making the previous example work as expected is a better option than forbidding the reuse of a function. Even if documented, the latter would still leave the door open to silent bugs.
One solution could be that extract (and before_reset) receive an id, like transition_number, and be required to be a no-op and return the same result when the same transition_number is given again.
Additional context
Alternatively, the environment could cache the output of the data functions, but this does not work in the case of a formula because the data functions are captured privately (they could be deep-copied though).
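To make the id-based proposal from "Expected behavior" concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not the library's implementation: CountingFunction, OncePerTransition, and the episode_number argument are hypothetical names invented for this example, and the model argument is a placeholder. The wrapper forwards to the wrapped data function only when it sees a new id, and otherwise returns a copy of the cached result.

```python
import copy

_UNSET = object()  # sentinel meaning "no id seen yet"


class CountingFunction:
    """Hypothetical stateful data function: counts the transitions it has seen."""

    def __init__(self):
        self.count = 0

    def before_reset(self, model):
        self.count = 0

    def extract(self, model, done):
        self.count += 1
        return self.count


class OncePerTransition:
    """Hypothetical wrapper: repeated calls with the same id are no-ops."""

    def __init__(self, function):
        self.function = function
        self._reset_id = _UNSET
        self._extract_id = _UNSET
        self._value = None

    def before_reset(self, model, episode_number):
        # Forward only the first call for a given episode id.
        if episode_number != self._reset_id:
            self.function.before_reset(model)
            self._reset_id = episode_number
            self._extract_id = _UNSET

    def extract(self, model, done, transition_number):
        # Compute once per transition id, then serve the cached value.
        if transition_number != self._extract_id:
            self._value = self.function.extract(model, done)
            self._extract_id = transition_number
        # Return a copy so that callers cannot alias each other's result.
        return copy.deepcopy(self._value)


# Without ids, using one object twice in the same transition changes the result.
naive = CountingFunction()
naive.before_reset(None)
naive_values = (naive.extract(None, False), naive.extract(None, False))

# With ids, the second call for the same transition returns the same value.
shared = OncePerTransition(CountingFunction())
shared.before_reset(None, episode_number=0)
shared.before_reset(None, episode_number=0)  # no-op
first = shared.extract(None, False, transition_number=0)
again = shared.extract(None, False, transition_number=0)  # cached
following = shared.extract(None, False, transition_number=1)
```

The copy on every return reflects the trade-off mentioned in the comment above: it costs a copy each time, which is still cheaper than recomputing the value.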