Closed cbhua closed 10 months ago
Attention: 1 line in your changes is missing coverage. Please review.
Files | Coverage Δ | |
---|---|---|
rl4co/models/rl/reinforce/baselines.py | 84.96% <50.00%> (ø) |
Let's take a closer look first. It might work, but not for all baselines: if the baseline uses multiple samples (i.e. symmetric and multistart), we still need the dim on which to apply the mean operation - for example here
Also @hyeok9855 you might want to have a look at this since you are working with POMO!
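To make the dim concern concrete, here is a minimal sketch of a shared baseline that averages over an explicit dimension. It uses NumPy rather than the project's tensors to stay self-contained; `shared_baseline`, the `dim=1` default, and the `[batch_size, num_starts]` layout are assumptions for illustration, not the actual code in `baselines.py`.

```python
import numpy as np

def shared_baseline(reward, dim=1):
    """Hypothetical helper: mean reward over `dim`, kept for broadcasting."""
    return reward.mean(axis=dim, keepdims=True)

# Assumed layout: [batch_size, num_starts], i.e. several rollouts per instance.
reward = np.array([[1.0, 3.0],
                   [2.0, 6.0]])

baseline = shared_baseline(reward, dim=1)  # shape [batch_size, 1]
advantage = reward - baseline              # each instance is centered on its own mean
```

With this layout every instance is compared only against its own rollouts, which is why the dimension to average over has to be known.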
I agree. We should be careful with this part. This bug was observed by @Leaveson. Let's prepare clearer code to reproduce the bug so we can check it in depth.
In the shared baseline function, the size of the `reward` is now `[batch_size]`.
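A possible starting point for the reproduction, sketched with NumPy arrays standing in for the real tensors: once the reward is flat, averaging over dim 1 is no longer valid, and averaging over dim 0 runs but mixes different instances. The values and the choice of dims are assumptions for illustration only, not taken from the actual failing run.

```python
import numpy as np

# Assumed flat reward of shape [batch_size], as described above.
flat_reward = np.array([1.0, 3.0, 2.0, 6.0])

# Averaging over dim 1 fails because a 1-D array has no second axis.
try:
    flat_reward.mean(axis=1, keepdims=True)
    dim1_failed = False
except IndexError:  # NumPy's AxisError subclasses IndexError
    dim1_failed = True

# Averaging over dim 0 runs, but the baseline is now shared across
# unrelated instances instead of across rollouts of the same instance.
mixed_baseline = flat_reward.mean(axis=0, keepdims=True)  # shape [1]
```

Checking the shape of `reward` at the point where the mean is taken should show which of the two cases a given baseline hits.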