google-deepmind / bsuite

bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent
Apache License 2.0
1.51k stars 182 forks source link

In deep_sea the threshold is set to 0.8 instead of 0.9 as the paper indicates. #13

Closed lcassano closed 5 years ago

lcassano commented 5 years ago

The threshold is set to 0.8 by default instead of 0.9 as the paper indicates "The summary ‘score’ computes the percentage of runs for which the average regret drops below 0.9 faster than the 2^N episodes expected by dithering."

def find_solution(df_in: pd.DataFrame, sweep_vars: Sequence[Text] = None, merge: bool = True, thresh: float = 0.8) -> pd.DataFrame: