yassersami / sampler


storing.join_history #4

Open yassersami opened 2 months ago

yassersami commented 2 months ago

You modified join_history as follows to handle cases where there are more samples than the sum of initial_data and max_size:

if run_condition['run_until_max_size']:
    df_out = df_out.iloc[-(initial_size + run_condition['max_size']):]

However, this slicing removes inliers that we want to retain in the final CSV file. To address this, you could call dropna on the target columns to keep only the inliers. The total size of the combined CSV should then correctly reflect (initial_data + max_size). Alternatively, we could just keep all the samples and filter them later when computing metrics.
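
A minimal sketch of the dropna suggestion, to contrast it with the tail slice. It assumes that non-inlier rows have NaN in the target columns; the helper `keep_inliers`, the column names `x` and `y`, and the toy data are hypothetical and not taken from the repository.

```python
import pandas as pd


def keep_inliers(df_out, targets):
    """Keep only rows with no NaN in the target columns.

    Assumption: rows that are not inliers have NaN targets.
    """
    return df_out.dropna(subset=targets)


# Toy history: 5 samples, 2 of them with a missing target.
df_out = pd.DataFrame({
    "x": [0.1, 0.2, 0.3, 0.4, 0.5],
    "y": [1.0, None, 3.0, None, 5.0],
})

# Tail slice: keeps the last 3 rows regardless of whether they are inliers.
tail = df_out.iloc[-3:]

# dropna: keeps the 3 inliers wherever they sit in the history.
inliers = keep_inliers(df_out, targets=["y"])
```

Here the tail slice still contains one row with a NaN target and has lost the first inlier, whereas the dropna version retains all three inliers.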