Closed qiiiibeau closed 1 year ago
We might want to add support for this feature for samplers that have a fitted attribute sample_indices_ after fit.
Otherwise, the index is meaningless. However, it would make the behaviour differ from one sampler to another, whereas a user can easily reassign the index themselves, which would be less surprising:
df_res, y_res = sampler.fit_resample(df, y)
df_res.index = df.index[sampler.sample_indices_]
@chkoar do you have any thoughts on this?
This feature was added for the RandomUnderSampler and RandomOverSampler.
Hello, I'm undersampling some imbalanced data in which each sample has a unique name as its index. I don't want to lose the samples' index after undersampling because I'm working on a graph-based task where each sample represents a node, and I need to know where it is located in the graph.
RandomUnderSampler().fit_resample() returns a dataframe whose index runs from 0 to the number of selected samples, so all the original index labels are lost. I need the result to keep the original index of the selected samples.
This improvement would help a lot for graph-based imbalanced learning and maybe also in other cases.
Thank you.