inoueakimitsu / milwrap

Wrapping single instance learning algorithms for fitting them to data for multiple instance learning
MIT License

Initial labels for imbalance data entry #11

Closed inoueakimitsu closed 2 years ago

inoueakimitsu commented 2 years ago

For data with a large imbalance between classes, the initial labels may be inappropriate.

For example, a rare class may never be assigned as an initial label, in which case it is never trained.

inoueakimitsu commented 2 years ago

One possible policy is to treat all classes equally.

Each class ranks the bags by priority. For each bag, the class under which that bag ranks highest becomes its initial label.

The pseudocode is:

n_instance[i_bag, i_class] = lower_bound[i_bag, i_class] or upper_bound[i_bag, i_class]
bag_order[:, i_class] = rankdata(n_instance[:, i_class])
initial_bag_labels[i_bag] = argmax(bag_order[i_bag, :])
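
The pseudocode above could be sketched in NumPy roughly as follows. This is only an illustration, not milwrap's actual implementation: the function name `initial_bag_labels` and the example bounds are made up for this comment, and the ranking is written out by hand (ties share the lowest rank, which should match `scipy.stats.rankdata` with `method="min"`) so that the snippet depends only on NumPy.

```python
import numpy as np


def initial_bag_labels(n_instance):
    """Pick one initial class label per bag from per-class bag rankings.

    n_instance has shape (n_bags, n_classes). Entry [i_bag, i_class] is the
    lower (or upper) bound on the number of i_class instances in the bag.
    """
    n_instance = np.asarray(n_instance)
    # Rank the bags within each class column; rank 1 = fewest instances.
    # smaller[i, j, c] is True when bag j has fewer class-c instances than bag i.
    smaller = n_instance[None, :, :] < n_instance[:, None, :]
    bag_order = smaller.sum(axis=1) + 1
    # Each bag takes the class under which it ranks highest.
    # np.argmax breaks ties in favor of the lowest class index.
    return np.argmax(bag_order, axis=1)


# Hypothetical lower bounds: class 0 is common, class 1 is rare.
lower_bound = np.array([
    [50, 0],
    [40, 0],
    [30, 2],
    [60, 1],
])

print(initial_bag_labels(lower_bound))  # [0 0 1 0]
```

In this made-up example, taking the argmax of the raw counts would label every bag as class 0. Ranking within each class first lets bag 2, which has the most class 1 instances, start with the rare label.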