autogluon / autogluon

Fast and Accurate ML in 3 Lines of Code
https://auto.gluon.ai/
Apache License 2.0

Multimodal - handling imbalanced data #2482

Open smessica opened 1 year ago

smessica commented 1 year ago

Hi, I just wanted to ask whether the multimodal predictor in AutoGluon handles imbalanced datasets automatically. Thanks!

smolix commented 1 year ago

It does, but we could always get better at it. Do you have a specific dataset or problem in mind that you'd like to try it on?

smessica commented 1 year ago

I'm working on predicting an eye disease that can cause blindness, using a dataset of patients' medical records and retina images. I'm modeling the problem both as binary and as multi-class classification (the disease has 5 stages), and both settings are highly imbalanced. Do I need to configure anything?

Thanks!

rxjx commented 1 year ago

One thing that could be done here is something similar to sklearn's class_weight='balanced', which weights each class inversely to its frequency.
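For reference, here is a minimal sketch of the "balanced" heuristic, which scikit-learn documents as n_samples / (n_classes * count_of_class). It is written in plain Python and is not part of AutoGluon. The function name `balanced_class_weights` and the 5-stage label distribution are hypothetical, chosen only to illustrate the arithmetic.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight}, where weight = n_samples / (n_classes * class_count).

    Rare classes receive proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical label column for a 5-stage problem, heavily skewed toward stage 0.
labels = [0] * 80 + [1] * 10 + [2] * 5 + [3] * 3 + [4] * 2
weights = balanced_class_weights(labels)

for stage in sorted(weights):
    print(f"stage {stage}: weight {weights[stage]:.2f}")
```

Weights like these could then scale each class's contribution in a weighted loss; how to plug them into the multimodal predictor is not specified in this thread.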

sxjscience commented 1 year ago

@smessica For now, I think you can try upsampling the rare classes in your training data before calling model.fit(). We are working on a tutorial on how to customize the loss function so that it can be balanced.
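A minimal sketch of the upsampling idea, assuming the training data can be represented as a list of row dictionaries with a label field. The helper `upsample_minority`, the column names `image` and `stage`, and the file names are hypothetical and only for illustration; nothing here is AutoGluon API.

```python
import random
from collections import Counter, defaultdict


def upsample_minority(rows, label_key, seed=0):
    """Resample rarer classes with replacement until every class
    has as many rows as the most frequent class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)  # keep every original row
        # Draw the missing rows with replacement (k is 0 for the largest class).
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical training rows: an image path plus a disease-stage label.
train_rows = (
    [{"image": f"healthy_{i}.png", "stage": 0} for i in range(6)]
    + [{"image": f"mild_{i}.png", "stage": 1} for i in range(2)]
    + [{"image": "severe_0.png", "stage": 4}]
)
balanced_rows = upsample_minority(train_rows, label_key="stage")
class_counts = Counter(row["stage"] for row in balanced_rows)
print(class_counts)
```

Only the training split should be resampled this way; validation and test data should keep their original class distribution so that the reported metrics stay meaningful. The balanced rows can then be converted to whatever tabular format the predictor expects before calling fit().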

rob7112 commented 1 year ago

I'm wondering about the status of the tutorial you referred to above or, even better, about the implementation of something like sample_weight='balance_weight' from TabularPredictor for MultiModalPredictor. Thanks!

Crispy13 commented 10 months ago

Any update?

cfz1998 commented 3 weeks ago

Any update?