dennybritz / cnn-text-classification-tf

Convolutional Neural Network for Text Classification in Tensorflow
Apache License 2.0
5.65k stars 2.77k forks

Training batches for imbalanced datasets #151

Open antriksh63 opened 6 years ago

antriksh63 commented 6 years ago

I have an imbalanced dataset with around 100 entries of the positive class and 4000 entries of the negative class. One way to create the training batches would be to take 100 positive entries and 100 negative entries and then let the code proceed as normal. However, this discards most of the negative data and has a high chance of overfitting. I think a better option is to have an equal number of positive and negative entries in every training batch (batch_size/2 positive and batch_size/2 negative entries), drawn from the full dataset. How can I do this?
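For what it's worth, here is a minimal sketch of the batch_size/2 idea using NumPy. The function name `balanced_batch_iter` and its parameters are hypothetical and not part of this repository. It assumes `y` is a 1-D array of binary labels (1 = positive, 0 = negative); if your labels are one-hot, something like `np.argmax(y, axis=1)` could be used to get that array first. A class is sampled with replacement only when it has fewer than batch_size/2 entries.

```python
import numpy as np

def balanced_batch_iter(x, y, batch_size, num_batches, seed=None):
    """Yield (x_batch, y_batch) pairs with batch_size // 2 examples per class.

    Hypothetical helper, not part of the repository.
    Assumes y is a 1-D array of binary labels (0 = negative, 1 = positive).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    y = np.asarray(y)

    # Indices of each class in the full dataset.
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    half = batch_size // 2

    for _ in range(num_batches):
        # Sample with replacement only if the class is smaller than half a batch.
        pos = rng.choice(pos_idx, size=half, replace=len(pos_idx) < half)
        neg = rng.choice(neg_idx, size=half, replace=len(neg_idx) < half)

        # Mix the two halves so the classes are not ordered within the batch.
        idx = np.concatenate([pos, neg])
        rng.shuffle(idx)
        yield x[idx], y[idx]
```

Each batch then sees fresh negatives from all 4000 entries, while the 100 positives are reused across batches. Positives will repeat far more often than negatives, so overfitting to the positive class is still a risk and validation on an untouched, imbalanced split is advisable.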

self-ms commented 12 months ago

If your data is embedded and labels are available, you can use the following repository: https://github.com/ms-unlimit/Transformer-Based-Machine-Learning-Framework
