Open purp172 opened 2 years ago
Hello @Diogo-da-Silva-Rebelo, SMOGN was developed for regression. It seems like your problem is a classification one? If that is the case, then SMOGN would not be useful. You may want to see if SMOTE is more appropriate. Thank you.
Hello, @nickkunz! Thank you for responding. I don't think that's the case: I want to predict the number of people in the room, not a specific class (that is, not whether the room is occupied or empty). In fact, the target can take many values, not just a restricted set. However, the target values must be integers, because we can't have 1.2 people in the room :) So it is a regression problem. When I said that the dataset only has four values, I didn't mean that other values can't appear, for instance in my test dataset. The algorithm is leaving all rows with target = 0 untouched, even though that is the majority value. And it's not balancing the data, since the counts for the other values remain unchanged. What are your thoughts?
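To make the "not balancing" and "extra value" observations easier to check, here is a minimal sketch that counts target values before and after resampling and rounds any non-integer synthetic targets back to whole numbers. The lists `before` and `after` are made-up placeholders, not rows from the real dataset, and `count_targets` and `round_targets` are hypothetical helper names. It does not call SMOGN at all. It only illustrates, as an assumption, how a continuous synthetic target such as 3.7 could end up as a new value 4 once rounded.

```python
from collections import Counter


def count_targets(values):
    """Count how often each target value occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))


def round_targets(values, lower=0):
    """Round targets to whole numbers, never going below `lower`."""
    return [max(lower, int(round(v))) for v in values]


# Hypothetical target columns (assumed for illustration only).
before = [0, 0, 0, 0, 0, 0, 1, 2, 3]
after = [0, 0, 0, 0, 0, 0, 1, 1.4, 2, 2.6, 3, 3.7]

print("before:", count_targets(before))
print("after: ", count_targets(round_targets(after)))
```

Comparing the two printed dictionaries shows whether the majority value shrank, whether the rare values grew, and whether any value outside the original range appeared.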
Hey! Any idea why the algorithm is creating a new class (value) for my target? I'm analyzing the Room_Occupancy_Dataset from Kaggle, and in this dataset the target only has four values for occupancy (0, 1, 2, or 3 people in the room), but the model is expected to be able to predict other cases with more than 3 people in the room. SMOGN is not balancing the data correctly, because the majority class (0) stays the same size and the minority classes (1, 2, 3) are not over-sampled. Plus, it creates an extra value (4). I don't know if this is a bug, but I hope you can help me fix it. This is my 2D array: