Sourajit-Maity / juvdv2-vdvwc


Number of instances for various classes vary ridiculously. #4

Open vegam05 opened 3 months ago

vegam05 commented 3 months ago

As I said, the number of instances per class in the dataset varies enormously. I fine-tuned a model and found that its scores were very low. The first thing I did was count the instances per class to check whether some classes were under-sampled, and they indeed are. I wrote a script to count the total number of instances for each class and got:

2024-07-04 17:21:26,413 - Class counts:
2024-07-04 17:21:26,414 - car: 10146
2024-07-04 17:21:26,414 - bike: 2379
2024-07-04 17:21:26,414 - auto: 1020
2024-07-04 17:21:26,414 - rickshaw: 1195
2024-07-04 17:21:26,414 - cycle van: 45
2024-07-04 17:21:26,414 - cycle: 291
2024-07-04 17:21:26,414 - taxi: 315
2024-07-04 17:21:26,414 - bus: 387
2024-07-04 17:21:26,414 - truck: 331
2024-07-04 17:21:26,414 - van: 336
2024-07-04 17:21:26,414 - minitruck: 343
2024-07-04 17:21:26,414 - boat: 166
2024-07-04 17:21:26,414 - motorvan: 9
2024-07-04 17:21:26,414 - toto: 132
2024-07-04 17:21:26,414 - train: 2

As you can see, the distribution is ridiculous. The challenge states that we have to devise an efficient algorithm or strategy for training a model on a dataset with diverse environmental conditions, but how can we focus on that when the dataset itself is so heavily biased towards a few specific classes? Assigning weights or oversampling would require incorporating lakhs of images for the under-represented classes, which would in turn increase the training time and require more resources. As students, we don't think we will be able to acquire such computational resources either. Kindly look into this issue.
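To put a number on the imbalance, here is a minimal sketch that takes the counts from the log above and derives inverse-frequency class weights. The function name `inverse_frequency_weights` and the weighting formula (total / (number of classes × class count)) are illustrative assumptions, not something the challenge organisers or the dataset authors prescribe. Whether such weights actually help for this dataset, or can be plugged into a particular detector's loss, is untested here.

```python
# Per-class instance counts, copied from the log in this comment.
class_counts = {
    "car": 10146, "bike": 2379, "auto": 1020, "rickshaw": 1195,
    "cycle van": 45, "cycle": 291, "taxi": 315, "bus": 387,
    "truck": 331, "van": 336, "minitruck": 343, "boat": 166,
    "motorvan": 9, "toto": 132, "train": 2,
}


def inverse_frequency_weights(counts):
    """Hypothetical helper: weight_c = total / (num_classes * count_c).

    Rare classes get large weights, frequent classes get small ones,
    and the count-weighted sum of the weights equals the total count.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * c) for name, c in counts.items()}


weights = inverse_frequency_weights(class_counts)

# Ratio between the most and least frequent class (car vs. train).
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

if __name__ == "__main__":
    print(f"total instances: {sum(class_counts.values())}")
    print(f"imbalance ratio: {imbalance_ratio:.0f}")
    # List classes from rarest (largest weight) to most common.
    for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{name:>10}: count={class_counts[name]:>5}  weight={w:.3f}")
```

The weights only rescale each class's contribution; they do not add images, so on their own they would not increase the dataset size. With only 2 `train` and 9 `motorvan` instances, though, a weight this large is likely to be noisy, so capping the weights or merging the rarest classes may be worth considering.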

5rujana commented 3 months ago

Reach out to them via mail; they are responding quickly there.

5rujana commented 2 months ago

Hey, do you know how to submit the GitHub repository?

vegam05 commented 2 months ago

No, it seems they haven't provided an option for submission yet. It should be available on the website itself, right? The deadline is the 15th, which is today, yet there has been no communication or mail from their side! P.S.: They have now provided a Google Forms link for code submission on their website, and the deadline has been extended till the 25th.