Closed: msharara1998 closed this issue 1 year ago
I agree that this is mostly caused by class imbalance, which mirrors real-world social media, where genuine users far outnumber bots (genuine users >> bots). Maybe you could a) use ML techniques that counter class imbalance during model training, or b) create a more balanced subset of TwiBot-22?
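To make options a) and b) concrete, here is a minimal sketch using only the Python standard library. It is not taken from the TwiBot-22 baselines. The helper names `balanced_class_weights` and `undersample` are hypothetical, and the 90/10 label split is made-up toy data, not the real TwiBot-22 ratio. The weighting formula, `n_samples / (n_classes * class_count)`, is the common "balanced" heuristic; the resulting weights would typically be passed to a loss function or classifier that accepts per-class weights.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Option a): weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def undersample(samples, labels, seed=0):
    """Option b): randomly drop samples until every class matches the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    # Size of the rarest class sets the per-class quota.
    target = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        for sample in rng.sample(group, target):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return balanced


# Toy data (assumption): 0 = genuine user, 1 = bot, with an invented 90/10 split.
labels = [0] * 90 + [1] * 10
samples = list(range(len(labels)))  # stand-ins for user feature vectors

weights = balanced_class_weights(labels)
subset = undersample(samples, labels)

print(weights)
print(Counter(label for _, label in subset))
```

One caveat on b): a model evaluated on a balanced subset is no longer being scored on the real-world class distribution, so results on the subset and on the full dataset are not directly comparable.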
I'll consider these approaches, thanks!
Hello, thank you for your hard work on this dataset. I noticed that most of the baseline algorithms perform very poorly on TwiBot-22, particularly in terms of F1 score, since the dataset is not balanced. Meanwhile, the same algorithms achieve much higher F1 scores on TwiBot-20 and other benchmarks. Does this point to a problem in the dataset itself? What explains the low performance? Thanks in advance.