Open deadmau5p opened 7 months ago
Adding more training data does not improve accuracy much, which is somewhat surprising.
I am running hyperparameter tuning on the RoBERTa model here: https://wandb.ai/aljazpotocnik/kaggle-ai-detection/sweeps/k2uv4pxq?workspace=user-aljazpotocnik
We should add more data to train the RoBERTa model. This dataset looks good: https://www.kaggle.com/datasets/carlmcbrideellis/llm-mistral-7b-instruct-texts. It contains only AI-generated essays, which is useful because they are underrepresented in the original competition dataset.
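
A rough sketch of how the extra essays could be merged in and the class balance checked before retraining. This is only an illustration with toy in-memory data: the `(text, label)` row layout, the label convention (0 = human, 1 = AI-generated), and the helper names `add_ai_essays` and `label_counts` are assumptions, not the actual column names or code from either dataset.

```python
from collections import Counter


def add_ai_essays(train_rows, ai_texts):
    """Append AI-generated essays to (text, label) rows, skipping duplicate texts.

    Assumed label convention (hypothetical): 0 = human-written, 1 = AI-generated.
    """
    seen = {text for text, _ in train_rows}
    merged = list(train_rows)
    for text in ai_texts:
        if text not in seen:  # avoid adding the same essay twice
            merged.append((text, 1))
            seen.add(text)
    return merged


def label_counts(rows):
    """Count how many rows carry each label."""
    return Counter(label for _, label in rows)


# Toy stand-ins for the competition data and the extra AI-only essays.
train_rows = [
    ("human essay A", 0),
    ("human essay B", 0),
    ("human essay C", 0),
    ("ai essay X", 1),
]
ai_texts = ["ai essay X", "ai essay Y", "ai essay Z"]

merged = add_ai_essays(train_rows, ai_texts)
print("before:", dict(label_counts(train_rows)))
print("after: ", dict(label_counts(merged)))
```

Comparing the label counts before and after the merge makes it easy to see whether the added essays actually close the gap between the two classes.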