MaikeZuefle closed this 11 months ago
Hello!
We are getting quite close to the deadline (September 1, 11:59PM anywhere on earth), so I wanted to remind you that your PR still needs some attention: one of the automated checks failed.
Please don't forget to submit your accompanying paper to Openreview via https://openreview.net/group?id=GenBench.org/2023/Workshop by September 1.
Good luck finalising your PR and paper, and feel free to tag us if you have questions.
Cheers,
Verna
On behalf of the GenBench team
@vernadankers @dieuwkehupkes I tried to submit our data split. Unfortunately, the test_task check failed.
Error: Task with id 'latent_feature_split' does not exist. Please specify a valid task id.
Error: Process completed with exit code 2.
However, our task id is "latent_feature_based_data_split" and not "latent_feature_split".
Could you run the checks using "genbench-cli test-task --id latent_feature_based_data_split", which is the new task id? I do not get an error when I run this.
In comparison to the sample submission in August, I changed the title of our task and also added subtasks. I see that these exist under "Files changed". Is there a way for me to change the id?
@MaikeZuefle We're in the process of merging the tasks into the repo. In order to merge your task, we need the following changes:
Please host the dataset files somewhere else and submit a new PR without the files (even if you remove the files from the current PR, they will still remain in the git history).
Could you please include a single file usage_example.py for each task, in which you use the task for finetuning and evaluating a model the way you intend your task to be used? Preferably, this should be done with a commonly used pretrained Hugging Face model. Please also include requirements-usage-example.txt listing the Python dependencies that need to be installed to run the example.
Hate Speech Detection
This project aims to go beyond the random train-test split by developing a more challenging data-splitting process to better evaluate generalisation performance. We rely on a model's internal representations to create the data split: we cluster the internal representations and assign each cluster to either the train or the test set. Hate speech detection is used as a testing ground for developing the splitting method.
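The splitting idea described above can be illustrated with a minimal sketch. It is not the code from this PR: the clustering algorithm (plain k-means), the function names `kmeans` and `cluster_based_split`, and the parameters `k` and `test_fraction` are all assumptions chosen for illustration, and the input vectors stand in for representations that would already have been extracted from a model.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (an assumed choice); returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def cluster_based_split(vectors, k=4, test_fraction=0.25, seed=0):
    """Assign whole clusters to train or test (hypothetical helper).

    Returns (train_indices, test_indices, labels).
    """
    labels = kmeans(vectors, k, seed=seed)
    cluster_ids = sorted(set(labels))
    random.Random(seed).shuffle(cluster_ids)

    target = test_fraction * len(vectors)
    test_clusters, test_size = set(), 0
    # Move clusters to the test set until it reaches the target size;
    # the last cluster always stays in the train set.
    for c in cluster_ids[:-1]:
        if test_size >= target:
            break
        test_clusters.add(c)
        test_size += labels.count(c)

    train_idx = [i for i, lab in enumerate(labels) if lab not in test_clusters]
    test_idx = [i for i, lab in enumerate(labels) if lab in test_clusters]
    return train_idx, test_idx, labels
```

Because each cluster is placed entirely on one side, test examples come from regions of the representation space that the train set does not cover, which is what makes such a split harder than a random one.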
Authors
m.s.zufle@sms.ed.ac.uk
v.dankers@sms.ed.ac.uk
ititov@inf.ed.ac.uk
Checklist:
genbench-cli test-task
tool.