kamalojasv181 / Hostility-Detection-in-Hindi-Posts


Reproduce results #4

Open · adita15 opened this issue 3 years ago

adita15 commented 3 years ago

I am trying to reproduce these results and I am quite confused about the structure of the repository. Could you provide detailed setup instructions?

kamalojasv181 commented 3 years ago

Can you please point out which part in particular confuses you?

kamalojasv181 commented 3 years ago

I will write a generic workflow.

  1. Get the dataset. We have not posted it here due to the policy of the Constraint Shared Task; register with the organisers to obtain it and put it in the Dataset folder.
  2. The dataset must have exactly two columns: 1) the text and 2) the labels. (We deleted the first row containing the column names and the first column containing the serial number of each tweet.)
  3. If you want to train the models yourself, make a directory named models and run main_multitask_learning.py or main_bin_classification.py. If you wish to use our models, download them into the models folder in this directory.
  4. You can then write your own script to generate results, or use our script, whichever is more convenient.

For anything specific, feel free to ask. A rough sketch of the dataset preparation in step 2 follows below.
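Not the repository's own code, just a minimal sketch of step 2, assuming the shared-task file has a serial-number column followed by the text and the labels; the file name hindi_train.csv is hypothetical.

```python
# Minimal sketch of the dataset preparation, assuming the original CSV has
# three columns: serial number, text, labels. Adjust the file name to match
# whatever the Constraint organisers give you.
import pandas as pd

df = pd.read_csv("hindi_train.csv")   # hypothetical file name from the shared task
df = df.iloc[:, 1:3]                  # drop the serial-number column, keep text and labels
# write without the header row, so the scripts see a plain two-column CSV
df.to_csv("Dataset/train.csv", header=False, index=False)
```

Repeat the same for the valid and test files.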

adita15 commented 3 years ago

Thank you for the prompt response.

These are the points that I want to confirm:

  1. Is the dataset passed to the model (combined) a combination of the train, val, and test files from the original dataset?
  2. The number of epochs to train for to get the reported results.
  3. The model to pass as a parameter when training (should it be 'mrm8488/HindiBERTa', 'ai4bharat/indic-bert', or something else to get the results for the auxiliary model?).
  4. Which script counts as baseline and which one counts as auxiliary?

Thank you, Aditi Damle.


kamalojasv181 commented 3 years ago

1) No; for training use the training data, for validation use the validation data, and generate the CSV on the test data.
2) Use 10 epochs for all models.
3) Pass the model you want to generate results for.
4) The baseline model is the one described in the paper by the workshop organisers (https://arxiv.org/abs/2011.03588). For the auxiliary approach, use main_multitask_learning.py.

I can see why this might confuse someone; we were naive about the code. For now, use this info, and I will update the repo in a day or two. A rough sketch of how points 2) and 3) fit together is below.
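To make points 2) and 3) concrete, here is a hedged sketch (not the repository's training code) of how the model name and epoch count would typically be wired up with Hugging Face transformers; the num_labels value is an assumption for the binary task.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "ai4bharat/indic-bert"   # or "mrm8488/HindiBERTa", whichever result you want to reproduce
NUM_EPOCHS = 10                       # the same for all models, as stated above

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# num_labels=2 assumes the binary hostile/non-hostile head; the multitask
# setup would add further heads for the fine-grained labels
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
```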

adita15 commented 3 years ago

Thank you for this response. Could you just confirm the parameter passed to the classifier in place of model_path: ai4bharat/indic-bert or something else? This confusion is mainly because some of the models were experiments and only one was your selected approach, and we would like to reproduce the exact results.

Also, if I wish to use a pretrained model from your model files, where do I specify that while fine-tuning using main_multitask_classfer.py?


kamalojasv181 commented 3 years ago

1) Yes, our best results were obtained with ai4bharat/indic-bert using the auxiliary approach.

2) The binaries we released are already fine-tuned on the workshop dataset. You can either fine-tune the original ai4bharat/indic-bert model on that dataset yourself and reproduce the models we released, or use our released models directly to generate results on the test set. There is no point in fine-tuning our model on the same dataset again.
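In case it helps, a minimal sketch of the second option, assuming the released checkpoint sits in the models folder and the test CSV is prepared as described earlier; the local path and the single-head setup are assumptions, not the repository's exact generate_csv.py.

```python
import pandas as pd
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CKPT = "models/indic-bert-aux"   # hypothetical local path to a released, fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(CKPT)
model = AutoModelForSequenceClassification.from_pretrained(CKPT)
model.eval()

# test.csv prepared as above: no header, first column text, second column labels
test = pd.read_csv("Dataset/test.csv", header=None, names=["text", "label"])
with torch.no_grad():
    enc = tokenizer(list(test["text"]), padding=True, truncation=True, return_tensors="pt")
    preds = model(**enc).logits.argmax(dim=-1)
print(preds[:10])
```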

kamalojasv181 commented 3 years ago

Anything else? Should I close it?

adita15 commented 3 years ago

Hi Ojasv,

We really appreciate your responses, and sorry to trouble you so much. They have clarified most of our doubts; we just have two final queries.

  1. If we pass only the train file for training, I did not see any command in the repo to include the validation set. So how does that work?
  2. Also, from the paper, if a linear SVM is the baseline model and we train it for comparison, why is it fit only on the test file? Could you elaborate on that part?

I am attaching a screenshot of the results we got using the pre-trained model files. Kindly let us know if they look correct.

I am okay with closing the issue, and I hope I can ask you more questions in the future regarding this problem of hate speech detection. It's really amazing work!

Thank you once again, Aditi Damle.


kamalojasv181 commented 3 years ago

1) Actually, we did a sloppy job here. We combined the train and valid data in the CSV and passed the split parameter accordingly (see the sketch after this comment). For now, please bear with us; I have noted this and will fix it very soon.

2) Can you please elaborate? Are you talking about the baseline paper?
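Regarding point 1 above, a rough illustration (not the repository's code) of what combining the two CSVs and relying on a split parameter amounts to; the file names and the split logic are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

train = pd.read_csv("Dataset/train.csv", header=None, names=["text", "label"])
valid = pd.read_csv("Dataset/valid.csv", header=None, names=["text", "label"])
combined = pd.concat([train, valid], ignore_index=True)

# the "split parameter" then carves the validation portion back out;
# shuffle=False keeps the original valid rows at the end of the combined frame
train_part, val_part = train_test_split(
    combined, test_size=len(valid) / len(combined), shuffle=False
)
```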

adita15 commented 3 years ago

It's alright, we understand, as we are also students!

Here is the screenshot. The paper mentions an SVM as the baseline model. My understanding is that you train a linear SVM and then compare it against your auxiliary approach to show the improvement. In that case, the data used here to fit the SVM model is just the test set, so I am not clear about the purpose of the SVM code in generate_csv.py.

Best, Aditi


kamalojasv181 commented 3 years ago

Ok, our bad again! We tried ensembling in the generate_csv code, which did not work out for us. That is not the baseline implementation but result generation with ensembling; we forgot to delete the code. Thanks for pointing it out.

adita15 commented 3 years ago

No problem!

So I will just delete all the SVM code for now.

For the baseline, I referred to the paper. I want to ask whether you have implemented it with this dataset in your repo?


adita15 commented 3 years ago

I tried fine-tuning using your script, but I am still not able to reproduce the results; the F1 scores lag by about two points for all tasks.

siddjags commented 3 years ago

I am also facing a similar issue. Are we supposed to train the model with a batch size of 16? The current version of the code uses batch_size=8. Also, the pre-trained models do not give identical results when running generate_csv.py. Could you please help me with this? FYI, I am trying to reproduce the results for AUX Indic Bert. Here are the results obtained after running main_multitask_learning.py to train/fine-tune the model.

[Screenshot attached: results after fine-tuning, 14 Apr 2021]