tuhinjubcse / AMPERSAND-EMNLP2019

Code and Data for EMNLP 2019 paper titled AMPERSAND: Argument Mining for PERSuAsive oNline Discussions

Getting started with AMPERSAND #3

Closed faithannong closed 3 years ago

faithannong commented 3 years ago

Hi,

I'm a computer science undergraduate doing my final year project on argument mining. I came across your AMPERSAND paper and would like to use it in my project to further explore argument mining. Thank you for kindly making it available 👍

At the moment I would like to test its performance on other subreddits. I have read the supplementary PDF, the README, and parts of the BERT documentation. However, I still have some questions about how to get started running AMPERSAND:

  1. Are the models linked in the README already fine-tuned? Do we need to train them ourselves before we load and use them?

  2. Could you give a concrete example of the input data's format, i.e. the subreddit data? In a previous issue, it was said that "File format is tsv Sentence1\tLabel1 Sentence1\tLabel2 Sentence1\tLabel3", but what exactly are the sentences and labels?

  3. How do I run your full pipeline system? Does the following command run the input data through the entire pipeline, i.e. argument component classification, relation identification, RST, etc.?

    
    export GLUE_DIR=/path/to/data

    python run_classifier.py \
      --task_name ARG \
      --do_eval \
      --data_dir $GLUE_DIR/ \
      --bert_model bert-base-uncased \
      --max_seq_length 128 \
      --train_batch_size 32 \
      --learning_rate 2e-5 \
      --output_dir /tmp/output/



Thank you very much.
tuhinjubcse commented 3 years ago
  1. They are models fine-tuned on distantly labeled data. You have to load them and fine-tune them further on your labeled data (see the first sketch below this list).

  2. Sentence1, Sentence2, etc. are the sentences you are classifying, and the labels are claim / premise / non-arg. For relation classification each line is instead a pair Sentence1 \t Sentence2, where Sentence1 and Sentence2 are a pair of arguments (see the second sketch below this list).

  3. Yes, that's how you run the pipeline. For RST you have to run the parsing yourself; I don't have the files with me. You have to run argument component classification and relation classification separately: once the predicted argument components are done, you can take all pair combinations and run relation classification on them (see the last sketch below). I recommend just getting the BERT part working first.
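
To make point 1 concrete, here is a minimal sketch (not code from this repo) of loading a downloaded checkpoint so it can be fine-tuned further. It assumes the released file is a `pytorch_model.bin` state dict compatible with pytorch-pretrained-bert's `BertForSequenceClassification` (the library whose `run_classifier.py` flags appear above), that the component task has 3 labels (claim / premise / non-arg), and that the path shown is a placeholder.

    # Minimal sketch (not the authors' code): load a downloaded fine-tuned
    # checkpoint on top of BERT so it can be fine-tuned further on your own data.
    # The path, the 3-label assumption, and the use of pytorch-pretrained-bert
    # are assumptions, not confirmed details of this repo's released files.
    import torch
    from pytorch_pretrained_bert import BertForSequenceClassification

    NUM_LABELS = 3  # claim / premise / non-arg (assumed for the component task)

    # Start from the same base model the command above uses...
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=NUM_LABELS
    )

    # ...then overwrite its weights with the downloaded checkpoint
    # (placeholder path -- point this at the file linked in the README).
    state_dict = torch.load("downloaded_model/pytorch_model.bin", map_location="cpu")
    model.load_state_dict(state_dict)
    model.train()  # ready for further fine-tuning on your labeled TSVs

(If the repo's `run_classifier.py` follows the standard pytorch-pretrained-bert example, you can also do this from the command line by pointing `--bert_model` at the checkpoint directory and adding `--do_train`; check the script's argument list to confirm.)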
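
For point 2, a small sketch of what a component-classification TSV could look like; the example sentences and the file name are placeholders, not the repo's actual data. The relation-classification file follows the same idea but with a Sentence1 \t Sentence2 pair per line, as in the next sketch.

    # Sketch of the component-classification TSV layout described in point 2.
    # The sentences and file name are illustrative placeholders.
    import csv

    rows = [
        ("You should adopt a rescue dog instead of buying one.", "claim"),
        ("Shelters in most cities are already over capacity.", "premise"),
        ("Thanks for the thoughtful reply!", "non-arg"),
    ]

    with open("component.tsv", "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerows(rows)  # one "sentence<TAB>label" line per example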
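
And for point 3, a sketch of the "all pair combination" step: take the sentences the component classifier predicted as claims or premises and write every pair out as input for relation classification. The in-memory data structure, the output file name, and the choice of unordered pairs are illustrative assumptions; if your relation model treats direction as meaningful, use ordered pairs instead.

    # Sketch of the "all pair combination" step from point 3: after component
    # classification, pair up the predicted argument components and write them
    # in the Sentence1 \t Sentence2 format expected for relation classification.
    import csv
    from itertools import combinations

    # (sentence, predicted label) -- e.g. collected from the classifier's output
    predicted = [
        ("You should adopt a rescue dog instead of buying one.", "claim"),
        ("Shelters in most cities are already over capacity.", "premise"),
        ("Thanks for the thoughtful reply!", "non-arg"),
    ]

    # Keep only argumentative components, then form all candidate pairs.
    components = [s for s, label in predicted if label in ("claim", "premise")]
    pairs = list(combinations(components, 2))

    with open("relation_candidates.tsv", "w", newline="") as f:
        csv.writer(f, delimiter="\t").writerows(pairs)

    print(f"Wrote {len(pairs)} candidate pairs for relation classification")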