ArneBinder / dialam-2024-shared-task

see http://dialam.arg.tech/

candidate approaches & brainstorming ideas #2

Open · ArneBinder opened this issue 8 months ago

ArneBinder commented 8 months ago

This is meant to gather any ideas for potential baselines or enhanced approaches to tackle the shared task.

tanikina commented 8 months ago

What do we need to do?

1) Predict and annotate the YA-nodes that connect:

- the L-nodes with the I-nodes, and
- the TA-nodes with the S-nodes.

2) Predict and annotate the S-nodes that connect I-nodes (RA: inference, CA: conflict, or MA: rephrase).

What do we get as input?

A set of L- and I-nodes; the L-nodes are connected by TA-nodes that indicate the dialogue turn transitions.

For the L- and I-nodes we can use/extract information such as the node text (the locution, including the speaker, for L-nodes; the propositional content for I-nodes) and the surrounding graph structure.
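
As a concrete illustration, here is a minimal sketch of how this information could be pulled out of a single nodeset, assuming the data comes in the usual AIF-style JSON layout (`nodes` with `nodeID`/`type`/`text`, `edges` with `fromID`/`toID`); the file name is just a placeholder:

```python
import json
from collections import defaultdict

with open("nodeset.json") as f:  # placeholder file name
    nodeset = json.load(f)

nodes = {n["nodeID"]: n for n in nodeset["nodes"]}
targets = defaultdict(list)  # fromID -> list of toIDs
for e in nodeset["edges"]:
    targets[e["fromID"]].append(e["toID"])

# Texts of the locutions (L) and propositions (I)
l_texts = {i: n["text"] for i, n in nodes.items() if n["type"] == "L"}
i_texts = {i: n["text"] for i, n in nodes.items() if n["type"] == "I"}

# Dialogue turn transitions: L -> TA -> L paths
transitions = []
for ta_id, n in nodes.items():
    if n["type"] != "TA":
        continue
    sources = [s for s, tos in targets.items() if ta_id in tos]
    for src in sources:
        for tgt in targets[ta_id]:
            transitions.append((src, tgt))
```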

Existing/Recent Approaches for Relation Classification in ArgMining

  1. Simple relation classification baseline with RoBERTa (Ruiz-Dolz et al., 2021) "Transformer-Based Models for Automatic Identification of Argument Relations: A Cross-Domain Evaluation"

  2. LLM prompting with Llama-2 and Mistral (Gorur et al., 2024) "Can Large Language Models perform Relation-based Argument Mining?"

  3. GNN with graph augmentation (via adding virtual nodes) and collective classification (Zhang et al., 2023) "Argument Mining with Graph Representation Learning"

  4. Argument pair extraction with a graph matching network (Mao et al., 2023) "Seeing both sides: context-aware heterogeneous graph matching networks for extracting-related arguments"

Our Ideas

  1. Joint Modelling: First, separately train Model-YA to predict the YA nodes/edges and Model-S to predict the S nodes/edges, e.g., using a RoBERTa baseline as in (Ruiz-Dolz et al., 2021). Then train a joint model that takes the predictions from Model-YA and Model-S as additional input and combines them during training (see sketch 1 below).

  2. GNNs with pytorch_geometric: Encode all graph nodes with a GNN and then perform a "link prediction" task for the YA- and S-edges (possibly using pytorch_geometric?). Next, perform node classification to identify the exact "scheme" annotation for each node (e.g., Asserting, Arguing, etc. for YA-nodes and Inference, Conflict, Rephrase for S-nodes); see sketch 2 below.

  3. Contrastive Learning: Use contrastive learning (e.g., the SimCSE approach) as "pre-training" to make the embeddings of related nodes (those that share an edge) more similar to each other, and then proceed with pairwise classification for edge prediction and node classification (see sketch 3 below).

  4. Model Ensembling: Train a variety of models on the same task (e.g., YA node/edge prediction) and then ensemble them (e.g., combine the predictions from RoBERTa, RemBERT, etc.); see sketch 4 below. This approach is quite typical for shared tasks and usually yields good results, but it is not very interesting per se.
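
Sketch 1 (joint modelling): a minimal illustration, assuming Model-YA and Model-S are already trained and emit label distributions for a candidate node pair; all names, label counts, and dimensions are placeholders, not fixed by the task:

```python
import torch
import torch.nn as nn

class JointModel(nn.Module):
    """Combines a text encoding of a node pair with the label
    distributions predicted by the separate YA- and S-models."""

    def __init__(self, hidden_size=768, num_ya_labels=10, num_s_labels=3, num_labels=13):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size + num_ya_labels + num_s_labels, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, num_labels),
        )

    def forward(self, pair_encoding, ya_probs, s_probs):
        # pair_encoding: e.g. the [CLS] vector of a RoBERTa-encoded node pair
        # ya_probs / s_probs: softmax outputs of the two pipeline models
        features = torch.cat([pair_encoding, ya_probs, s_probs], dim=-1)
        return self.classifier(features)
```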
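Sketch 2 (GNN link prediction): a minimal pytorch_geometric sketch of the two-step setup, i.e., a GNN encoder over the full graph followed by a dot-product decoder that scores candidate YA-/S-edges (a standard link-prediction recipe). The input features could be, e.g., frozen RoBERTa embeddings of the node texts; all dimensions are placeholders:

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv

class LinkPredictor(nn.Module):
    def __init__(self, in_dim=768, hidden_dim=256):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)

    def encode(self, x, edge_index):
        # x: [num_nodes, in_dim] node features, edge_index: [2, num_edges]
        h = self.conv1(x, edge_index).relu()
        return self.conv2(h, edge_index)

    def decode(self, z, candidate_pairs):
        # candidate_pairs: [2, num_candidates]; dot product as edge score
        return (z[candidate_pairs[0]] * z[candidate_pairs[1]]).sum(dim=-1)
```

The subsequent "scheme" classification could then reuse the same node embeddings `z` with a small classification head per node type.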
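Sketch 3 (contrastive pre-training): the SimCSE-style objective, where nodes that share an edge form positive pairs and all other nodes in the batch act as in-batch negatives; the temperature of 0.05 is the common SimCSE default:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor_emb, positive_emb, temperature=0.05):
    # anchor_emb, positive_emb: [batch_size, dim]; row i of each is a connected pair
    sim = F.cosine_similarity(
        anchor_emb.unsqueeze(1), positive_emb.unsqueeze(0), dim=-1
    ) / temperature  # [batch_size, batch_size] similarity matrix
    labels = torch.arange(sim.size(0), device=sim.device)  # diagonal = true pairs
    return F.cross_entropy(sim, labels)
```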
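Sketch 4 (ensembling): simple probability averaging over independently trained models, assuming HuggingFace-style classification models whose outputs expose `.logits`:

```python
import torch

@torch.no_grad()
def ensemble_predict(models, batch):
    # Average the label distributions of all models, then take the argmax.
    probs = [torch.softmax(m(**batch).logits, dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0).argmax(dim=-1)
```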