New feature: QBAFs and their solver

jiqicn commented 11 months ago

Detailed explanation of the idea of the new feature

The primary objective is to construct a tree-structured graph over the set of reviews and to assign each argument a numerical value representing its initial strength. We will then use quantitative solvers to evaluate the arguments.

In our dataset, some reviews have received votes indicating that readers found them helpful; we call these "helpful reviews." So far we have trained classifiers whose target is whether a review was voted helpful, using features such as the length, readability, and sentiment of the review. Our goal is to introduce argumentation features and assess whether they improve the classifiers' performance. To incorporate these features quantitatively, we have chosen to work with Quantitative Bipolar Argumentation Frameworks (QBAFs).

Creating a QBAF involves several steps. One significant distinction between AFs and QBAFs is that a QBAF also includes support relations and associates each node with a weight. To construct a QBAF, we therefore need to extract arguments and their relations from our dataset and assign a base weight (strength) to each argument. In the linked article, Figure 4 demonstrates the construction of a Bipolar Argumentation Framework (BAF) that is fed into classifiers to detect deceptive reviews. We propose a similar approach for our dataset.

https://direct.mit.edu/coli/article/44/4/833/1614/Combining-Deep-Learning-and-Argumentative

Argument Mining: Our dataset already contains sets of chunks, with each chunk corresponding to a specific topic. We can treat each chunk as an argument, which we call a chunk-argument. However, we currently have 26 clusters (topics), which might be excessive given the dataset's size. Is it possible to reduce the number of clusters by merging similar ones? As shown in Figure 4 of the article, each topic i is represented by a variable, G_i, used to evaluate its quality.
We can use each chunk in cluster i as an argument, making it an ancestor of G_i.

Link Mining: The next question is how to link the chunk-arguments in cluster i to G_i. One option is to link every chunk-argument in cluster i directly to G_i, so that every chunk in the cluster is a parent of G_i. However, an alternative structure may make more sense. We could draw links as shown in Figure 4: if a and b are two chunks within cluster i, and b appears in a more recent review than a, we draw a link from b to a. If a is the first chunk in cluster i in review order, we draw a link from a to G_i. Following this approach, each cluster i forms a tree structure with G_i as the root. Since our ultimate aim is to evaluate the quality of product p, we'll represent the product with the variable G_p and draw links from each G_i to G_p.

Determining Link Types: To determine the types of links between arguments, we propose the following constraints:

1. If a is a chunk-argument, G_i represents topic i, and there is a link (a, G_i), then the link is a support relation if the sentiment of a is positive; otherwise, it's an attack relation.
2. If a and b are chunk-arguments and there is a link (b, a), then the link is a support relation if b and a share the same sentiment; otherwise, it's an attack relation.
3. For (G_i, G_p), we'll learn the type of relation after evaluating the weight of G_i. If the weight of G_i is greater than or equal to 0.5, we'll draw a support relation from G_i to G_p; otherwise, it's an attack relation.

The weight of arguments will be discussed next.

Initial Weight (Base Score) Extraction: After extracting arguments and their relations, we have a Bipolar Argumentation Framework (BAF) associated with our dataset. To turn it into a QBAF, we need to assign each argument a base score that reflects its strength.
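
A minimal Python sketch of the link-mining step and link-type constraints 1 and 2 for chunk-arguments. It rests on assumptions of ours: the field names `id`, `cluster`, `order`, and `sentiment` are hypothetical, sentiment is reduced to `"pos"` or `"neg"`, and "a link from b to a" is read as linking each chunk to the immediately preceding chunk in its cluster. Constraint 3 is left out because it needs evaluated weights.

```python
from collections import defaultdict


def build_links(chunks):
    """Return (source, target, relation) triples for chunk-arguments.

    `chunks` is a list of dicts with the hypothetical keys
    id, cluster, order (review order), and sentiment ("pos" or "neg").
    """
    by_cluster = defaultdict(list)
    for chunk in chunks:
        by_cluster[chunk["cluster"]].append(chunk)

    links = []
    for cluster, members in by_cluster.items():
        ordered = sorted(members, key=lambda c: c["order"])
        first = ordered[0]
        # Constraint 1: the earliest chunk links to G_i; positive means support.
        relation = "support" if first["sentiment"] == "pos" else "attack"
        links.append((first["id"], f"G_{cluster}", relation))
        # Constraint 2 (assumption: link to the immediately preceding chunk):
        # same sentiment means support, different sentiment means attack.
        for earlier, later in zip(ordered, ordered[1:]):
            same = later["sentiment"] == earlier["sentiment"]
            links.append((later["id"], earlier["id"], "support" if same else "attack"))
    return links


example_chunks = [
    {"id": "c1", "cluster": 0, "order": 1, "sentiment": "pos"},
    {"id": "c2", "cluster": 0, "order": 2, "sentiment": "neg"},
    {"id": "c3", "cluster": 0, "order": 3, "sentiment": "neg"},
    {"id": "c4", "cluster": 1, "order": 2, "sentiment": "neg"},
]
links = build_links(example_chunks)
for link in links:
    print(link)
```

Under this reading each cluster becomes a chain ending in G_i; linking a new chunk to some other earlier chunk instead would produce branching.
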
Initially, since we have no information about the topics in the reviews, we suggest a neutral base score of 0.5 for all G_i variables. The same applies to the base score for the product, i.e., bs(G_p) = 0.5. In the article you referenced, all chunk-arguments have base scores set to 0.5. We could also explore other possibilities, such as using sentiment scores or other quantitative measures as base scores for chunk-arguments. Which measure do you think best represents the strength of a chunk?

Quantitative Semantics: At this stage, we have a QBAF associated with our dataset. The literature presents various methods for computing the final strength of an argument from the strengths of its parents, including DF-QuAD, Euler's method, and the quadratic-energy model. All of these methods converge when the graph has a tree structure, so all three are applicable in our case. However, when an argument has both an attacker and a supporter with strength close to 1, the aggregation function of DF-QuAD moves close to 0, so the two effectively cancel out and the argument stays near its base score. Euler's method addresses this issue, but it exhibits an imbalance between attackers and supporters. The quadratic-energy model resolves both of these concerns, so it is our recommendation for achieving better results. Implementations of these methods are available in the following repository, which we can also use to compare their results.

https://github.com/nicopotyka/Uncertainpy/blob/master/examples/gradual/gradual_argumentation_examples.ipynb

https://www.ifaamas.org/Proceedings/aamas2019/pdfs/p1722.pdf ([backup link](https://arxiv.org/pdf/1809.07133.pdf))

Evaluating the Impact of Reviews: Our primary goal is to assess the impact of each review on the classifiers. To achieve this, we first select an evaluation method, such as DF-QuAD, and use it to calculate the weight of G_p.
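
For reference, here is a small Python sketch of how a weight such as that of G_p could be computed bottom-up on a tree, using our reading of the DF-QuAD and quadratic-energy update rules from the linked papers. It is not the Uncertainpy API: the function names and the `(base, attackers, supporters)` tuple layout are our own, and the leaf strengths of 1.0 and 0.8 are illustrative values chosen to show the saturation effect rather than the proposed 0.5 base scores.

```python
from math import prod


def dfquad(base, attackers, supporters):
    """DF-QuAD strength from the final strengths of direct attackers/supporters."""
    attack = 1 - prod(1 - a for a in attackers)    # 0 when there are no attackers
    support = 1 - prod(1 - s for s in supporters)  # 0 when there are no supporters
    if attack >= support:
        return base - base * (attack - support)
    return base + (1 - base) * (support - attack)


def quadratic_energy(base, attackers, supporters):
    """Quadratic-energy strength; on acyclic graphs one bottom-up pass suffices."""
    energy = sum(supporters) - sum(attackers)

    def h(x):
        positive = max(x, 0.0)
        return positive ** 2 / (1 + positive ** 2)

    return base - base * h(-energy) + (1 - base) * h(energy)


def strength(node, semantics):
    """Evaluate a tree node given as (base, attacker_nodes, supporter_nodes)."""
    base, attackers, supporters = node
    return semantics(
        base,
        [strength(a, semantics) for a in attackers],
        [strength(s, semantics) for s in supporters],
    )


def leaf(base):
    return (base, [], [])


# One attacker at 1.0, supporters at 1.0 and 0.8, neutral base score 0.5.
root = (0.5, [leaf(1.0)], [leaf(1.0), leaf(0.8)])
print(strength(root, dfquad))            # 0.5: the extra supporter has no effect
print(strength(root, quadratic_energy))  # about 0.695
```

With DF-QuAD the aggregated attack and support both saturate at 1, so the second supporter changes nothing, whereas the quadratic-energy model still registers it.
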
Next, to evaluate the impact of a review r_i on this weight, we remove r_i from our list of reviews, construct a new graph without it, and recalculate the weight of G_p. The impact of r_i is the difference between the new weight and the previous weight of G_p. We repeat this process for all reviews. The resulting values serve as a new feature in the classifiers. We anticipate that this new feature will positively impact the performance of the classifiers, as observed in similar works.
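
The leave-one-out procedure can be sketched as below. This is only an outline under our own assumptions: `product_strength` stands for whatever pipeline builds the QBAF from a list of reviews and returns the weight of G_p, and `toy_product_strength` is a hypothetical stand-in (the share of positive reviews), not a QBAF semantics, included only so the example runs.

```python
def review_impacts(reviews, product_strength):
    """Impact of each review: weight of G_p without it minus weight with all reviews."""
    baseline = product_strength(reviews)
    impacts = {}
    for i, review in enumerate(reviews):
        remaining = reviews[:i] + reviews[i + 1:]  # drop review r_i only
        impacts[review["id"]] = product_strength(remaining) - baseline
    return impacts


def toy_product_strength(reviews):
    """Hypothetical placeholder scorer: share of positive reviews, 0.5 if empty."""
    if not reviews:
        return 0.5
    positives = sum(1 for r in reviews if r["sentiment"] == "pos")
    return positives / len(reviews)


example_reviews = [
    {"id": "r1", "sentiment": "pos"},
    {"id": "r2", "sentiment": "pos"},
    {"id": "r3", "sentiment": "neg"},
]
impacts = review_impacts(example_reviews, toy_product_strength)
for review_id, impact in impacts.items():
    print(review_id, round(impact, 3))
```

Each review requires rebuilding and re-evaluating the graph once, so the cost grows with the number of reviews times the cost of one evaluation.
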

Tasks that should be planned prior to developing this as a new feature of the module:

[x] #76

jiqicn commented 11 months ago

In Link Mining:

Following this approach, each cluster i forms a tree structure with G_i as the root.

Maybe I didn't understand correctly: why does this form a tree structure rather than a linked list?

jiqicn commented 11 months ago

For (G_i, G), we'll learn the type of relation after evaluating the weight of G_i. If the weight of G_i is greater than or equal to 0.5, we'll draw a support relation from G_i to G; otherwise, it's an attack relation. The weight of arguments will be discussed next.

It seems that G_p should be computed before the direction of (G_i, G_p) is determined, right?