New submissions for Fri, 3 Feb 23

Keyword: text generation

There is no result

Keyword: machine translation

TransFool: An Adversarial Attack against Neural Machine Translation Models

Authors: Sahar Sadrizadeh, Ljiljana Dolamic, Pascal Frossard
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2302.00944
Pdf link: https://arxiv.org/pdf/2302.00944
Abstract: Deep neural networks have been shown to be vulnerable to small perturbations of their inputs, known as adversarial attacks. In this paper, we investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool. To fool NMT models, TransFool builds on a multi-term optimization problem and a gradient projection step. By integrating the embedding representation of a language model, we generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples. Experimental results demonstrate that, for different translation tasks and NMT architectures, our white-box attack can severely degrade the translation quality while the semantic similarity between the original and the adversarial sentences stays high. Moreover, we show that TransFool is transferable to unknown target models. Finally, based on automatic and human evaluations, TransFool leads to improvement in terms of success rate, semantic similarity, and fluency compared to the existing attacks both in white-box and black-box settings. Thus, TransFool permits us to better characterize the vulnerability of NMT models and outlines the necessity to design strong defense mechanisms and more robust NMT systems for real-life applications.

Keyword: non-autoregressive

There is no result

Keyword: abstractive summarization

Curriculum-guided Abstractive Summarization for Mental Health Online Posts
Authors: Sajad Sotudeh, Nazli Goharian, Hanieh Deilamsalehy, Franck Dernoncourt
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2302.00954
Pdf link: https://arxiv.org/pdf/2302.00954
Abstract: Automatically generating short summaries from users' online mental health posts could save counselors' reading time and reduce their fatigue so that they can provide timely responses to those seeking help for improving their mental state. Recent Transformers-based summarization models have presented a promising approach to abstractive summarization. They go beyond sentence selection and extractive strategies to deal with more complicated tasks such as novel word generation and sentence paraphrasing. Nonetheless, these models have a prominent shortcoming; their training strategy is not quite efficient, which restricts the model's performance. In this paper, we include a curriculum learning approach to reweigh the training samples, bringing about an efficient learning procedure. We apply our model on extreme summarization dataset of MentSum posts -- a dataset of mental health related posts from Reddit social media. Compared to the state-of-the-art model, our proposed method makes substantial gains in terms of Rouge and Bertscore evaluation metrics, yielding 3.5% (Rouge-1), 10.4% (Rouge-2), and 4.7% (Rouge-L), 1.5% (Bertscore) relative improvements.

Keyword: factual

There is no result

Keyword: knowledge distillation

There is no result

Keyword: Hallucination

There is no result

Keyword: evaluation

Using In-Context Learning to Improve Dialogue Safety
Authors: Nicholas Meade, Spandana Gella, Devamanyu Hazarika, Prakhar Gupta, Di Jin, Siva Reddy, Yang Liu, Dilek Hakkani-Tür
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2302.00871
Pdf link: https://arxiv.org/pdf/2302.00871
Abstract: While large neural-based conversational models have become increasingly proficient as dialogue agents, recent work has highlighted safety issues with these systems. For example, these systems can be goaded into generating toxic content, which often perpetuates social biases or stereotypes. We investigate a retrieval-based framework for reducing bias and toxicity in responses generated from neural-based chatbots. It uses in-context learning to steer a model towards safer generations. Concretely, to generate a response to an unsafe dialogue context, we retrieve demonstrations of safe model responses to similar dialogue contexts. We find our proposed approach performs competitively with strong baselines which use fine-tuning. For instance, using automatic evaluation, we find our best fine-tuned baseline only generates safe responses to unsafe dialogue contexts from DiaSafety 2.92% more than our approach. Finally, we also propose a straightforward re-ranking procedure which can further improve response safeness.

History-Aware Hierarchical Transformer for Multi-session Open-domain Dialogue System
Authors: Tong Zhang, Yong Liu, Boyang Li, Zhiwei Zeng, Pengwei Wang, Yuan You, Chunyan Miao, Lizhen Cui
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2302.00907
Pdf link: https://arxiv.org/pdf/2302.00907
Abstract: With the evolution of pre-trained language models, current open-domain dialogue systems have achieved great progress in conducting one-session conversations. In contrast, Multi-Session Conversation (MSC), which consists of multiple sessions over a long term with the same user, is under-investigated. In this paper, we propose History-Aware Hierarchical Transformer (HAHT) for multi-session open-domain dialogue. HAHT maintains a long-term memory of history conversations and utilizes history information to understand current conversation context and generate well-informed and context-relevant responses. Specifically, HAHT first encodes history conversation sessions hierarchically into a history memory. Then, HAHT leverages historical information to facilitate the understanding of the current conversation context by encoding the history memory together with the current context with attention-based mechanisms. Finally, to explicitly utilize historical information, HAHT uses a history-aware response generator that switches between a generic vocabulary and a history-aware vocabulary. Experimental results on a large-scale MSC dataset suggest that the proposed HAHT model consistently outperforms baseline models. Human evaluation results support that HAHT generates more human-like, context-relevant and history-relevant responses than baseline models.

TransFool: An Adversarial Attack against Neural Machine Translation Models
Authors: Sahar Sadrizadeh, Ljiljana Dolamic, Pascal Frossard
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2302.00944
Pdf link: https://arxiv.org/pdf/2302.00944
Abstract: Deep neural networks have been shown to be vulnerable to small perturbations of their inputs, known as adversarial attacks. In this paper, we investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool. To fool NMT models, TransFool builds on a multi-term optimization problem and a gradient projection step. By integrating the embedding representation of a language model, we generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples. Experimental results demonstrate that, for different translation tasks and NMT architectures, our white-box attack can severely degrade the translation quality while the semantic similarity between the original and the adversarial sentences stays high. Moreover, we show that TransFool is transferable to unknown target models. Finally, based on automatic and human evaluations, TransFool leads to improvement in terms of success rate, semantic similarity, and fluency compared to the existing attacks both in white-box and black-box settings. Thus, TransFool permits us to better characterize the vulnerability of NMT models and outlines the necessity to design strong defense mechanisms and more robust NMT systems for real-life applications.

Curriculum-guided Abstractive Summarization for Mental Health Online Posts
Authors: Sajad Sotudeh, Nazli Goharian, Hanieh Deilamsalehy, Franck Dernoncourt
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Arxiv link: https://arxiv.org/abs/2302.00954
Pdf link: https://arxiv.org/pdf/2302.00954
Abstract: Automatically generating short summaries from users' online mental health posts could save counselors' reading time and reduce their fatigue so that they can provide timely responses to those seeking help for improving their mental state. Recent Transformers-based summarization models have presented a promising approach to abstractive summarization. They go beyond sentence selection and extractive strategies to deal with more complicated tasks such as novel word generation and sentence paraphrasing. Nonetheless, these models have a prominent shortcoming; their training strategy is not quite efficient, which restricts the model's performance. In this paper, we include a curriculum learning approach to reweigh the training samples, bringing about an efficient learning procedure. We apply our model on extreme summarization dataset of MentSum posts -- a dataset of mental health related posts from Reddit social media. Compared to the state-of-the-art model, our proposed method makes substantial gains in terms of Rouge and Bertscore evaluation metrics, yielding 3.5% (Rouge-1), 10.4% (Rouge-2), and 4.7% (Rouge-L), 1.5% (Bertscore) relative improvements.

Combining Deep Neural Reranking and Unsupervised Extraction for Multi-Query Focused Summarization
Authors: Philipp Seeberger, Korbinian Riedhammer
Subjects: Computation and Language (cs.CL)
Arxiv link: https://arxiv.org/abs/2302.01148
Pdf link: https://arxiv.org/pdf/2302.01148
Abstract: The CrisisFACTS Track aims to tackle challenges such as multi-stream fact-finding in the domain of event tracking; participants' systems extract important facts from several disaster-related events while incorporating the temporal order. We propose a combination of retrieval, reranking, and the well-known Integer Linear Programming (ILP) and Maximal Marginal Relevance (MMR) frameworks. In the former two modules, we explore various methods including an entity-based baseline, pre-trained and fine-tuned Question Answering systems, and ColBERT. We then use the latter module as an extractive summarization component by taking diversity and novelty criteria into account. The automatic scoring runs show strong results across the evaluation setups but also reveal shortcomings and challenges.
