IAAR-Shanghai / DATG

[ACL 2024] Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs
https://iaar-shanghai.github.io/DATG/
Apache License 2.0

Extending to multiple attributes #1

Open AvantiB opened 3 weeks ago

AvantiB commented 3 weeks ago

Hi,

Great paper, and congratulations on getting accepted to ACL 2024! Can you describe whether, and how, this method can be extended to multiple attributes beyond binary (positive and negative) labels? Thanks.

MarrytheToilet commented 3 weeks ago

Thank you very much for your careful reading and thoughtful consideration of our work.

When using LLMs for CTG tasks, we often encounter problems involving multiple categories, such as control over n topics. There are two scenarios to consider.

(1) For multi-category problems, if we only want to control one attribute:

Take the control of n topics as an example. Suppose we have an n-topic classifier and wish to control one specific topic. We can still reduce this to two attributes: (1) the attribute we want and (2) everything else. We therefore only need to construct two graphs: a positive attribute graph representing alignment with the desired attribute (for example, a specific topic such as news), and a negative attribute graph representing deviation from it. Both graphs have identical nodes and connections; they differ only in their edge weights. The weights of the positive attribute graph are set by the scorer's scores (e.g., the classifier score for the news topic), while the weights of the negative attribute graph are set by their complement (e.g., 1 minus the news topic classifier score). In this way, our method extends easily to multi-class tasks, using multi-attribute classifiers to steer LLMs toward text with a specific attribute.
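To make the one-vs-rest reduction concrete, here is a minimal sketch, not the code from the DATG repository. It assumes whitespace tokenization, edges between adjacent words, and edge weights that accumulate the per-text score; the function name `build_attribute_graphs` and the toy probabilities are hypothetical, and a real n-topic classifier would supply `topic_probs`.

```python
from collections import defaultdict


def build_attribute_graphs(texts, topic_probs, target):
    """Build a positive and a negative graph for one target topic.

    texts       : list of generated sentences
    topic_probs : one dict per text, mapping topic name -> probability
                  (stand-in for the output of an n-topic classifier)
    target      : the topic we want to control
    Both graphs share the same edges; only the weights differ.
    """
    pos = defaultdict(float)
    neg = defaultdict(float)
    for text, probs in zip(texts, topic_probs):
        score = probs[target]            # positive weight: classifier score
        tokens = text.lower().split()    # assumption: naive tokenization
        for a, b in zip(tokens, tokens[1:]):
            pos[(a, b)] += score         # alignment with the target topic
            neg[(a, b)] += 1.0 - score   # complement: deviation from it
    return dict(pos), dict(neg)


texts = ["markets rallied today", "the team won today"]
topic_probs = [
    {"news": 0.9, "sports": 0.05, "tech": 0.05},
    {"news": 0.2, "sports": 0.7, "tech": 0.1},
]
pos_graph, neg_graph = build_attribute_graphs(texts, topic_probs, "news")
```

Whatever the number of topics n, only the target topic's probability is read, so the construction is the same as in the binary case.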

(2) For multi-category problems, if we want to control multiple attributes:

When we want to control multiple attributes, such as controlling both politics and finance with a topic classifier to produce a text about financial policy, we may need to create two sets of attribute graphs, one representing political text attributes and one representing financial text attributes. These graphs are easy to construct, for example by scoring with a multi-attribute classifier, by scoring with several different classifiers, or even by calling APIs. The DATG method can then be used as usual to control the keywords sampled from the multiple sets of graphs.

Regarding cost: according to the per-step inference times reported in our paper, controlling multiple attributes means the Dynamic Attribute Graphs Construction step is repeated once per controlled attribute, but the initial time for generating the contextually relevant corpus does not increase, so the overall time cost barely grows.
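The multi-attribute case can be sketched in the same way. This is an illustration under stated assumptions rather than the DATG implementation: the corpus is generated once and reused, one positive/negative graph pair is built per attribute, and keywords are ranked by summed edge weight as a simple stand-in for whatever node-ranking step the method actually uses. The helper names and the toy scores are hypothetical.

```python
from collections import defaultdict


def build_pair(texts, scores):
    """One (positive, negative) graph pair for a single attribute."""
    pos, neg = defaultdict(float), defaultdict(float)
    for text, s in zip(texts, scores):
        toks = text.lower().split()      # assumption: naive tokenization
        for a, b in zip(toks, toks[1:]):
            pos[(a, b)] += s
            neg[(a, b)] += 1.0 - s
    return pos, neg


def top_words(graph, k):
    """Rank words by summed edge weight (stand-in for a real ranking step)."""
    strength = defaultdict(float)
    for (a, b), w in graph.items():
        strength[a] += w
        strength[b] += w
    ranked = sorted(strength.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


# The corpus is generated once, regardless of how many attributes we control.
texts = ["tax policy debate", "bank rates rise", "football match tonight"]

# One score list per attribute, e.g. from a multi-attribute classifier.
scores_by_attr = {
    "politics": [0.9, 0.2, 0.05],
    "finance": [0.6, 0.95, 0.05],
}

# Graph construction is the only step repeated per attribute.
graphs = {attr: build_pair(texts, s) for attr, s in scores_by_attr.items()}
boost = {attr: top_words(pos, 2) for attr, (pos, _) in graphs.items()}
suppress = {attr: top_words(neg, 2) for attr, (_, neg) in graphs.items()}
```

The loop over `scores_by_attr` is where the extra cost appears; `texts` is shared by all attributes, which matches the point about the corpus generation time not increasing.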

MarrytheToilet commented 3 weeks ago

We are truly honored that our research has caught your attention, and we eagerly look forward to future communication and collaboration. If you're interested in controllable text generation, we would also like to recommend our latest survey paper, "Controllable Text Generation for Large Language Models: A Survey." This paper focuses on the controllable generation of text by large language models.

As the demand for more diverse outputs from LLMs grows, research in Controllable Text Generation (CTG) is rapidly advancing. CTG refers to the ability to generate content that adheres to specific control conditions, such as safety, style, and sentiment, while maintaining fluency and diversity. In our survey, we explore the mechanisms for injecting control attributes into LLMs, their relationship with the inherent capabilities of language models, the classification of CTG tasks (content vs. attribute), major methods (retraining, fine-tuning, RL, prompts, latent space, and decoding-time), as well as evaluation metrics and application scenarios. We also analyze the challenges currently faced in CTG research and the future directions of the field.

The full survey and related resources can be accessed through the following links: