[PDF] Timo: Towards Better Temporal Reasoning for Language Models
Z Su, J Zhang, T Zhu, X Qu, J Li, M Zhang, Y Cheng - arXiv preprint arXiv:2406.14192, 2024
Reasoning about time is essential for Large Language Models (LLMs) to understand
the world. Previous works focus on solving specific tasks, primarily time-sensitive
question answering. While these methods have proven effective, they cannot …
[PDF] LoPT: Low-Rank Prompt Tuning for Parameter Efficient Language Models
S Guo, S Damani, K Chang - arXiv preprint arXiv:2406.19486, 2024
In prompt tuning, a prefix or suffix text is added to the prompt, and the embeddings
(soft prompts) or token indices (hard prompts) of the prefix/suffix are optimized to gain
more control over language models for specific tasks. This approach eliminates the …
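As a quick illustration of the mechanism described above, here is a minimal sketch of low-rank soft prompt tuning in PyTorch. It assumes a model that accepts input embeddings directly; the class name and hyperparameters are illustrative, not the authors' implementation.

import torch
import torch.nn as nn

class LowRankSoftPrompt(nn.Module):
    def __init__(self, prompt_len: int, embed_dim: int, rank: int):
        super().__init__()
        # The (prompt_len x embed_dim) soft prompt matrix is factored as
        # A @ B, cutting trainable parameters from prompt_len*embed_dim
        # to rank*(prompt_len + embed_dim).
        self.A = nn.Parameter(torch.randn(prompt_len, rank) * 0.02)
        self.B = nn.Parameter(torch.randn(rank, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim)
        prompt = (self.A @ self.B).unsqueeze(0)               # (1, m, d)
        prompt = prompt.expand(input_embeds.size(0), -1, -1)  # (batch, m, d)
        return torch.cat([prompt, input_embeds], dim=1)       # prepend prompt

# Usage: freeze the base model and optimize only the prompt factors.
# soft_prompt = LowRankSoftPrompt(prompt_len=20, embed_dim=768, rank=4)
# optimizer = torch.optim.AdamW(soft_prompt.parameters(), lr=1e-3)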
[PDF] Abstraction-of-Thought Makes Language Models Better Reasoners
R Hong, H Zhang, X Pan, D Yu, C Zhang - arXiv preprint arXiv:2406.12442, 2024
Abstract reasoning, the ability to reason from the abstract essence of a problem,
serves as a key to generalization in human reasoning. However, eliciting reasoning
with abstraction from language models remains unexplored. This paper seeks …
[PDF] Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding
K Zhang, J Wang, N Ding, B Qi, E Hua, X Lv, B Zhou - arXiv preprint arXiv:2406.12295, 2024
Large Language Models (LLMs) demonstrate impressive performance in diverse
applications, yet they face significant drawbacks, including high inference latency,
expensive training costs, and hallucination. Collaborative decoding …
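The snippet is cut off, but a common form of large/small collaboration is speculative drafting; the sketch below illustrates that generic scheme under stated assumptions (HuggingFace-style models exposing .logits, batch size 1, greedy decoding) — the paper's exact protocol may differ.

import torch

@torch.no_grad()
def collaborative_step(large_model, small_model, input_ids, k=4):
    # 1) The small model drafts k tokens greedily (cheap, sequential).
    draft = input_ids
    for _ in range(k):
        logits = small_model(draft).logits[:, -1, :]
        draft = torch.cat([draft, logits.argmax(-1, keepdim=True)], dim=-1)

    # 2) The large model verifies the whole draft in a single forward pass:
    #    logits at position i predict token i+1, so positions n-1 .. n+k-2
    #    score the k drafted tokens.
    n = input_ids.size(1)
    preferred = large_model(draft).logits[:, n - 1:-1, :].argmax(-1)

    # 3) Keep the longest draft prefix the large model agrees with; on full
    #    rejection, emit the large model's own first choice instead.
    drafted = draft[:, n:]
    accepted = int((drafted == preferred).long().cumprod(dim=-1).sum())
    if accepted == 0:
        return torch.cat([input_ids, preferred[:, :1]], dim=-1)
    return torch.cat([input_ids, drafted[:, :accepted]], dim=-1)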
[PDF] Aqulia-Med LLM: Pioneering Full-Process Open-Source Medical Language Models
L Zhao, W Zeng, X Shi, H Zhou, D Hao, Y Lin - arXiv preprint arXiv:2406.12182, 2024
Recently, both closed-source LLMs and open-source communities have made
significant strides, outperforming humans in various general domains. However, their
performance in specific professional fields such as medicine, especially within the …
[PDF] Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?
J Lee, A Chen, Z Dai, D Dua, DS Sachan, M Boratko… - arXiv preprint arXiv …, 2024
Long-context language models (LCLMs) have the potential to revolutionize our
approach to tasks traditionally reliant on external tools like retrieval systems or
databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of …
[PDF] Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization
J Wu, Y Xie, Z Yang, J Wu, J Chen, J Gao, B Ding… - arXiv preprint arXiv …, 2024
This study addresses the challenge of noise in training datasets for Direct Preference
Optimization (DPO), a method for aligning Large Language Models (LLMs) with
human preferences. We categorize noise into pointwise noise, which includes low …
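For context, the vanilla DPO objective that such robustification builds on fits in a few lines; the sketch below shows only the standard loss (the paper's distributionally robust variant is not reproduced here). Inputs are response-level log-probabilities under the trained policy and a frozen reference model, for the preferred (w) and dispreferred (l) responses.

import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_w, policy_logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # Implicit reward margin: how much more the policy prefers y_w over y_l,
    # measured relative to the reference model.
    margin = (policy_logp_w - ref_logp_w) - (policy_logp_l - ref_logp_l)
    # Mislabeled (noisy) preference pairs push this margin the wrong way,
    # which is exactly the failure mode robust variants target.
    return -F.logsigmoid(beta * margin).mean()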
[PDF] Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
K Li, Y Wang, F Viégas, M Wattenberg - arXiv preprint arXiv:2406.11978, 2024
We present an approach called Dialogue Action Tokens (DAT) that adapts language
model agents to plan goal-directed dialogues. The core idea is to treat each
utterance as an action, thereby converting dialogues into games where existing …
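To make the utterance-as-action framing concrete, here is a minimal sketch in which each agent utterance is an action in a turn-based game; respond_fn and reward_fn are hypothetical stand-ins for the environment, and DAT's actual planner acts in a learned action-token space not shown here.

from dataclasses import dataclass, field

@dataclass
class DialogueState:
    history: list = field(default_factory=list)  # (speaker, utterance) turns

class DialogueGame:
    def __init__(self, respond_fn, reward_fn):
        self.respond_fn = respond_fn  # other party's reply (environment dynamics)
        self.reward_fn = reward_fn    # scalar score of progress toward the goal

    def step(self, state: DialogueState, utterance: str):
        # Treat the utterance as an action: apply it, observe the reply,
        # and receive a reward, exactly as in a turn-based game.
        state.history.append(("agent", utterance))
        state.history.append(("user", self.respond_fn(state.history)))
        return state, self.reward_fn(state.history)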
[PDF] Mental Modeling of Reinforcement Learning Agents by Language Models
W Lu, X Zhao, J Spisak, JH Lee, S Wermter - arXiv preprint arXiv:2406.18505, 2024
Can emergent language models faithfully model the intelligence of decision-making
agents? Though modern language models already exhibit some reasoning ability and
can, in theory, express any probability distribution over tokens, it …
[PDF] Information Guided Regularization for Fine-tuning Language Models
M Sharma, N Muralidhar, S Xu, RB Yosuf… - arXiv preprint arXiv …, 2024
The pretraining-fine-tuning paradigm has been the de facto strategy for transfer
learning in modern language modeling. With the understanding that task adaptation
in LMs is often a function of parameters shared across tasks, we argue that a more …
This message was sent by Google Scholar because you're following new articles related to research by David Sontag.