URL

https://arxiv.org/abs/2411.03350
Authors
- Fali Wang
- Zhiwei Zhang
- Xianren Zhang
- Zongyu Wu
- Tzuhao Mo
- Qiuhao Lu
- Wanjing Wang
- Rui Li
- Junjie Xu
- Xianfeng Tang
- Qi He
- Yao Ma
- Ming Huang
- Suhang Wang
Abstract
- Large language models (LLMs) have demonstrated emergent abilities in text generation, question answering, and reasoning, facilitating various tasks and domains. Despite their proficiency in various tasks, LLMs like PaLM 540B and Llama-3.1 405B face limitations due to large parameter sizes and computational demands, often requiring cloud API use, which raises privacy concerns, limits real-time applications on edge devices, and increases fine-tuning costs. Additionally, LLMs often underperform in specialized domains such as healthcare and law due to insufficient domain-specific knowledge, necessitating specialized models. Therefore, Small Language Models (SLMs) are increasingly favored for their low inference latency, cost-effectiveness, efficient development, and easy customization and adaptability. These models are particularly well-suited for resource-limited environments and domain knowledge acquisition, addressing LLMs' challenges and proving ideal for applications that require localized data handling for privacy, minimal inference latency for efficiency, and domain knowledge acquisition through lightweight fine-tuning. The rising demand for SLMs has spurred extensive research and development. However, a comprehensive survey investigating issues related to the definition, acquisition, application, enhancement, and reliability of SLMs remains lacking, prompting us to conduct a detailed survey on these topics. The definition of SLMs varies widely; to standardize it, we propose defining SLMs by their capability to perform specialized tasks and suitability for resource-constrained settings, setting boundaries based on the minimal size for emergent abilities and the maximum size sustainable under resource constraints. For other aspects, we provide a taxonomy of relevant models/methods and develop general frameworks for each category to enhance and utilize SLMs effectively.
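Translation (by gpt-4o-mini)
Large language models (LLMs) have shown emergent abilities in text generation, question answering, and reasoning, supporting a wide range of tasks and domains. Despite their ability across diverse tasks, LLMs such as PaLM 540B and Llama-3.1 405B face limitations due to their large parameter sizes and computational demands, and often require the use of cloud APIs. This raises privacy concerns, restricts real-time applications on edge devices, and increases fine-tuning costs. In addition, LLMs often perform poorly in specialized domains such as healthcare and law because they lack sufficient domain-specific knowledge, which creates a need for specialized models. Small language models (SLMs) are therefore increasingly preferred for their low inference latency, cost efficiency, efficient development, and easy customization and adaptability. These models are particularly well suited to resource-limited environments and to domain knowledge acquisition; they address the challenges of LLMs and are ideal for applications that require localized data handling for privacy, minimal inference latency for efficiency, and domain knowledge acquisition through lightweight fine-tuning. The growing demand for SLMs has driven extensive research and development. However, a comprehensive survey examining issues related to the definition, acquisition, application, enhancement, and reliability of SLMs is still lacking, so we conduct a detailed survey of these topics. Because definitions of SLMs vary widely, for the sake of standardization we propose defining SLMs by their capability to perform specialized tasks and their suitability for resource-constrained settings, setting boundaries based on the minimal size for emergent abilities and the maximum size sustainable under resource constraints. For the other aspects, we provide a taxonomy of relevant models and methods and develop general frameworks for each category to enhance and utilize SLMs effectively.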
Summary (by gpt-4o-mini)
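Large language models (LLMs) perform well across diverse tasks, but their parameter sizes and computational demands limit them and create challenges for privacy and real-time applications. In contrast, small language models (SLMs) offer low latency, cost efficiency, and easy customization, and are especially useful in specialized domains. Demand for SLMs is growing, yet comprehensive surveys of their definition and application are lacking, so the authors define SLMs as models suited to specialized tasks and propose frameworks for enhancing them.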

A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness, Fali Wang+, arXiv'24 #1490
