ur-whitelab / LLMs-in-science


New paper: Comprehensive Study on Performance Evaluation and Optimization of Model #16

Open maykcaldas opened 1 month ago

maykcaldas commented 1 month ago

Paper: Comprehensive Study on Performance Evaluation and Optimization of Model

Authors: Aayush Saxena, Arit Kumar Bishwas, Ayush Ashok Mishra, Ryan Armstrong

Abstract: Deep learning models have achieved tremendous success in most industries in recent years. The evolution of these models has also led to an increase in model size and energy requirements, making them difficult to deploy in production on low-compute devices. An increase in the number of connected devices around the world warrants compressed models that can be easily deployed on local devices with low compute capacity and power accessibility. A wide range of solutions has been proposed by different researchers to reduce the size and complexity of such models; prominent among them are Weight Quantization, Parameter Pruning, Network Pruning, low-rank representation, weight sharing, neural architecture search, knowledge distillation, etc. In this research work, we investigate the performance impact on various trained deep learning models compressed using quantization and pruning techniques. We implemented both quantization and pruning compression techniques on popular deep learning models used in image classification, object detection, language model, and generative model-based problem statements. We also explored the performance of various large language models (LLMs) after quantization and low-rank adaptation. We used standard evaluation metrics (model size, accuracy, and inference time) for all the related problem statements and conclude this paper by discussing the challenges and future work.

Link: https://arxiv.org/abs/2407.15904
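As a quick illustration of the two compression techniques the abstract centers on, here is a minimal PyTorch sketch of post-training dynamic quantization and magnitude pruning. The toy model, layer sizes, and 50% sparsity level are illustrative assumptions, not the paper's actual experimental setup.

```python
# Minimal sketch of the two compression techniques the paper studies,
# using standard PyTorch utilities. The model here is a hypothetical
# stand-in; the paper evaluates popular image, detection, language,
# and generative models instead.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# 1) Weight quantization: post-training dynamic quantization converts
#    fp32 Linear weights to int8, shrinking size and speeding inference.
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# 2) Parameter pruning: zero out the 50% smallest-magnitude weights
#    across all Linear layers (global, unstructured).
params_to_prune = [
    (module, "weight")
    for module in model.modules()
    if isinstance(module, nn.Linear)
]
prune.global_unstructured(
    params_to_prune, pruning_method=prune.L1Unstructured, amount=0.5
)

# Make the pruning permanent by baking the masks into the weights.
for module, name in params_to_prune:
    prune.remove(module, name)
```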

Reasoning: To produce the is_lm_paper label, we start by examining the title and abstract for any mention of language models or related terms.

  1. Title Analysis: The title "Comprehensive Study on Performance Evaluation and Optimization of Model" is quite broad and does not specifically mention language models. It suggests a focus on performance evaluation and optimization of various models.

  2. Abstract Analysis: The abstract discusses deep learning models in general and mentions various model-compression techniques such as quantization and pruning. It specifically lists different types of models, including image classification, object detection, language models, and generative models. Importantly, it mentions exploring the performance of large language models (LLMs) after quantization and low-rank adaptation (see the LoRA sketch after this reasoning).

Given that the abstract explicitly evaluates large language models after quantization and low-rank adaptation, the paper directly studies LLMs, so is_lm_paper is set to true.
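For the low-rank adaptation (LoRA) technique the reasoning highlights, here is a minimal from-scratch sketch of the core idea: freeze the pretrained weight and learn only a rank-r update. The LoRALinear class, the rank, and the scaling constants are hypothetical illustrations, not the paper's configuration.

```python
# Minimal from-scratch sketch of low-rank adaptation (LoRA).
# All names and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Freeze a pretrained linear layer and learn a low-rank update
    W + (alpha / r) * B @ A, where A and B have rank r << min(d_in, d_out)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep pretrained weights fixed
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the trainable low-rank correction.
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

# Usage: wrap an existing projection; only A and B receive gradients.
layer = LoRALinear(nn.Linear(512, 512), r=8)
```

Because B starts at zero, the wrapped layer initially reproduces the pretrained output exactly, and only the 2·r·d low-rank parameters are trained instead of the full d×d weight, which is what makes LoRA attractive for adapting large language models cheaply.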