h2oai / h2o-llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
https://h2o.ai
Apache License 2.0

[BUG] Downloading a fine-tuned model fails with error #639

Closed vgorovoy closed 5 months ago

vgorovoy commented 5 months ago

🐛 Bug

After I fine-tune a Mistral model in LLM Studio, I press the button to download the model locally. It fails, and when I press "report an error" it gives the following stack trace:

Report Issue: H2O LLM Studio at http://10.130.0.25:10101/

q.app script_sources: ['/_f/3956a614-98ec-4f1b-811c-1c34d467abfc/tmp3dysitur.min.js'] initialized: True version: 1.4.0-dev name: H2O LLM Studio heap_mode: False wave_utils_stack_trace_str: ### stacktrace Traceback (most recent call last):

File "/home/shperling/h2o-llmstudio/./llm_studio/app_utils/handlers.py", line 332, in handle await experiment_download_model(q)

File "/home/shperling/h2o-llmstudio/./llm_studio/app_utils/sections/experiment.py", line 1627, in experiment_download_model cfg, model, tokenizer = load_cfg_model_tokenizer(

File "/home/shperling/h2o-llmstudio/./llm_studio/app_utils/sections/chat.py", line 201, in load_cfg_model_tokenizer load_checkpoint(cfg, model, strict=False)

File "/home/shperling/h2o-llmstudio/./llm_studio/src/utils/modeling_utils.py", line 237, in load_checkpoint model = load_model_weights(model, model_weights, strict, cfg)

File "/home/shperling/h2o-llmstudio/./llm_studio/src/utils/modeling_utils.py", line 153, in load_model_weights model_weights = {

File "/home/shperling/h2o-llmstudio/./llm_studio/src/utils/modeling_utils.py", line 160, in else model_state_dict[k]

KeyError: 'backbone.base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight.quant_state.bitsandbytes__nf4'
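The KeyError suggests what is going on: the checkpoint was written by an int4 (bitsandbytes nf4) LoRA run and therefore contains quant_state entries, while the model instantiated for the download has no matching parameters, so the dict comprehension in load_model_weights fails when it indexes model_state_dict[k]. A minimal sketch of skipping such keys (hypothetical helper, not the project's actual code or the eventual fix):

```python
import torch


def drop_unexpected_quant_state(checkpoint_state: dict, model_state: dict) -> dict:
    """Drop bitsandbytes quant_state entries that the target model does not expect."""
    filtered = {}
    for key, value in checkpoint_state.items():
        if "quant_state" in key and key not in model_state:
            # Written by the 4-bit training run, e.g.
            # '...q_proj.base_layer.weight.quant_state.bitsandbytes__nf4';
            # the freshly built model has no such parameter, so skip it.
            continue
        filtered[key] = value
    return filtered


# Hypothetical usage:
# checkpoint = torch.load("checkpoint.pth", map_location="cpu")
# model.load_state_dict(
#     drop_unexpected_quant_state(checkpoint, model.state_dict()), strict=False
# )
```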

q.user q.client app_db: <llm_studio.app_utils.db.Database object at 0x7f60b7074a60> client_initialized: True mode_curr: error theme_dark: True credential_saver: .env File default_aws_bucket_name: bucket_name default_azure_conn_string: default_azure_container: default_kaggle_username: set_max_epochs: 50 set_max_batch_size: 256 set_max_gradient_clip: 10 set_max_lora_r: 256 set_max_lora_alpha: 256 gpu_used_for_chat: 1 default_number_of_workers: 8 default_logger: None default_neptune_project: default_openai_azure: False default_openai_api_base: https://example-endpoint.openai.azure.com default_openai_api_deployment_id: deployment-name default_openai_api_version: 2023-05-15 default_gpt_eval_max: 100 default_safe_serialization: True delete_dialogs: True chart_plot_max_points: 1000 init_interface: True notification_bar: None nav/active: home experiment/list/mode: train dataset/list/df_datasets: id name path config_file ... train rows validation dataframe validation rows labels 2 3 vivid_faqdataset /home/shperling/h2o-llmstudio/data/user/vivid... /home/shperling/h2o-llmstudio/data/user/vivid_... ... 583 None None answer 1 2 dpo /home/shperling/h2o-llmstudio/data/user/dpo /home/shperling/h2o-llmstudio/data/user/dpo/te... ... 12859 None chosen 0 1 oasst /home/shperling/h2o-llmstudio/data/user/oasst /home/shperling/h2o-llmstudio/data/user/oasst/... ... 13026 None output

[3 rows x 10 columns] experiment/list/df_experiments: id name mode dataset config_file path ... metric eta val metric progress status info 1 2 venomous-emu train vivid_faq_dataset Text Causal Language Modeling /home/shperling/h2o-llmstudio/output/user/veno... ... BLEU 3.3103 1.0 finished Runtime: 00:01:40 0 1 serious-capybara train vivid_faq_dataset Text Causal Language Modeling /home/shperling/h2o-llmstudio/output/user/seri... ... BLEU 4.2081 1.0 finished Runtime: 00:02:03

[2 rows x 16 columns] expander: True dataset/list: False dataset/list/table: [] experiment/list: False experiment/list/table: ['0'] wave_submission_name: report_error experiment/display/id: 0 experiment/display/logs_path: None experiment/display/preds_path: None experiment/display/tab: experiment/display/charts experiment/display/experiment_id: 2 experiment/display/experiment: <llm_studio.app_utils.db.Experiment object at 0x7f61483e7910> experiment/display/experiment_path: /home/shperling/h2o-llmstudio/output/user/venomous-emu/ experiment/display/charts: {'cfg': {'experiment_name': 'venomous-emu', 'llm_backbone': 'mistralai/Mistral-7B-v0.1', 'personalize': False, 'chatbot_name': 'h2oGPT', 'chatbot_author': 'H2O.ai', 'train_dataframe': '/home/shperling/h2o-llmstudio/data/user/vivid_faq_dataset/vivid_faq_dataset.csv', 'validation_strategy': 'automatic', 'validation_dataframe': 'None', 'validation_size': 0.01, 'data_sample': 1.0, 'data_sample_choice': ['Train', 'Validation'], 'system_column': 'None', 'prompt_column': ('question',), 'answer_column': 'answer', 'parent_id_column': 'None', 'text_system_start': '<|system|>', 'text_prompt_start': '<|prompt|>', 'text_answer_separator': '<|answer|>', 'limit_chained_samples': False, 'add_eos_token_to_system': True, 'add_eos_token_to_prompt': True, 'add_eos_token_to_answer': True, 'mask_prompt_labels': True, 'max_length_prompt': 256, 'max_length_answer': 256, 'max_length': 512, 'add_prompt_answer_tokens': False, 'padding_quantile': 1.0, 'use_fast': True, 'backbone_dtype': 'int4', 'gradient_checkpointing': True, 'force_embedding_gradients': False, 'intermediate_dropout': 0.0, 'pretrained_weights': '', 'loss_function': 'TokenAveragedCrossEntropy', 'optimizer': 'AdamW', 'learning_rate': 0.0001, 'differential_learning_rate_layers': [], 'differential_learning_rate': 1e-05, 'use_flash_attention_2': False, 'batch_size': 2, 'epochs': 1, 'schedule': 'Cosine', 'warmup_epochs': 0.0, 'weight_decay': 0.0, 'gradient_clip': 0.0, 'grad_accumulation': 1, 'lora': True, 'lora_r': 4, 'lora_alpha': 16, 'lora_dropout': 0.05, 'lora_target_modules': '', 'save_best_checkpoint': False, 'evaluation_epochs': 1.0, 'evaluate_before_training': False, 'train_validation_data': False, 'token_mask_probability': 0.0, 'skip_parent_probability': 0.0, 'random_parent_probability': 0.0, 'neftune_noise_alpha': 0.0, 'metric': 'BLEU', 'metric_gpt_model': 'gpt-3.5-turbo-0301', 'metric_gpt_template': 'general', 'min_length_inference': 2, 'max_length_inference': 256, 'batch_size_inference': 0, 'do_sample': False, 'num_beams': 1, 'temperature': 0.0, 'repetition_penalty': 1.0, 'stop_tokens': '', 'top_k': 0, 'top_p': 1.0, 'gpus': ['0'], 'mixed_precision': True, 'compile_model': False, 'use_deepspeed': False, 'deepspeed_reduce_bucket_size': 1000000, 'deepspeed_stage3_prefetch_bucket_size': 1000000, 'deepspeed_stage3_param_persistence_threshold': 1000000, 'find_unused_parameters': False, 'trust_remote_code': True, 'huggingface_branch': 'main', 'number_of_workers': 8, 'seed': -1, 'logger': 'None', 'neptune_project': ''}, 'train': {'loss': {'steps': [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 
208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504, 506, 508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530, 532, 534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554, 556, 558, 560, 562, 564, 566, 568, 570, 572, 574, 576], 'values': [2.394770383834839, 3.241992235183716, 2.0782761573791504, 2.2917635440826416, 2.3844709396362305, 2.207562208175659, 2.6992037296295166, 2.508577346801758, 1.8343913555145264, 1.964342713356018, 2.0580546855926514, 2.2156851291656494, 2.225208282470703, 2.7382521629333496, 1.8644499778747559, 1.7668579816818237, 2.3303263187408447, 1.650081992149353, 1.7673606872558594, 2.0443103313446045, 2.1215085983276367, 2.3665449619293213, 1.913568139076233, 2.1283137798309326, 2.082343339920044, 1.845554232597351, 2.1306371688842773, 2.0943191051483154, 1.6559094190597534, 1.874361515045166, 2.196915864944458, 1.6468579769134521, 1.911605715751648, 1.8386603593826294, 1.651654839515686, 1.7668428421020508, 2.1025173664093018, 2.5371224880218506, 2.542149066925049, 1.9123281240463257, 2.07132887840271, 2.104313850402832, 2.1970953941345215, 1.747537612915039, 2.1680679321289062, 1.6049259901046753, 1.8144755363464355, 1.607206106185913, 2.1051182746887207, 2.30176043510437, 2.055739641189575, 1.7960200309753418, 2.2341713905334473, 2.2772796154022217, 1.5942180156707764, 1.4161475896835327, 1.768833875656128, 1.698912501335144, 2.5106852054595947, 2.2622127532958984, 2.248725652694702, 1.3526500463485718, 2.0672600269317627, 1.2858622074127197, 2.209092855453491, 1.9085062742233276, 2.164461374282837, 1.6942768096923828, 1.8219777345657349, 1.6654281616210938, 1.3840452432632446, 2.021466016769409, 2.185014009475708, 2.7005491256713867, 2.054861068725586, 2.290797472000122, 1.819451928138733, 1.7095788717269897, 2.08370041847229, 1.8540825843811035, 1.5361939668655396, 1.9344009160995483, 2.011833429336548, 1.8421047925949097, 1.745201587677002, 1.738987684249878, 2.3416826725006104, 1.819770097732544, 1.9555068016052246, 1.6617323160171509, 1.7632275819778442, 1.6801882982254028, 1.6176601648330688, 1.8643062114715576, 2.391148567199707, 2.026151657104492, 1.3551452159881592, 1.9444674253463745, 2.030042886734009, 1.4928719997406006, 2.1964924335479736, 1.721260666847229, 1.6323721408843994, 1.8934358358383179, 1.6503790616989136, 1.810072422027588, 1.6806635856628418, 1.7436854839324951, 2.0639781951904297, 1.7644917964935303, 2.0846970081329346, 1.7320995330810547, 1.9344594478607178, 1.8328373432159424, 1.4859752655029297, 1.4207582473754883, 1.7028613090515137, 1.3096951246261597, 2.062412977218628, 1.6682759523391724, 1.9851412773132324, 1.8659499883651733, 2.023029088973999, 1.8812261819839478, 1.650010347366333, 2.24251127243042, 1.7022038698196411, 1.9663957357406616, 2.0233025550842285, 1.7759249210357666, 1.7913755178451538, 1.7031233310699463, 2.1263468265533447, 
1.8155795335769653, 2.009910821914673, 0.7485235929489136, 2.037891149520874, 1.948815941810608, 2.092813014984131, 1.4448872804641724, 1.4182137250900269, 1.6787669658660889, 1.8369914293289185, 1.7910382747650146, 1.5487098693847656, 1.6990699768066406, 1.8390051126480103, 1.6980698108673096, 1.8745864629745483, 1.4678798913955688, 2.0415091514587402, 1.9430547952651978, 1.7604386806488037, 1.7054890394210815, 1.9707624912261963, 1.6506967544555664, 1.7295200824737549, 2.119673013687134, 1.781699776649475, 1.872870922088623, 1.9287470579147339, 2.3849129676818848, 1.3907711505889893, 2.0317656993865967, 2.246182441711426, 1.6630584001541138, 1.7366833686828613, 1.893466591835022, 1.9994508028030396, 1.6928390264511108, 2.286558151245117, 1.8601902723312378, 2.074193239212036, 1.9644376039505005, 2.222168207168579, 2.01521635055542, 1.902417540550232, 1.9433950185775757, 1.3333890438079834, 1.467800259590149, 1.9438259601593018, 1.7504549026489258, 1.7167820930480957, 2.393507242202759, 1.716629981994629, 1.6445847749710083, 1.8922662734985352, 1.8366024494171143, 2.8078978061676025, 1.9533857107162476, 1.8897480964660645, 1.7562339305877686, 1.7307255268096924, 1.8811894655227661, 1.410526990890503, 1.544959545135498, 1.9512447118759155, 1.5336806774139404, 1.778146505355835, 2.5612223148345947, 1.602870225906372, 1.6649028062820435, 1.8742258548736572, 1.9582204818725586, 2.8062386512756348, 2.0969276428222656, 1.9959490299224854, 1.596706748008728, 1.8671170473098755, 1.644322395324707, 1.4971541166305542, 1.6446446180343628, 2.2338953018188477, 1.6610277891159058, 1.8155239820480347, 1.8613330125808716, 2.124269962310791, 1.6685959100723267, 2.027803659439087, 1.8879027366638184, 1.6656299829483032, 2.3594648838043213, 2.131667375564575, 1.930503249168396, 1.9071769714355469, 1.3411979675292969, 2.1510672569274902, 1.8360323905944824, 1.7811588048934937, 1.8861175775527954, 2.0446455478668213, 2.0680153369903564, 2.4507741928100586, 2.037418842315674, 1.9370249509811401, 1.6014819145202637, 2.169235944747925, 1.8962119817733765, 2.22446870803833, 2.056671380996704, 2.8096721172332764, 2.1650896072387695, 1.6421505212783813, 2.0844311714172363, 2.035255193710327, 1.9424325227737427, 2.0809226036071777, 2.058319330215454, 2.3667778968811035, 1.633162260055542, 1.7297037839889526, 1.924973487854004, 2.1184751987457275, 1.4286530017852783, 2.0660717487335205, 1.6737422943115234, 1.791200876235962, 2.1061809062957764, 1.694649338722229, 1.7883639335632324, 1.2417211532592773, 1.9337865114212036, 1.799373745918274, 1.716340184211731, 1.647995948791504, 2.095583438873291, 1.6550767421722412, 1.3155806064605713, 1.4891682863235474, 1.6479413509368896, 2.0285043716430664, 1.5157254934310913, 3.4038400650024414, 1.7236661911010742, 2.0299923419952393, 1.0690066814422607, 2.3372561931610107, 1.786926031112671, 3.301856756210327, 1.7599784135818481, 2.0298383235931396, 2.071105718612671, 1.697176456451416, 1.6525410413742065, 1.7468382120132446, 1.5919550657272339, 2.533473253250122, 1.7144138813018799]}}, 'meta': {'lr': {'steps': [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 
212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504, 506, 508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530, 532, 534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554, 556, 558, 560, 562, 564, 566, 568, 570, 572, 574, 576], 'values': [9.999702525000749e-05, 9.998810135399546e-05, 9.997322937381829e-05, 9.99524110790929e-05, 9.992564894698816e-05, 9.989294616193017e-05, 9.985430661522333e-05, 9.980973490458728e-05, 9.975923633360985e-05, 9.970281691111598e-05, 9.964048335045275e-05, 9.957224306869053e-05, 9.949810418574039e-05, 9.941807552338804e-05, 9.933216660424395e-05, 9.924038765061042e-05, 9.914274958326505e-05, 9.903926402016153e-05, 9.892994327504693e-05, 9.881480035599667e-05, 9.869384896386668e-05, 9.856710349066307e-05, 9.843457901782967e-05, 9.829629131445342e-05, 9.815225683538814e-05, 9.800249271929645e-05, 9.784701678661045e-05, 9.768584753741134e-05, 9.751900414922805e-05, 9.73465064747553e-05, 9.716837503949127e-05, 9.698463103929542e-05, 9.67952963378663e-05, 9.660039346413994e-05, 9.639994560960923e-05, 9.619397662556435e-05, 9.598251102025461e-05, 9.576557395597236e-05, 9.554319124605879e-05, 9.53153893518325e-05, 9.508219537944081e-05, 9.484363707663442e-05, 9.459974282946571e-05, 9.435054165891109e-05, 9.409606321741775e-05, 9.38363377853754e-05, 9.357139626751308e-05, 9.330127018922194e-05, 9.302599169280395e-05, 9.274559353364734e-05, 9.246010907632895e-05, 9.21695722906443e-05, 9.18740177475654e-05, 9.157348061512727e-05, 9.126799665424319e-05, 9.09576022144496e-05, 9.064233422958077e-05, 9.032223021337414e-05, 8.999732825500648e-05, 8.966766701456177e-05, 8.933328571843084e-05, 8.899422415464409e-05, 8.865052266813685e-05, 8.83022221559489e-05, 8.79493640623581e-05, 8.759199037394887e-05, 8.723014361461632e-05, 8.68638668405062e-05, 8.649320363489179e-05, 8.611819810298778e-05, 8.573889486670233e-05, 8.535533905932738e-05, 8.496757632016836e-05, 8.457565278911348e-05, 8.417961510114356e-05, 8.377951038078302e-05, 8.337538623649237e-05, 8.296729075500344e-05, 8.255527249559746e-05, 8.213938048432697e-05, 8.171966420818228e-05, 8.129617360920296e-05, 8.086895907853526e-05, 8.043807145043604e-05, 8.000356199622405e-05, 7.956548241817912e-05, 7.912388484339012e-05, 7.86788218175523e-05, 7.823034629871503e-05, 7.777851165098012e-05, 7.732337163815217e-05, 7.68649804173412e-05, 7.64033925325184e-05, 7.593866290802608e-05, 7.54708468420421e-05, 7.500000000000001e-05, 7.45261784079654e-05, 7.404943844596939e-05, 7.35698368412999e-05, 7.308743066175172e-05, 7.2602277308836e-05, 7.211443451095007e-05, 7.162396031650831e-05, 7.113091308703498e-05, 7.063535149021973e-05, 7.013733449293687e-05, 6.96369213542287e-05, 6.91341716182545e-05, 6.862914510720515e-05, 6.812190191418508e-05, 6.761250239606169e-05, 6.710100716628344e-05, 6.658747708766762e-05, 6.607197326515808e-05, 6.555455703855454e-05, 
6.503528997521366e-05, 6.451423386272312e-05, 6.399145070154961e-05, 6.346700269766132e-05, 6.294095225512603e-05, 6.241336196868582e-05, 6.188429461630866e-05, 6.135381315171867e-05, 6.0821980696905146e-05, 6.0288860534611745e-05, 5.9754516100806423e-05, 5.9219010977133173e-05, 5.868240888334653e-05, 5.814477366972945e-05, 5.7606169309495836e-05, 5.706665989117839e-05, 5.6526309611002594e-05, 5.5985182765248126e-05, 5.544334374259823e-05, 5.490085701647805e-05, 5.435778713738292e-05, 5.381419872519763e-05, 5.327015646150716e-05, 5.2725725081900325e-05, 5.218096936826681e-05, 5.1635954141088813e-05, 5.1090744251728064e-05, 5.054540457470912e-05, 5e-05, 4.945459542529089e-05, 4.890925574827195e-05, 4.83640458589112e-05, 4.781903063173321e-05, 4.727427491809968e-05, 4.6729843538492847e-05, 4.618580127480238e-05, 4.564221286261709e-05, 4.509914298352197e-05, 4.4556656257401786e-05, 4.4014817234751885e-05, 4.347369038899744e-05, 4.2933340108821644e-05, 4.239383069050417e-05, 4.185522633027057e-05, 4.131759111665349e-05, 4.078098902286683e-05, 4.0245483899193595e-05, 3.971113946538826e-05, 3.917801930309486e-05, 3.864618684828134e-05, 3.8115705383691355e-05, 3.758663803131418e-05, 3.705904774487396e-05, 3.65329973023387e-05, 3.60085492984504e-05, 3.5485766137276894e-05, 3.4964710024786354e-05, 3.4445442961445464e-05, 3.392802673484193e-05, 3.341252291233241e-05, 3.289899283371657e-05, 3.2387497603938326e-05, 3.1878098085814924e-05, 3.137085489279485e-05, 3.086582838174551e-05, 3.0363078645771303e-05, 2.9862665507063147e-05, 2.936464850978027e-05, 2.886908691296504e-05, 2.8376039683491686e-05, 2.7885565489049946e-05, 2.7397722691164018e-05, 2.6912569338248315e-05, 2.6430163158700115e-05, 2.595056155403063e-05, 2.54738215920346e-05, 2.500000000000001e-05, 2.4529153157957913e-05, 2.4061337091973918e-05, 2.3596607467481603e-05, 2.3135019582658802e-05, 2.2676628361847836e-05, 2.2221488349019903e-05, 2.176965370128498e-05, 2.132117818244771e-05, 2.08761151566099e-05, 2.0434517581820896e-05, 1.999643800377596e-05, 1.9561928549563968e-05, 1.913104092146476e-05, 1.8703826390797048e-05, 1.8280335791817733e-05, 1.7860619515673033e-05, 1.7444727504402553e-05, 1.703270924499656e-05, 1.662461376350764e-05, 1.622048961921699e-05, 1.5820384898856434e-05, 1.5424347210886538e-05, 1.5032423679831642e-05, 1.4644660940672627e-05, 1.4261105133297692e-05, 1.3881801897012225e-05, 1.3506796365108232e-05, 1.3136133159493802e-05, 1.2769856385383688e-05, 1.2408009626051137e-05, 1.2050635937641908e-05, 1.1697777844051105e-05, 1.134947733186315e-05, 1.100577584535592e-05, 1.0666714281569151e-05, 1.0332332985438248e-05, 1.000267174499352e-05, 9.677769786625867e-06, 9.357665770419244e-06, 9.042397785550405e-06, 8.732003345756811e-06, 8.426519384872733e-06, 8.125982252434611e-06, 7.830427709355725e-06, 7.539890923671062e-06, 7.2544064663526815e-06, 6.974008307196056e-06, 6.698729810778065e-06, 6.428603732486937e-06, 6.163662214624616e-06, 5.903936782582253e-06, 5.649458341088915e-06, 5.400257170534295e-06, 5.156362923365588e-06, 4.917804620559202e-06, 4.684610648167503e-06, 4.456808753941205e-06, 4.234426044027645e-06, 4.017488979745387e-06, 3.8060233744356633e-06, 3.600054390390778e-06, 3.3996065358600782e-06, 3.2047036621337236e-06, 3.0153689607045845e-06, 2.8316249605087386e-06, 2.653493525244721e-06, 2.4809958507719444e-06, 2.314152462588659e-06, 2.152983213389559e-06, 1.99750728070357e-06, 1.8477431646118648e-06, 1.70370868554659e-06, 1.565420982170346e-06, 1.4328965093369283e-06, 1.3061510361333185e-06, 
1.1851996440033319e-06, 1.0700567249530834e-06, 9.607359798384785e-07, 8.572504167349449e-07, 7.596123493895991e-07, 6.678333957560512e-07, 5.81924476611967e-07, 5.018958142596065e-07, 4.277569313094809e-07, 3.59516649547248e-07, 2.971830888840177e-07, 2.407636663901591e-07, 1.9026509541272275e-07, 1.4569338477666838e-07, 1.0705383806982606e-07, 7.43510530118452e-08, 4.7588920907110094e-08, 2.6770626181715773e-08, 1.189864600454338e-08, 2.974749992512571e-09, 0.0]}}, 'validation': {'BLEU': {'steps': [576], 'values': [3.3102839611651578]}}, 'df': {'train_data': '/home/shperling/h2o-llmstudio/output/user/venomous-emu/batch_viz.parquet', 'validation_predictions': '/home/shperling/h2o-llmstudio/output/user/venomous-emu/validation_viz.parquet'}, 'internal': {'total_training_steps': {'steps': [0], 'values': [576.0]}, 'total_validation_steps': {'steps': [0], 'values': [6.0]}, 'global_start_time': {'steps': [0], 'values': [1710242720.4434388]}, 'current_step': {'steps': [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 422, 424, 426, 428, 430, 432, 434, 436, 438, 440, 442, 444, 446, 448, 450, 452, 454, 456, 458, 460, 462, 464, 466, 468, 470, 472, 474, 476, 478, 480, 482, 484, 486, 488, 490, 492, 494, 496, 498, 500, 502, 504, 506, 508, 510, 512, 514, 516, 518, 520, 522, 524, 526, 528, 530, 532, 534, 536, 538, 540, 542, 544, 546, 548, 550, 552, 554, 556, 558, 560, 562, 564, 566, 568, 570, 572, 574, 576], 'values': [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 32.0, 34.0, 36.0, 38.0, 40.0, 42.0, 44.0, 46.0, 48.0, 50.0, 52.0, 54.0, 56.0, 58.0, 60.0, 62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 74.0, 76.0, 78.0, 80.0, 82.0, 84.0, 86.0, 88.0, 90.0, 92.0, 94.0, 96.0, 98.0, 100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0, 114.0, 116.0, 118.0, 120.0, 122.0, 124.0, 126.0, 128.0, 130.0, 132.0, 134.0, 136.0, 138.0, 140.0, 142.0, 144.0, 146.0, 148.0, 150.0, 152.0, 154.0, 156.0, 158.0, 160.0, 162.0, 164.0, 166.0, 168.0, 170.0, 172.0, 174.0, 176.0, 178.0, 180.0, 182.0, 184.0, 186.0, 188.0, 190.0, 192.0, 194.0, 196.0, 198.0, 200.0, 202.0, 204.0, 206.0, 208.0, 210.0, 212.0, 214.0, 216.0, 218.0, 220.0, 222.0, 224.0, 226.0, 228.0, 230.0, 232.0, 234.0, 236.0, 238.0, 240.0, 242.0, 244.0, 246.0, 248.0, 250.0, 252.0, 254.0, 256.0, 258.0, 260.0, 262.0, 264.0, 266.0, 268.0, 270.0, 272.0, 274.0, 276.0, 278.0, 280.0, 282.0, 284.0, 286.0, 288.0, 290.0, 292.0, 294.0, 296.0, 298.0, 300.0, 302.0, 304.0, 306.0, 308.0, 310.0, 312.0, 314.0, 316.0, 318.0, 320.0, 322.0, 324.0, 326.0, 328.0, 330.0, 332.0, 334.0, 336.0, 338.0, 340.0, 342.0, 344.0, 346.0, 348.0, 350.0, 
352.0, 354.0, 356.0, 358.0, 360.0, 362.0, 364.0, 366.0, 368.0, 370.0, 372.0, 374.0, 376.0, 378.0, 380.0, 382.0, 384.0, 386.0, 388.0, 390.0, 392.0, 394.0, 396.0, 398.0, 400.0, 402.0, 404.0, 406.0, 408.0, 410.0, 412.0, 414.0, 416.0, 418.0, 420.0, 422.0, 424.0, 426.0, 428.0, 430.0, 432.0, 434.0, 436.0, 438.0, 440.0, 442.0, 444.0, 446.0, 448.0, 450.0, 452.0, 454.0, 456.0, 458.0, 460.0, 462.0, 464.0, 466.0, 468.0, 470.0, 472.0, 474.0, 476.0, 478.0, 480.0, 482.0, 484.0, 486.0, 488.0, 490.0, 492.0, 494.0, 496.0, 498.0, 500.0, 502.0, 504.0, 506.0, 508.0, 510.0, 512.0, 514.0, 516.0, 518.0, 520.0, 522.0, 524.0, 526.0, 528.0, 530.0, 532.0, 534.0, 536.0, 538.0, 540.0, 542.0, 544.0, 546.0, 548.0, 550.0, 552.0, 554.0, 556.0, 558.0, 560.0, 562.0, 564.0, 566.0, 568.0, 570.0, 572.0, 574.0, 576.0]}, 'current_val_step': {'steps': [2, 4, 6], 'values': [2.0, 4.0, 6.0]}, 'epoch': {'steps': [576], 'values': [1.0]}}} experiment/display/refresh: False experiment/display/download_logs: False experiment/display/download_predictions: False experiment/display/download_model: True experiment/display/push_to_huggingface: False experiment/list/current: False experiment/display/validation_prediction_insights: True experiment/display/charts/df_validation_predictions: [] home: False report_error: True q.events q.args report_error: True wave_submission_name: report_error stacktrace Traceback (most recent call last):

  File "/home/shperling/h2o-llmstudio/./llm_studio/app_utils/handlers.py", line 332, in handle
    await experiment_download_model(q)
  File "/home/shperling/h2o-llmstudio/./llm_studio/app_utils/sections/experiment.py", line 1627, in experiment_download_model
    cfg, model, tokenizer = load_cfg_model_tokenizer(
  File "/home/shperling/h2o-llmstudio/./llm_studio/app_utils/sections/chat.py", line 201, in load_cfg_model_tokenizer
    load_checkpoint(cfg, model, strict=False)
  File "/home/shperling/h2o-llmstudio/./llm_studio/src/utils/modeling_utils.py", line 237, in load_checkpoint
    model = load_model_weights(model, model_weights, strict, cfg)
  File "/home/shperling/h2o-llmstudio/./llm_studio/src/utils/modeling_utils.py", line 153, in load_model_weights
    model_weights = {
  File "/home/shperling/h2o-llmstudio/./llm_studio/src/utils/modeling_utils.py", line 160, in <dictcomp>
    else model_state_dict[k]
KeyError: 'backbone.base_model.model.model.layers.0.self_attn.q_proj.base_layer.weight.quant_state.bitsandbytes__nf4'

Error None

Git Version d47ee8fc0806202f08f97f943470a0e2c56e0c13

To Reproduce

LLM Studio version

psinger commented 5 months ago

Hi @vgorovoy - I just opened a PR to fix a similar issue; it might also solve yours: https://github.com/h2oai/h2o-llmstudio/pull/638

Could you please try that?

vgorovoy commented 5 months ago

Hi @psinger! I've tried it and it worked. Thanks a lot!