-
1. The finetune/finetune_deepseekcoder.py script defaults to the deepseek-coder-6.7b-instruct model. Assuming the hardware can support it, does the script also support training deepseek-coder-33b-instruct? (A loading sketch follows after these questions.)
2. A related question: when training code-related downstream tasks, how should one choose between the coder-base and coder-instruct models? Is coder-instru…
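For the first question, here is a minimal loading sketch in plain transformers, assuming the 33B checkpoint is the public deepseek-ai/deepseek-coder-33b-instruct repo and that the training script accepts any model the standard loaders accept (both are assumptions, not confirmed by the script itself):

```python
# Sketch: loading deepseek-coder-33b-instruct in place of the 6.7B default.
# Requires substantial GPU memory; device_map="auto" (via accelerate) shards
# the weights across all visible GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-33b-instruct"  # swap for the 6.7B default
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 halves memory relative to fp32
    device_map="auto",
    trust_remote_code=True,
)
```

If loading succeeds, fine-tuning the 33B model is mostly a question of fitting optimizer state in memory (e.g. ZeRO/FSDP sharding), rather than of model support.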
-
## Issue Title: Used the finetune script but encountered an error
### Environment
- Platform: Ubuntu Linux
- GPU: A5000 x 8
- Torch Version: 2.1.2
- Transformers Version: 4.41.0.dev0
### Issue Description
…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is this question answered in the FAQ? | Is there an existing…
-
Training command:
```
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch \
--config_file examples/accelerate/fsdp_config.yaml \
src/train_bash.py \
--stage sft \
--do_train \
--model_name_…
```
-
### What is the issue?
Attached log: [llama3.2-cuda-oom.log](https://github.com/user-attachments/files/17582524/llama3.2-cuda-oom.log)
I'm testing the `x/llama3.2-vision:11b-instruct-q4_K_M` and…
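While isolating the OOM, one hedged sketch is to hit the same model through Ollama's REST generate endpoint with a reduced num_ctx; the smaller context window is a hypothetical way to lower memory pressure, not a confirmed fix:

```python
# Query a local Ollama server, capping the context window (num_ctx) to
# reduce VRAM use. Image inputs are omitted here for brevity.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "x/llama3.2-vision:11b-instruct-q4_K_M",
        "prompt": "Say hello.",
        "stream": False,
        "options": {"num_ctx": 2048},  # smaller context window
    },
    timeout=300,
)
print(resp.json().get("response"))
```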
-
### What is the issue?
I have tools that automatically update my containers, and I use the `latest` tag with ollama. After the latest update to the image I can't run any models. I can pull models, for example llama3.1, but w…
-
### Model introduction
The model was created by Tongyi Qianwen. It is an instruct finetune of their CodeQwen1.5-7B, which is a coding finetune of their Qwen1.5-7B.
### Model URL
https://huggingface…
-
Hello,
I'm curious about the reason for using base models instead of instruction-tuned ones, especially given that the proprietary ones have already been through this finetuning.
Any reason why?
-
device: Intel Arc A770 & MTL iGfx
bigdl-core: 2.5.0b20240811
transformers: 4.37.0
model: finetuned Qwen2-1.5B
ipex-llm generates the same error with both CPU and GPU inference. The model runs OK with NV g…
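For context, a minimal inference sketch with ipex-llm on an Intel GPU; the model path is a placeholder and the low-bit loading options are assumptions about the setup, not details from the report:

```python
# Hypothetical minimal repro: low-bit inference with ipex-llm on an Intel
# GPU (XPU). Swap "xpu" for "cpu" to test the CPU path.
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "path/to/finetuned-qwen2-1.5b"  # placeholder checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,       # ipex-llm low-bit weight loading
    trust_remote_code=True,
).to("xpu")

inputs = tokenizer("def quicksort(arr):", return_tensors="pt").to("xpu")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```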
-
## User story
1. As a Software Developer
2. I need to select a base Large Language Model, based on the project's requirements, via an LLM selection process
3. So that the LLM we build performs as well as possible.
…