-
1. The finetune/finetune_deepseekcoder.py script defaults to the deepseek-coder-6.7b-instruct model. Assuming the hardware can support it, does the script also support training deepseek-coder-33b-instruct? (A loading sketch follows after these questions.)
2. A related question: when training code-related downstream tasks, how should one choose between the coder-base and coder-instruct models? Is coder-instru…
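For the first question, here is a minimal loading sketch in plain transformers, assuming the 33B checkpoint is the public deepseek-ai/deepseek-coder-33b-instruct repo and that the training script accepts any model the standard loaders accept (both are assumptions, not confirmed by the script itself):

```python
# Sketch: loading deepseek-coder-33b-instruct in place of the 6.7B default.
# Requires substantial GPU memory; device_map="auto" (via accelerate) shards
# the weights across all visible GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-33b-instruct"  # swap for the 6.7B default
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 halves memory relative to fp32
    device_map="auto",
    trust_remote_code=True,
)
```

If loading succeeds, fine-tuning the 33B model is mostly a question of fitting optimizer state in memory (e.g. ZeRO/FSDP sharding), rather than of model support.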
-
## Issue Title: Used the finetune script but encountered an error
### Environment
- Platform: Ubuntu Linux
- GPU: A5000 x 8
- Torch Version: 2.1.2
- Transformers Version: 4.41.0.dev0
### Issue Description
…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is this question answered in the FAQ? | Is there an existing…
-
Training command:
```
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch \
--config_file examples/accelerate/fsdp_config.yaml \
src/train_bash.py \
--stage sft \
--do_train \
--model_name_…
```
-
### What is the issue?
Attached log: [llama3.2-cuda-oom.log](https://github.com/user-attachments/files/17582524/llama3.2-cuda-oom.log)
I'm testing the `x/llama3.2-vision:11b-instruct-q4_K_M` and…
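While isolating the OOM, one hedged sketch is to hit the same model through Ollama's REST generate endpoint with a reduced num_ctx; the smaller context window is a hypothetical way to lower memory pressure, not a confirmed fix:

```python
# Query a local Ollama server, capping the context window (num_ctx) to
# reduce VRAM use. Image inputs are omitted here for brevity.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "x/llama3.2-vision:11b-instruct-q4_K_M",
        "prompt": "Say hello.",
        "stream": False,
        "options": {"num_ctx": 2048},  # smaller context window
    },
    timeout=300,
)
print(resp.json().get("response"))
```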
-
### What is the issue?
I have tools that automatically update my containers, and I use the `latest` tag with ollama. After the latest update to the image I can't run any models. I can pull models, for example llama3.1, but w…
-
### Model introduction
The model was created by Tongyi Qianwen. It is an instruct finetune of their CodeQwen1.5-7B, which is a coding finetune of their Qwen1.5-7B.
### Model URL
https://huggingface…
-
Hello,
I'm curious about the reason for using base models instead of instruction-tuned ones, especially given that the proprietary ones have already been through this finetuning.
Any reason why?
-
device: Intel Arc A770 & MTL iGfx
bigdl-core: 2.5.0b20240811
transformers: 4.37.0
model: finetuned Qwen2-1.5B
ipex-llm generates the same error with both CPU and GPU inference. The model runs OK with NV g…
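For context, a minimal inference sketch with ipex-llm on an Intel GPU; the model path is a placeholder and the low-bit loading options are assumptions about the setup, not details from the report:

```python
# Hypothetical minimal repro: low-bit inference with ipex-llm on an Intel
# GPU (XPU). Swap "xpu" for "cpu" to test the CPU path.
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_path = "path/to/finetuned-qwen2-1.5b"  # placeholder checkpoint path
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    load_in_4bit=True,       # ipex-llm low-bit weight loading
    trust_remote_code=True,
).to("xpu")

inputs = tokenizer("def quicksort(arr):", return_tensors="pt").to("xpu")
output = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```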
-
## User story
1. As a Software Developer
2. I need to select a base Large Language Model, based on the project's requirements, via an LLM selection process
3. So that the LLM we build performs as well as possible.
…