-
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the commu…
-
# URL
- https://arxiv.org/abs/2309.15427
# Affiliations
- Yijun Tian, N/A
- Huan Song, N/A
- Zichen Wang, N/A
- Haozhu Wang, N/A
- Ziqing Hu, N/A
- Fang Wang, N/A
- Nitesh V. Chawla, N/A…
-
### 🐛 Describe the bug
First, SIMPLE_MODEL is not properly imported in the given starter code.
Second, I'm having an issue running the addLoader function in the paid model section. The error message is shown…
-
Over the last few months I have had a problem that appears from time to time, more or less often, but always once the context history reaches its limit. I always thought that it would be an LLM issue until I…
-
**Scenario:**
- completed the fine-tune of 'Weyaxi/Dolphin2.1-OpenOrca-7B' using ipex-llm on a GPU Max 1100
- the output directory looks like the one below, with checkpoints and a config file.
-
![image](https…
-
![image](https://github.com/InternLM/tutorial/assets/137043350/8989a0f0-4d30-4a63-a238-4568c75bdee0)
```python
max_length = 2048
pack_to_max_length = True

# Scheduler & Optimizer
batch_size = 1  # per_dev…
```
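For context, `pack_to_max_length = True` asks the trainer to concatenate tokenized samples into fixed-length sequences of `max_length` tokens. A minimal sketch of that greedy packing idea (this is not XTuner's actual implementation; the `pack` helper and the sample lists are made up for illustration):

```python
def pack(samples, max_length):
    """Greedily concatenate tokenized samples into sequences
    of at most max_length tokens each."""
    packed, cur = [], []
    for s in samples:
        # Start a new packed sequence when the next sample would overflow.
        # (Sketch assumes no single sample exceeds max_length.)
        if cur and len(cur) + len(s) > max_length:
            packed.append(cur)
            cur = []
        cur.extend(s)
    if cur:
        packed.append(cur)
    return packed

# Three tokenized samples packed into max_length = 4 sequences:
pack([[1, 2], [3, 4, 5], [6]], 4)  # → [[1, 2], [3, 4, 5, 6]]
```

Packing this way reduces wasted padding tokens, which is why it pairs naturally with a small per-device `batch_size`.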
-
run step3 with:

```shell
deepspeed --master_port 12346 DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py \
    --data_path wangrui6/Zhihu-KOL \
    --data_split 2,4,4 \
    …
```
-
Hi guys, thanks for open-sourcing this great work!
It seems Llama 3 uses "right" padding and "eos_token" as the "padding_token". Could you help verify that if I want to train this model, wh…
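To make the padding question concrete, here is a tokenizer-independent sketch of padding with the eos id reused as the pad id (the `pad_batch` helper and `EOS_ID = 2` are made-up illustrations, not the model's actual API):

```python
def pad_batch(seqs, max_len, pad_id, side="right"):
    """Pad each token-id sequence to max_len on the given side."""
    out = []
    for s in seqs:
        pad = [pad_id] * (max_len - len(s))
        out.append(s + pad if side == "right" else pad + s)
    return out

EOS_ID = 2  # hypothetical eos token id standing in for the pad token
pad_batch([[5, 6], [7]], 4, EOS_ID)          # → [[5, 6, 2, 2], [7, 2, 2, 2]]
pad_batch([[5, 6], [7]], 4, EOS_ID, "left")  # → [[2, 2, 5, 6], [2, 2, 2, 7]]
```

As a general rule, right padding is the common choice during causal-LM training (pad positions are masked out of the loss), while left padding is usually preferred for batched generation so the last token of every sequence is a real token.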
-
ROCm/triton, ROCm/flash-attention or the fmha ck implementation?
-
In summary, this "vulnerability" is problematic because it mostly doesn't represent a root cause but rather a result or symptom. In the 2021 OWASP Top 10, the categories were reoriented from symptoms to root causes. They…