-
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the commu…
-
# URL
- https://arxiv.org/abs/2309.15427
# Affiliations
- Yijun Tian, N/A
- Huan Song, N/A
- Zichen Wang, N/A
- Haozhu Wang, N/A
- Ziqing Hu, N/A
- Fang Wang, N/A
- Nitesh V. Chawla, N/A…
-
### 🐛 Describe the bug
First, SIMPLE_MODEL is not properly imported in the given starter code.
Second, I'm having an issue running the addLoader function in the paid model section. The error message is shown…
-
Over the last few months I have had a problem that appears from time to time, more or less often, but always once the context history reaches its limit. I always thought that it would be an LLM issue until I…
-
**Scenario:**
- completed the fine-tune of 'Weyaxi/Dolphin2.1-OpenOrca-7B' using ipex-llm on a GPU Max 1100
- the output directory looks like the one below, with checkpoints and a config file.
-
![image](https…
-
![image](https://github.com/InternLM/tutorial/assets/137043350/8989a0f0-4d30-4a63-a238-4568c75bdee0)
```python
max_length = 2048
pack_to_max_length = True

# Scheduler & Optimizer
batch_size = 1  # per_dev…
```
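For context, `pack_to_max_length = True` asks the trainer to concatenate tokenized samples into fixed-length sequences of `max_length` tokens. A minimal sketch of that greedy packing idea (this is not XTuner's actual implementation; the `pack` helper and the sample lists are made up for illustration):

```python
def pack(samples, max_length):
    """Greedily concatenate tokenized samples into sequences
    of at most max_length tokens each."""
    packed, cur = [], []
    for s in samples:
        # Start a new packed sequence when the next sample would overflow.
        # (Sketch assumes no single sample exceeds max_length.)
        if cur and len(cur) + len(s) > max_length:
            packed.append(cur)
            cur = []
        cur.extend(s)
    if cur:
        packed.append(cur)
    return packed

# Three tokenized samples packed into max_length = 4 sequences:
pack([[1, 2], [3, 4, 5], [6]], 4)  # → [[1, 2], [3, 4, 5, 6]]
```

Packing this way reduces wasted padding tokens, which is why it pairs naturally with a small per-device `batch_size`.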
-
run step3 with:

```shell
deepspeed --master_port 12346 DeepSpeedExamples/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py \
    --data_path wangrui6/Zhihu-KOL \
    --data_split 2,4,4 \
    …
```
-
Hi guys, thanks for open-sourcing this great work!
It seems Llama 3 uses "right" padding and "eos_token" as the "padding_token". Could you help verify that if I want to train this model, wh…
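To make the padding question concrete, here is a tokenizer-independent sketch of padding with the eos id reused as the pad id (the `pad_batch` helper and `EOS_ID = 2` are made-up illustrations, not the model's actual API):

```python
def pad_batch(seqs, max_len, pad_id, side="right"):
    """Pad each token-id sequence to max_len on the given side."""
    out = []
    for s in seqs:
        pad = [pad_id] * (max_len - len(s))
        out.append(s + pad if side == "right" else pad + s)
    return out

EOS_ID = 2  # hypothetical eos token id standing in for the pad token
pad_batch([[5, 6], [7]], 4, EOS_ID)          # → [[5, 6, 2, 2], [7, 2, 2, 2]]
pad_batch([[5, 6], [7]], 4, EOS_ID, "left")  # → [[2, 2, 5, 6], [2, 2, 2, 7]]
```

As a general rule, right padding is the common choice during causal-LM training (pad positions are masked out of the loss), while left padding is usually preferred for batched generation so the last token of every sequence is a real token.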
-
ROCm/triton, ROCm/flash-attention or the fmha ck implementation?
-
In summary, this "vulnerability" is problematic because it mostly doesn't represent a root cause but rather a result or symptom. In the 2021 OWASP Top 10, the categories were reoriented from symptoms to root causes. They…