ai-training Search Results

1000+ results
for ai-training


NVIDIA/TransformerEngine #1014

AttributeError: module 'transformer_engine' has no attribute…

I reinstalled flash-attn with `pip install flash-attn==2.6.1` in the NGC PyTorch Docker image 24.06. When I ran the training job, I got the following error: ``` Traceback (most recent call last): File "/data1/nfs15/nfs/bigdata/zha…

Lzhang-hub updated 1 month ago
4
Ahmet-Dedeler/ai-llm-comparison #2

Add More Model Comparisons

The current repository provides comparisons of various AI language models. However, there are several recent models and unique architectures that are not included in the existing comparisons. Expandin…

shwetd19 updated 1 month ago
1
huggingface/trl #1975

Training on Teacher model logits

### Feature request Is there a possibility to add training on a bigger model's logits? It's a question of training on logits instead of one-hot vectors from dataset text. ### Motivation DistillKit slows…

Theodotus1243 updated 3 weeks ago
5
chonger/cueballer #7

Automatic formatting

Hey folks, here is my recommended game plan for our goal of taking arbitrarily formatted scripts and converting them all to a single format of our choosing. We will begin to integrate our solu…

chonger updated 4 days ago
2
hashicorp/terraform-provider-google #20216

Support for Managing Vertex AI Models in BigQuery

### Community Note * Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help the commu…

yu-iskw updated 1 week ago
1
modelscope/ms-swift #2391

Fine tuning stalling

I am attempting to fine-tune with my custom dataset; however, the training percentage stays at 0% and does not increase at all after 20h of running time: ``` Train: 0%| …

ep0p updated 6 days ago
6
openvinotoolkit/model_server #2760

LLM Serving Incompatible with OPENAI API: /v1/chat/completi…

**Describe the bug** The OpenAI API endpoint is "/v1/chat/completions", but the OVMS endpoint is "/v3/chat/completions". Most existing applications don't allow the user to modify the prefix “**V1**” to "**…

alexgang updated 14 hours ago
3
longhorn/longhorn #8150

[BUG] Stucking hang occurs when repeatly reading files

I'm using Longhorn v1.6.0 and created the volume with 2 replicas. I am training an AI model by reading image files from a Longhorn volume, and recently the training often hangs unexpectedly. I …

ziippy updated 2 weeks ago
3
ContextualAI/gritlm #65

CUDA OOM when finetuning meta-llama/Meta-Llama-3-8B-Instruct

I was trying to fine-tune Meta-Llama-3-8B-Instruct using 4 GPUs with the following command: `torchrun --nproc_per_node 4 -m training.run --output_dir llama3test --model_name_or_path meta-llama/Met…

zhj2022 updated 4 days ago
1
best-practice-and-impact/aqua_book_revision #261

Review: Black Box Model comments

These comments on the sections related to black box models were made on the version that was live on Friday 8 November 2024. The sections outlined below are what was covered. **Definitions** - [ ] A…

lmadavies updated 2 days ago
1

