cache-warmup Search Results

1000+ results
for cache-warmup

pytorch/pytorch #137844

> if graph capture is thread local

> if graph capture is thread local Graph capture is [initiated on a Cuda stream](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html#group__CUDART__STREAM_1g793d7…

FuncJ updated 1 day ago
3
doctrine/orm #6209

Conversion of object query parameters to their identifiers i…

The detection of whether the object is an entity uses ``$this->_em->getMetadataFactory()->hasMetadataFor(ClassUtils::getClass($value))``. But ``hasMetadataFor`` checks whether the factory has **loaded…

stof updated 5 years ago
6
huggingface/optimum-intel #339

OVModelForSeq2SeqLM with Helsinki-NLP/opus-mt-es-en has slow…

I'm having trouble exporting the `Helsinki-NLP/opus-mt-es-en` model for language translation into the optimised OpenVino IR format. Reading through the other issues within this repository highlighted …

tsmith023 updated 1 year ago
2
apache/mxnet #17357

[BUG] lr_scheduler does not work as expect when training fro…

## Description As I know, the optimizer decides the num_update according to its _index_update_count saved on each device, which means that If the trainer states on one GPU device and loaded into anot…

kohillyang updated 4 years ago
3
pytorch/pytorch #134913

torch.compile makes model slower

### 🐛 Describe the bug NOTE: we are only interested in compiling the decoder, the encoder is also shown in the time traces but should be ignored. I've been trying for a long time to get `torch.c…

alita-moore updated 1 month ago
8
microsoft/DeepSpeed #2790

[BUG] Create zero equivalency unit test

Starting point: https://github.com/microsoft/DeepSpeed/issues/966 Test matrix 1. gradient accumulation: one vs many 2. #gpus: one vs many 3. stages: 1 vs 2 vs 3 4. dtype: bf16 vs fp16 vs fp32 …

tjruwase updated 6 days ago
6
hiyouga/LLaMA-Factory #5303

adam-mini is not compatible with deepspeed

### Reminder - [X] I have read the README and searched the existing issues. ### System Info when I just add one line in the `examples/extras/adam_mini/qwen2_full_sft.yaml` got a error below. ```…

muziyongshixin updated 3 weeks ago
3
neulab/awesome-align #47

Trying to train on an existing model

Hi! This is a really great tool and it's been fun using it. I am trying to train the model 'bert-base-multilingual-uncased' using a tokenized dataset in the correct format. But every time I run the…

ghost updated 2 years ago
1
modelscope/ms-swift #2117

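Integrate SFT training with multiple loss functions (e.g. contrastive loss)

**Describe the feature** Provide SFT training with multiple loss functions, such as contrastive loss. **Paste any useful information** During SFT, besides cross-entropy loss, it is sometimes necessary to compute contrastive loss, pair loss, and so on for a specific token. Could such a feature be integrated? **Additional context**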

YasmineXXX updated 2 weeks ago
5
intel-analytics/ipex-llm #10309

BIGDL-LM Acceleration for chatglm3-6b

I’ve been using BIGDL-LM to accelerate the chatglm3-6b model. However, I’m curious about the speed. Is the current speed considered normal? Here are the hardware details: + Graphics Card: Intel Corp…

HuskyLYL updated 7 months ago
4

