-
### What happened?
Launching
`HYDRA_FULL_ERROR=1 ANEMOI_BASE_SEED=1 anemoi-training train --config-name happy_little_config --config-dir=/pathToConfigs/config`
for training models in a multi-…
-
val_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='val/annotations_val.json',
        backend_args=None,
        data_prefix=dict(img='val/images/'),
        data_root=…
-
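**Describe the bug (Mandatory)**
When fine-tuning the Qwen2.5-3B model with LoRA, the first 10 training steps are fairly fast, at 1~2 s/step, but training then gradually slows to more than 10 s/step. GPU utilization reaches 100% early on, but after 100 steps it stays at 2% for long periods.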
- **Hardware Environment(`Ascend`/`GPU`…
-
### System Info
```
~/work/llama-stack/distributions/meta-reference-gpu (main)]$ python -m "torch.utils.collect_env"
/home/kaiwu/.conda/envs/llamastack-meta-reference-gpu/lib/python3.10/runpy.py:12…
-
Issue Summary:
I am currently working on federated learning tasks using OpenFL on NVIDIA Jetson devices. However, I am facing an issue where the model training is not utilizing the GPU, despite usi…
-
We would like to request a small GPU Python kernel for UC3 / UC4 under the EOX Lab environment, which will further test headless execution functionality and the provisioning of GPU resources to the…
-
The NixOS nvidia-container-toolkit has been broken for some time now, at least on my machine.
So when running zwift I get
```
❯ DEBUG=1 zwift
+ [[ -f /home/netbrain/.config/zwift/config ]]
+ ZWIFT…
-
Dear JOSS,
I submitted a paper about OPM Flow's new framework for Dune-compatible linear solvers on GPUs a month ago called `gpu-ISTL - Extending OPM Flow with GPU Linear Solvers` [(link)](https://jo…
-
Hi!
I am trying to set up a minimal example using the `block_gmres(A, B)` solver on a GPU. This should be possible with Julia 1.11 if I understand the documentation correctly. However, I get the error…
-
While exploring optimizations listed in the [documentation](https://huggingface.co/docs/diffusers/optimization/torch2.0), I find myself unable to free GPU memory after using `torch.compile` on a Stabl…