-
Hello, is there any way to run inference with 2 or more GPUs?
-
### System Info
Python version: 3.10.12
PyTorch version:
llama_models version: 0.0.42
llama_stack version: 0.0.42
llama_stack_client version: 0.0.41
Hardware: 4xA100 (40GB VRAM/GPU)
local-…
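Not an answer from this thread, but as a general sketch of multi-GPU inference outside llama_stack: with the `accelerate` package installed, Hugging Face `transformers` can shard a model's layers across all visible GPUs via `device_map="auto"`. The model id and generation settings below are assumptions for illustration only.

```python
# Hedged sketch, not llama_stack-specific: shard one model across several GPUs.
# Requires: pip install transformers accelerate
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative choice, not from the thread

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # spread layers across all visible GPUs
    torch_dtype=torch.bfloat16,  # halve per-GPU memory vs FP32
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```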
-
- [ ] Integrate into Bianca Slurm pages
- [ ] Links to these new pages from the Bianca pages
- [ ] Integrate into Snowy Slurm pages
- [ ] Links to these new pages from the relevant/connected Snowy pages
…
-
Hi,
I am simulating a quantum dynamical system using the great ITensorMPS.jl package (https://github.com/ITensor/ITensorMPS.jl).
Without getting into details about this package and the specific com…
-
**Idea:**
Cast FP32/FP16 to BF16.
Casting differs based on the source type:
- FP32 to BF16: truncate the last 16 bits of the mantissa; the exponent stays the same
- FP16 to BF16: a more involved process --…
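For the FP32 path, a minimal sketch of the truncation described above (round-toward-zero; function names are illustrative, using numpy bit views):

```python
import numpy as np

def fp32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
    """Drop the low 16 mantissa bits of FP32; sign and 8-bit exponent are unchanged."""
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)
    return (bits >> 16).astype(np.uint16)  # round-toward-zero truncation

def bf16_bits_to_fp32(b: np.ndarray) -> np.ndarray:
    """Widen BF16 back to FP32 by zero-filling the 16 dropped bits."""
    return (b.astype(np.uint32) << 16).view(np.float32)

x = np.array([1.0, 3.14159265, 6.1e-5], dtype=np.float32)
print(bf16_bits_to_fp32(fp32_to_bf16_bits(x)))  # BF16-precision approximations of x
```

This works because FP32 and BF16 share the same sign bit and 8-bit exponent, so no re-biasing is needed; FP16's 5-bit exponent is why that direction is more involved.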
-
### Description
We're interested in some form of GPU checkpointing - is this something that the gvisor team plans on supporting at any point?
Generally, existing GPU checkpointing implementations de…
-
Hi, I noticed #1500, thank you for this contribution!
I see that the CUDA version required for this build is CUDA 10.x, which I am struggling to install on Amazon Linux 2023 (for use wi…
-
### Describe the Bug
Noticing GPU spikes in this current build, and an inability to get into my first world due to an "Invalid Player Data" error. I also have the latest drivers available. Just …
-
### Bug
```
return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File …
```