-
Single-GPU training on a multi-GPU system doesn't work, even when the process is limited to one GPU by setting `CUDA_VISIBLE_DEVICES` through `os.environ` before importing unsloth.
Reason:
check_nvidia function spawns new process to che…
Sehyo updated
2 months ago
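A minimal sketch of the setup the report describes: set `CUDA_VISIBLE_DEVICES` through `os.environ` before any GPU library is imported, then check what a newly spawned child process sees. The helper name `visible_devices_in_child` is hypothetical. The sketch does not reproduce the bug, and it assumes nothing about how `check_nvidia` works internally.

```python
import os
import subprocess
import sys

# Restrict this process to the first GPU.
# This has to happen before any CUDA-using library is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"


def visible_devices_in_child() -> str:
    """Return CUDA_VISIBLE_DEVICES as seen by a freshly spawned child process.

    Hypothetical helper for illustration only; it is not part of unsloth.
    """
    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "import os; print(os.environ.get('CUDA_VISIBLE_DEVICES', ''))",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("parent:", os.environ["CUDA_VISIBLE_DEVICES"])
    print("child: ", visible_devices_in_child())
```

By default `subprocess.run` passes the parent's current environment to the child, so a check like this can help tell whether a spawned process still sees the restricted device list.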
-
We observed good overlap with FSDP + PGLE:
![Bq7PCuqyJbygSuL](https://github.com/user-attachments/assets/0cff27c4-6499-43d0-b436-ef01a2833ae0)
Turning PGLE on and off makes a big difference here.
…
-
Triton no longer officially supports SM60 or SM61 GPUs. This includes the datacenter P40, P100 and P102 cards, the Quadro P5000, and the GTX 1080 family.
https://github.com/triton-lang/triton/i…
-
I’m running the code on four NVIDIA 4090 GPUs (24 GB each), but the process is killed when loading the checkpoint. When I run the code on NVIDIA 4090 GPUs rented from AutoDL, it works well. …
-
The idea is perhaps future-looking, but I'd like to bring it up for discussion.
## Motivations
* Reduce the GPU/NPU memory required for completing a use case (e.g. text2image).
* Reduce the mem…
-
### 🐛 Describe the bug
Consistency check on the `torch.special.logit` function between CPU and GPU using a bfloat16 tensor.
```cpp
#include
#include
int main() {
std::cout
```
-
### Is there an existing issue for this problem?
- [X] I have searched the existing issues
### Operating system
Windows
### GPU vendor
Nvidia (CUDA)
### GPU model
RTX 3090
### GPU VRAM
24 GB
…
-
After the "WidgetGalleryExample" example has been open for a while, Isaac Sim crashes and shows an error log:
2024-11-12 09:21:56 [78,874ms] [Warning] [omni.kit.xr.scene_view.utils.ui_container] Att…
-
Hi Guys,
First of all, thank you so much for sharing this amazing work. I have run the demo colab and got a good result.
To confirm, is a CUDA-enabled GPU required to run inference?
As #34 …
-
RRTMGP and RT have a mix of:
`radiation_rrtmgp.cu`:
Float* ph_g = thermo.get_basestate_fld_g("prefh");
Float p_top;
cudaMemcpy(&p_top, &ph_g[gd.kend], sizeof(TF), cudaMemcpyDeviceT…