-
Hi NCCL team,
I am running the NCCL tests on an H100 cluster with 3200 Gbps networking. The allreduce performance is good on VMs, but it is much worse inside the container. I am using Slurm with enroot fo…
-
**Page** Project
Change ordering of sections to:
The preferred order of categories
1. Project
2. Wrangling
3. Submission steps
4. Post submission tracking
We feel a need to remove the star system (can …
-
The current filtering system, which relies on FMA xrefs, is unsustainable and does not work for new terms with no FMA xref. If the aim of filtering is to remove irrelevant terms, a taxon slim is a much better…
-
Trying to set log2_uar_bar_megabytes = 7
[ 18.009803] mlx4_core 0004:02:00.0: Only 64 UAR pages (need more than 128)
[ 18.017063] mlx4_core 0004:02:00.0: Increase firmware log2_uar_bar_me…
-
Dear author, I noticed a bug in the loss calculation section of the file blip_fine_tune.py, lines 867 to 874. Regardless of the condition, the loss is overwritten by contrastive_loss, which may cause…
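
The pattern described can be sketched as follows. This is a hypothetical illustration, not the actual code from blip_fine_tune.py: the function names, the `use_alt_loss` flag, and `alt_loss` are assumptions made only to show how an unconditional assignment after a branch discards the branch's result.

```python
def compute_loss_buggy(use_alt_loss, contrastive_loss, alt_loss):
    # Hypothetical reconstruction of the reported pattern.
    if use_alt_loss:
        loss = alt_loss
    else:
        loss = contrastive_loss
    # Unconditional assignment: whatever the branch chose is discarded.
    loss = contrastive_loss
    return loss


def compute_loss_fixed(use_alt_loss, contrastive_loss, alt_loss):
    # One possible fix: let the branch result stand.
    if use_alt_loss:
        return alt_loss
    return contrastive_loss
```

With the buggy version, `compute_loss_buggy(True, 1.0, 2.0)` returns the contrastive value even though the alternative loss was selected; the fixed version returns the alternative loss.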
-
Hello,
I am trying to run some MPI benchmarks with Sarus containers. In particular I am using OpenMPI 4.
The nodes are RDMA-capable and have InfiniBand. Everything works fine without the container and …
-
### Community Note
* Please vote on this issue by adding a 👍 [reaction](https://blog.github.com/2016-03-10-add-reactions-to-pull-requests-issues-and-comments/) to the original issue to help…
-
The BCSTM files created using the VGAudio library work fine when decoded on a computer, but fail to play on a 3DS, specifically when applied as a background theme in the HOME menu. I've tried comparin…
-
Currently hardcoded to be the ancient obo format version I made. It should be switched to the complete ontology now that it is in OWL. obo-basic can be a subproduct.
-
### Describe the bug
_OMPI in the latest public HPC-X 2.7 fails to start up on AMD nodes with IB when "--map-by XXX" is used. HPC-X 2.6 OMPI does not …