-
Hi,
I set `NCCL_ALGO=Tree` and ran the MNIST example of Horovod on 2 host machines, but it raised an error saying that no algorithm/protocol is available. I found that the problem is related to the following co…
-
See #737
The above suggestion ought to be included among the other encoding fixes in the next release.
The problem is that message text containing, e.g., 'a with dots' in boxes like "You need to be a memb…
-
**Describe the bug**
I encountered an issue when using DeepSpeed 0.12.4 with the [OpenChat trainer](https://github.com/imoneoi/openchat), where checkpointing failed and raised an NCCL error. However,…
-
# Problem
The MPI Standard does not explicitly declare the threading guarantees of `MPI_User_function` when used in an `MPI_Op`. Specifically,
- (A) Is `MPI_User_function` required to be thre…
-
## Problem
The current text explaining how payments work does not meet the standards expected for open project management.
1. Open Collective is in the business of fiscal sponsorship, and supporting projects w…
-
### Summary
Add new functionality to Terragrunt so that module outputs, local variables and arbitrary string data (I'll collectively refer to these as artifacts until a better name comes around) ca…
-
[I previously wrote](https://correct-computation.slack.com/archives/G01GKGKHMFD/p1611812462009100):
> Based on my experience developing the canWrite constraints PR (#391) as a newcomer, I'm startin…
-
Currently in `@t3-oss/env-nextjs` you are meant to define your env as follows:
```ts
const env = createEnv({
  server: {
    FOO: z.string().required()
  },
  client: {
    BAR: z.string().re…
-
I am trying to do a chunked, compressed parallel write using h5py. My test script is a slightly modified version of the [one from the docs](http://docs.h5py.org/en/stable/mpi.html#using-parallel-hdf5-…
-
I found https://github.com/reboot-dev/respect/pull/572 almost impossible to review because there is no documentation about what an `::eventuals::Closure` is. How does it differ from a `::eventuals::Th…