-
We've observed a performance regression between 5.2 and 6.0 in our internal cluster at `allreduce_grad` with an fp32 model. Benchmark code: https://github.com/chainer/comm_bench
6.0
![スクリーンショット_2019-0…
-
The rationale behind https://github.com/chainer/chainercv/pull/868 is
- The current `scatter_dataset` implicitly appends duplicate data when the test dataset size is not a multiple of the communicator size
- Some eval…
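
To illustrate the first point, here is a minimal sketch of how an equal-length split can end up with duplicated samples. This is not ChainerMN's actual implementation: `scatter_with_padding` is a hypothetical helper, and the wrap-around padding strategy is an assumption made only for illustration.

```python
def scatter_with_padding(items, n_workers):
    """Hypothetical helper: split items into n_workers equal-length shards.

    When len(items) is not a multiple of n_workers, short shards are padded
    with samples taken from the start of the dataset (an assumed strategy).
    """
    per_worker = -(-len(items) // n_workers)  # ceiling division
    shards = []
    for rank in range(n_workers):
        shard = list(items[rank * per_worker:(rank + 1) * per_worker])
        pad = 0
        while len(shard) < per_worker:
            # Duplicate an existing sample so every shard has the same length.
            shard.append(items[pad % len(items)])
            pad += 1
        shards.append(shard)
    return shards


# 5 test samples over 2 workers: sample 0 is evaluated twice.
shards = scatter_with_padding(list(range(5)), 2)
print(shards)
```

With such padding, 6 samples are evaluated for a dataset of 5, so any metric averaged over all shards counts the duplicated sample twice.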
-
We noticed that NCCL throws a segmentation fault when doing a massive reduce during the tests with MPI 2x2.
**Error message**
`chainermn_tests/communicator_tests/test_communicator.py::TestDifferentDtype::te…
-
Hi,
I'm trying to run [train_mnist.py](https://github.com/chainer/chainermn/blob/master/examples/mnist/train_mnist.py), with multiple GPUs, but training hangs indefinitely at this point:
`mpirun…
-
Chainer: 6.0.0
The reproduction code is below:
```
import chainer
import chainermn
import chainer.functions as F
import chainer.links as L
comm = chainermn.create_communicator('naive')
net =…
F-Tag updated
5 years ago
-
There are many flaky tests, and leaving them as they are is not healthy for the Chainer repository. I'd like to suggest that they be fixed. The following is the list of flaky tests (I may update this later).
…
-
Can we use NCCL from Python code?
Thanks
-
It seems that Chainer does not play well with multiprocessing. Here is a very simple example in which I use multiprocessing and can't backprop.
```
import multiprocessing
import numpy as np
import…
-
When using the following model and `create_mnbn_model()` in the MNIST example, I got an error.
model:
```
class BNMLP(chainer.Sequential):
def __init__(self, n_units, n_out):
super().…
shu65 updated
5 years ago
-
I want to use `mpi4py allreduce` on ChainerX ndarrays in order to support using MultiNodeEvaluator with ChainerX network models.
However, ChainerX doesn't support the buffer protocol, so mpi4py couldn't hand…