-
As I mentioned before (https://github.com/chainer/chainercv/issues/735#issuecomment-479616802), the FPN detector currently depends on chainermn. Unfortunately, chainermn is not easy to install…
-
```shellsession
$ nvidia-docker run -v $(pwd):/mnt mazgi/cuda-cv:9.0-cudnn7-devel-ubuntu16.04 mpiexec --allow-run-as-root -n 2 python3 /mnt/chainermn/examples/mnist/train_mnist.py --gpu -1
---------…
```
mazgi updated
2 years ago
-
You might know this already, but I recently tried ChainerMN on [Sakura Koukaryoku Computing](https://www.sakura.ad.jp/koukaryoku/).
I measured processing throughput with the ImageNet example and compared [Ch…
-
At least we know that the Communicator's `bcast_data` does not work with an FP16 model.
```diff
diff --git a/tests/chainermn_tests/communicator_tests/test_communicator.py b/tests/chainermn_tests/communicator_…
```
-
ChainerMN's BatchNormalization code is mostly copied from Chainer (with several AllReduce calls added), which means potential bugs from Chainer could also be imported. https://github.com/chainer/chainer/pull/4191 could be…
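To illustrate why AllReduce calls appear in a multi-node batch normalization at all, here is a minimal pure-Python sketch. It is not ChainerMN's actual code: `allreduce_sum` and `global_batch_stats` are hypothetical names, the AllReduce is simulated in a single process, and the choice to reduce (count, sum, sum of squares) is an assumption made for illustration.

```python
def allreduce_sum(per_worker_values):
    """Simulated AllReduce: every worker ends up with the elementwise sum."""
    total = [sum(column) for column in zip(*per_worker_values)]
    # Each worker receives its own copy of the reduced vector.
    return [list(total) for _ in per_worker_values]


def global_batch_stats(per_worker_batches):
    """Return (mean, variance) over the global batch, as seen by each worker."""
    # Each worker contributes the count, sum, and sum of squares of its local batch.
    local = [
        [len(batch), sum(batch), sum(x * x for x in batch)]
        for batch in per_worker_batches
    ]
    reduced = allreduce_sum(local)
    stats = []
    for count, total, total_sq in reduced:
        mean = total / count
        variance = total_sq / count - mean * mean  # biased (population) variance
        stats.append((mean, variance))
    return stats


# Two simulated workers, each holding a local batch of scalar activations.
stats = global_batch_stats([[1.0, 2.0], [3.0, 4.0]])
```

Every worker ends up with the same statistics as if it had seen the whole batch, which is the part a single-node BatchNormalization does not need.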
-
In your example [here](https://github.com/aws/sagemaker-chainer-container) for building the final container, you execute this command `docker build -t preprod-chainer:4.1.0-gpu-py3 -f docker/4.1.0/fin…
vangj updated
5 years ago
-
## What happened
`_DistributedSnapshot` with `BestValueTrigger` gets stuck.
### code
https://gist.github.com/dhgrs/56424106e00bafee9617b0a15a028c2c
### command
`CUDA_VISIBLE_DEVICES=0,1 mpiex…
dhgrs updated
4 years ago
-
Hi, I got the following error when I tried to train PSPNet with your train_mn.py.
How can I train my PSPNet model?
```
root@5e6e3385ca5a:/yendo/oss/chainer-pspnet# python3 train_mn.py --result_dir re…
```
-
This issue is not specific to chainermn, so I was unsure where to submit it.
In the [training example of ImageNet](https://github.com/chainer/chainermn/blob/master/examples/imagenet/train_imagenet.…
-
Deep learning frameworks support multi-process data loading, such as the `num_workers` option of `DataLoader` in PyTorch and `MultiprocessIterator` in Chainer.
They use the multiprocessing module to launch …
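As a rough illustration of what multi-process data loading looks like underneath, here is a minimal sketch built directly on Python's `multiprocessing` module. It is not how `DataLoader` or `MultiprocessIterator` are implemented: `load_example`, `_worker`, and `load_parallel` are hypothetical names, and the explicit `"fork"` start method is an assumption that only holds on POSIX systems.

```python
import multiprocessing as mp


def load_example(index):
    # Hypothetical stand-in for reading and decoding one sample from disk.
    return index, index * index


def _worker(indices, out_queue):
    # Each worker process loads its shard and sends results back to the parent.
    for i in indices:
        out_queue.put(load_example(i))


def load_parallel(indices, num_workers=2):
    """Load samples for a list of indices; return values sorted by index."""
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    out_queue = ctx.Queue()
    # Round-robin sharding of the indices across workers.
    shards = [indices[w::num_workers] for w in range(num_workers)]
    procs = [ctx.Process(target=_worker, args=(shard, out_queue)) for shard in shards]
    for p in procs:
        p.start()
    # Drain the queue before joining so workers never block on a full pipe.
    results = [out_queue.get(timeout=30) for _ in indices]
    for p in procs:
        p.join()
    return [value for _, value in sorted(results)]


samples = load_parallel(list(range(8)), num_workers=2)
```

The start method (`fork`, `spawn`, or `forkserver`) determines how the worker processes are created, so it is the first thing to check when worker processes behave unexpectedly in a given environment.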