-
Occurred in #8420
https://travis-ci.org/chainer/chainer/jobs/610809007?utm_medium=notification&utm_source=github_status
```
[2019-11-12 11:34:58] > sock.connect(sa)
[2019-11-12 …
-
Reproduced both with and without CuPy and a usable GPU.
```
crissman@GPCL-GPU102:~/chainer/docs$ make html
sphinx-build -b html -d build/doctrees -W source build/html
Running Sphinx v1.8.3
ma…
-
Related to https://github.com/chainer/chainer/pull/4510 and https://github.com/chainer/chainer/issues/4582. It looks like the following functions and links of ChainerMN should be fixed somehow to support f…
shu65 updated
5 years ago
-
Related: #5418
Chainer currently does not have a unified policy for logging.
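
For discussion, one common convention for library logging is a single package-level logger that stays silent until the application opts in. The sketch below uses only the standard `logging` module; the name `mylib` and the helpers `get_logger` and `configure` are hypothetical and are not part of Chainer.

```python
import logging

# Hypothetical top-level logger name; a stand-in, not Chainer's actual layout.
LOGGER_NAME = "mylib"


def get_logger(submodule=None):
    """Return the package logger, or a child logger for a submodule."""
    name = LOGGER_NAME if submodule is None else "{}.{}".format(LOGGER_NAME, submodule)
    return logging.getLogger(name)


# A library attaches only a NullHandler, so it emits nothing by default.
get_logger().addHandler(logging.NullHandler())


def configure(level=logging.INFO, stream=None):
    """Opt-in helper for applications: send package logs to a stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    package_logger = get_logger()
    package_logger.addHandler(handler)
    package_logger.setLevel(level)
    return handler
```

With this layout every module logs through a child of one logger, so a user can adjust verbosity for the whole package in one place.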
-
`gc_interval` never seems to be used in `_MultiNodeCheckpointer`, so it has no effect.
`cp_interval` is used twice. Possibly one of them should be `gc_interval` instead?
One more thing I wonder i…
nai62 updated
5 years ago
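
To make the question about `cp_interval` and `gc_interval` concrete, here is a toy sketch of a checkpointer in which the two intervals are consulted separately. It is not the ChainerMN implementation: the class `IntervalCheckpointer` is hypothetical, and it assumes that garbage collection means pruning all but the newest snapshot.

```python
class IntervalCheckpointer:
    """Toy stand-in; NOT ChainerMN's _MultiNodeCheckpointer."""

    def __init__(self, cp_interval=5, gc_interval=5):
        self.cp_interval = cp_interval  # how often a snapshot is taken
        self.gc_interval = gc_interval  # how often old snapshots are pruned
        self.saved = []  # iterations whose snapshots are currently kept

    def maybe_save(self, iteration):
        # Snapshot timing is driven by cp_interval only.
        if iteration % self.cp_interval == 0:
            self.saved.append(iteration)
        # Pruning is driven by gc_interval (assumption: keep only the newest).
        if iteration % self.gc_interval == 0:
            self.saved = self.saved[-1:]


cp = IntervalCheckpointer(cp_interval=2, gc_interval=6)
for it in range(1, 11):
    cp.maybe_save(it)
print(cp.saved)
```

If both checks used `cp_interval`, pruning would run on every snapshot and `gc_interval` would have no effect, which matches the behaviour reported above.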
-
A major drawback of the current multi-node checkpointing system is that it takes snapshots of literally all replicas across the job. It's a waste of disk space, but it's not clear whether just taking snapshot…
-
We found that `import cupy` in `pytest` can fail with `AttributeError` when the root directory is the Chainer repository:
```
$ cat importcupy.py
import cupy
$ python -m pytest importcupy.py
...
…
-
Hello, team!
I encountered a trivial issue that the ImageNet data parallel example (non-ChainerMN example; https://github.com/chainer/chainer/blob/v6.3.0/examples/imagenet/train_imagenet_data_paral…
-
Chainer has 'EarlyStoppingTrigger' to support early stopping.
It cannot be used together with ChainerMN.
However, supporting it in ChainerMN may be difficult, because the additional commun…
shu65 updated
5 years ago
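
One reason early stopping is harder in a multi-node setting is that every worker has to reach the same stop decision, otherwise some workers exit while others wait in a collective call. The sketch below simulates this in plain Python without ChainerMN; `allreduce_mean` and `PatienceStopper` are hypothetical helpers, and averaging the loss across workers is only one assumed way of agreeing on the decision.

```python
def allreduce_mean(values):
    """Simulate an all-reduce: every worker receives the mean of all values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


class PatienceStopper:
    """Stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, loss):
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Local validation losses per epoch for two simulated workers.
local_losses = [(1.0, 3.0), (0.5, 2.5), (2.0, 1.2), (1.8, 1.4)]

stoppers = [PatienceStopper(patience=2) for _ in range(2)]
decisions = [[], []]
for epoch_losses in local_losses:
    # The extra communication step: agree on one loss before deciding.
    synced = allreduce_mean(list(epoch_losses))
    for rank, loss in enumerate(synced):
        decisions[rank].append(stoppers[rank].should_stop(loss))
```

Because each worker evaluates the same averaged loss, the per-rank decision lists are identical; deciding from the local losses instead would let the two workers disagree.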
-
Hello there:
I am trying to run some examples using Chainer on a multi-node, multi-GPU server and have experienced some problems. To test ChainerMN on the server, I am using example/mnist/train_mnis…