GateNLP / gate-lf-pytorch-json

PyTorch wrapper for the LearningFramework GATE plugin
Apache License 2.0

Setting seed does not allow to exactly reproduce results #34

Closed johann-petrak closed 5 years ago

johann-petrak commented 5 years ago

The seed is used, or should be used, for shuffling the dataset and for random weight initialisation. Need to check where setting the seed does not have the proper effect.

johann-petrak commented 5 years ago

Testing this with the gate-lf-tests/cl-sentclass/model-pytorch-multifeat1-l dataset and --seed 123: ./train.sh --seed 123 data/crvd.meta.json data/model

The converted train and val datasets are identical between runs, so the data shuffling is probably not the cause.
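One way to check that the converted datasets are identical between runs is to compare file digests. This is only a sketch: the helper names `file_digest` and `same_files` are made up for illustration, and the directories and file names passed to them are assumptions, not paths taken from this repository.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_files(dir_a, dir_b, names):
    """True if every named file has identical content in both directories."""
    return all(
        file_digest(Path(dir_a) / name) == file_digest(Path(dir_b) / name)
        for name in names
    )
```

Calling `same_files` on the output directories of two runs, with the names of the converted train and val files, returns `True` only if both runs produced byte-identical files.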

johann-petrak commented 5 years ago

According to https://pytorch.org/docs/stable/notes/randomness.html, torch.manual_seed() should seed the RNGs for both CPU and CUDA.

Retry after making sure that all global RNGs are set before each training step, and that the global RNGs are also set in the dataset code so that random embeddings are created from a properly seeded numpy RNG:

Run on CPU:

So this seems to be fixed: with the CPU we get full repeatability, with CUDA not quite.
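Seeding all global RNGs as described in this comment could look roughly like the sketch below. The helper name `seed_all` is hypothetical and is not the function used in this repository. The import guards are only there so the snippet also runs where numpy or torch is not installed.

```python
import random

try:
    import numpy as np
except ImportError:  # numpy not installed: skip seeding it
    np = None

try:
    import torch
except ImportError:  # torch not installed: skip seeding it
    torch = None


def seed_all(seed):
    """Seed the global RNGs of Python, numpy and PyTorch (where available)."""
    random.seed(seed)
    if np is not None:
        # Used e.g. when random embeddings are drawn via numpy.
        np.random.seed(seed)
    if torch is not None:
        # Seeds the CPU generator and the CUDA generators.
        torch.manual_seed(seed)
```

Calling `seed_all(123)` once in the dataset code and again right before training starts is one way to make both embedding creation and weight initialisation start from the same RNG state on every run.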

johann-petrak commented 5 years ago

Try setting the cudnn backend mode to deterministic.

OK, this is fixed now!
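For reference, switching cuDNN to deterministic mode in PyTorch is usually done with the two backend flags shown below. This is a sketch under assumptions: the wrapper function `use_deterministic_cudnn` is hypothetical, and the thread does not show the exact change that was made in this repository.

```python
try:
    import torch
except ImportError:  # lets the snippet run even without torch installed
    torch = None


def use_deterministic_cudnn():
    """Ask cuDNN to use deterministic algorithms; return True if flags were set."""
    if torch is None:
        return False
    # Only select deterministic cuDNN algorithms.
    torch.backends.cudnn.deterministic = True
    # Disable the auto-tuner, which may pick different algorithms per run.
    torch.backends.cudnn.benchmark = False
    return True
```

Deterministic mode can make training slower, so it may be worth enabling it only when a seed has been set explicitly.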