GateNLP / gate-lf-pytorch-json

PyTorch wrapper for the LearningFramework GATE plugin
Apache License 2.0

Problem at application time on a CUDA-enabled system #35

Closed. johann-petrak closed this issue 5 years ago.

johann-petrak commented 5 years ago

When restoring the model, it seems to get moved automatically to CUDA at load time. Even moving the top module to the CPU does not seem to help: the embedding parameters still end up on the GPU.
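
A quick way to see where the restored parameters actually live (a minimal sketch; the file name mymodule is taken from the torch.load call further down):

```python
import torch

# Load a model that was saved on a CUDA-enabled machine; by default the
# storages are restored to the device they were saved from.
model = torch.load("mymodule")
for name, param in model.named_parameters():
    print(name, param.device)  # e.g. "emb.weight cuda:0" instead of "cpu"
```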

johann-petrak commented 5 years ago

OK, it turns out the embedding parameters are correctly on the CPU, but the batch indices are on the GPU.
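
This is the classic device mismatch: a CPU-resident embedding indexed with CUDA tensors fails. A minimal sketch reproducing it (the module and sizes are made up):

```python
import torch
import torch.nn as nn

emb = nn.Embedding(100, 8)                        # embedding weights on the CPU
indices = torch.tensor([1, 2, 3], device="cuda")  # batch indices on the GPU

out = emb(indices)  # raises a RuntimeError because the index tensor and the
                    # weight tensor live on different devices
```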

johann-petrak commented 5 years ago

Ahhh - the problem is that the flag that caches the CUDA state gets saved with the module.
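
That explains it: anything stored in the module's __dict__, including a plain boolean flag, is pickled along with the parameters. A minimal sketch of the effect (Wrapper and _on_cuda are hypothetical stand-ins for the actual module and cached flag):

```python
import torch
import torch.nn as nn

class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(100, 8)
        # hypothetical flag caching whether we are running on CUDA
        self._on_cuda = torch.cuda.is_available()

m = Wrapper()            # on a CUDA machine, m._on_cuda is True
torch.save(m, "mymodule")

m2 = torch.load("mymodule")
print(m2._on_cuda)       # the cached flag was pickled with the module
                         # and is restored as-is, stale CUDA state included
```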

johann-petrak commented 5 years ago

OK, this can be fixed by overriding __getstate__ to avoid saving the transient variable, and/or by overriding __setstate__ to re-initialise the transient variable on unpickling. Also, instead of moving everything back to the CPU after loading, we could re-map the storages to the CPU (or the desired GPU) already at load time using

torch.load('mymodule', map_location=lambda storage, location: storage)
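
A minimal sketch of the __getstate__/__setstate__ approach combined with the re-mapping (Wrapper and _on_cuda are again hypothetical names):

```python
import torch
import torch.nn as nn

class Wrapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(100, 8)
        self._on_cuda = torch.cuda.is_available()  # transient, must not be saved

    def __getstate__(self):
        # Drop the transient flag from the pickled state.
        state = self.__dict__.copy()
        state.pop("_on_cuda", None)
        return state

    def __setstate__(self, state):
        # Restore the pickled state and re-initialise the transient flag
        # for the machine we are unpickling on.
        self.__dict__.update(state)
        self._on_cuda = torch.cuda.is_available()

# Re-map every storage to the CPU already at load time; a map_location
# callable must return a storage, and returning it unchanged keeps it
# on the CPU:
model = torch.load("mymodule", map_location=lambda storage, location: storage)
# equivalently, the simpler string form:
model = torch.load("mymodule", map_location="cpu")
```
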
johann-petrak commented 5 years ago

Should implement the changes described in the previous comment.

johann-petrak commented 5 years ago

Still missing the wrapper's load-parameters functionality.

johann-petrak commented 5 years ago

Finally fixed as of commit a187946f4.