ArgLab / ArgLab_writing_observer

Writing Observer and Learning Observer: A system for monitoring learning process data, with an initial focus on writing process data from Google Docs.
GNU Affero General Public License v3.0

cupy/numpy conversion #46

Open DrLynch opened 1 year ago

DrLynch commented 1 year ago

Forcing, or even just encouraging, spaCy to use the GPU triggers passive errors in the AWE code that arise from type conversions between numpy and cupy. To reproduce, add spacy.prefer_gpu() to the initial NLP call in awe_nlp and observe the type conversion errors.

In theory cupy offers straightforward replacements for numpy, so we could potentially do a drop-in replacement and test it, but that would need more extensive verification. We may also need to consider refactoring the code to provide wrappers for GPU and CPU operation, since both modes may still be needed.
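As a rough illustration of the wrapper idea, the sketch below converts at the boundary so that downstream numpy-only code always receives a numpy array. The helper name `to_numpy` is hypothetical and not part of AWE; the sketch assumes cupy may or may not be installed, and uses only `cupy.asnumpy` and `numpy.asarray`.

```python
import numpy as np

try:
    import cupy as cp  # optional; only present on GPU installs
except ImportError:
    cp = None


def to_numpy(array):
    """Return a numpy array whether the input lives on the GPU or the CPU.

    Hypothetical helper for illustration; not taken from the AWE code.
    """
    if cp is not None and isinstance(array, cp.ndarray):
        # Explicit device-to-host copy; implicit conversion is what fails.
        return cp.asnumpy(array)
    return np.asarray(array)
```

Calling `to_numpy(doc.vector)` before handing a vector to numpy-only code would be one place to try this, though each copy from the GPU has a cost that would need measuring.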

As a first step, we need to isolate the cases that trigger these errors to determine where and how often they arise.
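One possible way to tally the failures is a decorator that counts each `TypeError` by the function and line where it was raised. This is only a sketch: `track_conversion_errors` and `conversion_failures` are hypothetical names, and it assumes the conversion problems surface as `TypeError` exceptions, which would need confirming against the actual AWE errors. It also counts unrelated `TypeError`s, so the tally needs manual review.

```python
import collections
import functools
import traceback

# Maps (decorated function, failing function, line number) to a count.
conversion_failures = collections.Counter()


def track_conversion_errors(func):
    """Count TypeErrors raised inside func, then re-raise them unchanged."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except TypeError as exc:
            # The last traceback entry is the innermost frame that raised.
            frame = traceback.extract_tb(exc.__traceback__)[-1]
            key = (func.__qualname__, frame.name, frame.lineno)
            conversion_failures[key] += 1
            raise

    return wrapper
```

After a run over representative documents, `conversion_failures.most_common()` would list the most frequent failure sites first.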

DrLynch commented 1 year ago

Paired with this we need to consider how to distribute processes across GPUs.
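A minimal sketch of one distribution scheme, assuming one worker process per GPU slot assigned round-robin: each worker gets an environment whose `CUDA_VISIBLE_DEVICES` exposes a single device. The function name `worker_environment` is hypothetical, and whether this fits how the system launches its processes is an open question.

```python
import os


def worker_environment(worker_index, gpu_count):
    """Return a copy of the environment that pins one worker to one GPU.

    Hypothetical helper: workers are assigned to GPUs round-robin.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    env = dict(os.environ)
    # The CUDA runtime only sees the devices listed in this variable.
    env["CUDA_VISIBLE_DEVICES"] = str(worker_index % gpu_count)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.Popen` when starting each worker.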

DrLynch commented 1 year ago

One ideal resolution would be to make the system operate seamlessly on both CPU and GPU. This will require some research into whether spaCy and the CUDA data structures tolerate non-GPU operation and whether they run as efficiently there. Failing that, we may have to commit entirely to GPU operation or develop a parallel version.
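For the seamless option, one pattern worth evaluating is to write numeric code once against whichever array module owns the input, using `cupy.get_array_module` when cupy is available and falling back to numpy otherwise. The sketch below is illustrative only: `array_module` and `cosine_similarity` are hypothetical names, not AWE functions, and the efficiency question raised above remains untested.

```python
import numpy as np

try:
    import cupy as cp  # optional; only present on GPU installs
except ImportError:
    cp = None


def array_module(array):
    """Return cupy for GPU arrays and numpy for everything else."""
    if cp is not None:
        return cp.get_array_module(array)
    return np


def cosine_similarity(a, b):
    """Cosine similarity that runs on whichever backend holds the inputs."""
    xp = array_module(a)
    denom = float(xp.linalg.norm(a) * xp.linalg.norm(b))
    if denom == 0.0:
        return 0.0
    # float() copies a scalar result back to the host when on the GPU.
    return float(xp.dot(a, b)) / denom
```

Both inputs must live on the same device; mixing a numpy array with a cupy array would still need an explicit conversion first.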