FIX: Prevent CUDA context initialization due to AWQ

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Apache License 2.0

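Importing from AWQ triggers CUDA context initialization at import time, which can be problematic in some circumstances (see #1877), such as when a subprocess is started after peft has been imported and then tries to use CUDA. This PR moves the AWQ import into local scope, so it only runs when it is actually needed, which prevents the issue.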
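As a rough illustration of the "local import" idea, here is a minimal sketch. It does not reproduce the PEFT code: the function name convert_color is hypothetical, and the stdlib module colorsys only stands in for a heavy dependency whose import has side effects (as the AWQ import does here).

```python
import sys


def convert_color(r: float, g: float, b: float) -> tuple:
    # Local import: the module is loaded on the first call to this function,
    # not when the enclosing module is imported. colorsys is only a stand-in
    # for a dependency with import-time side effects.
    import colorsys

    return colorsys.rgb_to_hsv(r, g, b)


if __name__ == "__main__":
    # Importing this file alone does not load the dependency; calling the
    # function does.
    print("loaded before call:", "colorsys" in sys.modules)
    convert_color(1.0, 0.0, 0.0)
    print("loaded after call:", "colorsys" in sys.modules)
```

The trade-off is that an import error surfaces at call time rather than at import time, which is usually acceptable for an optional dependency.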

To test this, run this script:

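import multiprocessing

from torch import nn
import peft  # noqa: F401  (imported only for its import-time side effects)

def func():
    # Raises "Cannot re-initialize CUDA in forked subprocess" if the parent
    # process already initialized CUDA while importing peft.
    nn.Linear(2, 3).cuda(0)

if __name__ == "__main__":
    # With the fork start method, the child inherits the parent's CUDA state.
    proc = multiprocessing.Process(target=func)
    proc.start()
    proc.join()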

Ideally, we would add this to our nightly GPU tests, but running it from the pytest runner does not work (IIRC), so some extra steps are required.
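One possible shape for those extra steps, sketched under assumptions rather than taken from this PR: run the check in a fresh interpreter so that nothing already imported by the test runner can influence it. The helper run_isolated and the constant CUDA_CHECK are hypothetical names. The snippet also assumes that torch.cuda.is_initialized() is an adequate proxy for "a CUDA context was created on import", which may not hold for every way a library can touch CUDA.

```python
import subprocess
import sys


def run_isolated(snippet: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a fresh Python interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )


# Hypothetical check: importing peft alone should leave CUDA uninitialized.
# Requires torch and peft to be installed in the environment.
CUDA_CHECK = (
    "import torch\n"
    "import peft  # noqa: F401\n"
    "assert not torch.cuda.is_initialized(), 'CUDA initialized on import'\n"
)

if __name__ == "__main__":
    result = run_isolated(CUDA_CHECK)
    # A non-zero return code means the check failed or a dependency is missing.
    print("return code:", result.returncode)
    print(result.stderr)
```

A test could then assert that the return code is 0. The script above could be passed through the same helper if the fork-based reproduction is preferred.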

FIX: Prevent CUDA context initialization due to AWQ #2230