A repository for running GPT-J-6B on low-VRAM machines (minimum 4.2 GB of VRAM for a 2000-token context, 3.5 GB for a 1000-token context). Loading the model requires 12 GB of free RAM.
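The low-VRAM trick can be illustrated with a hedged sketch: keep all weights in CPU RAM and copy only the block currently being executed onto the GPU, freeing it again before the next block. The names here (`blocks`, `run_offloaded`) are illustrative, not this repository's actual API, and the tiny linear blocks stand in for GPT-J's transformer layers.

```python
import torch
import torch.nn as nn

# Illustrative sketch, not the repository's code: weights stay in CPU RAM,
# and each block is moved to the GPU only for the duration of its forward pass.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

blocks = nn.ModuleList([nn.Linear(16, 16) for _ in range(4)])  # live on the CPU

def run_offloaded(x: torch.Tensor) -> torch.Tensor:
    x = x.to(device)
    for block in blocks:
        block.to(device)   # copy this block's weights into VRAM
        x = block(x)
        block.to("cpu")    # release VRAM before loading the next block
    return x.cpu()

out = run_offloaded(torch.randn(2, 16))
print(out.shape)  # torch.Size([2, 16])
```

Peak VRAM usage is then roughly one block's weights plus activations, at the cost of a host-to-device copy per block per forward pass, which is why generation is slower than keeping the whole model resident on the GPU.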
Apache License 2.0
Issue #5: "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu"