oobabooga opened this issue 1 year ago
If this were added and worked on Colab, you could load models up to around 13B on a standard GPU. That would remove the need for TPUs to run bigger models, since those don't work at the moment. (oobabooga's Colab won't work on standard GPUs, because it loads the shards into RAM and runs out of memory, but KoboldAI shouldn't have that problem.)
See https://gist.github.com/whjms/2505ef082a656e7a80a3f663c16f4277
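For a rough sense of why 8-bit loading makes a 13B model fit on a standard GPU, here is a back-of-the-envelope sketch (my own illustration, not from the gist): weights alone at int8 take about half the memory of fp16, and real usage adds activation and framework overhead on top.

```python
def model_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough weight-only memory estimate in GiB.

    Ignores activations, KV cache, and framework overhead, so treat the
    result as a lower bound, not a precise VRAM requirement.
    """
    return n_params * bytes_per_param / 1024**3

# 13B parameters: fp16 (2 bytes/param) vs int8 (1 byte/param)
fp16_gb = model_memory_gb(13e9, 2)  # roughly 24 GiB
int8_gb = model_memory_gb(13e9, 1)  # roughly 12 GiB
print(f"fp16: {fp16_gb:.1f} GiB, int8: {int8_gb:.1f} GiB")
```

At int8 the weights land around 12 GiB, which is why a ~16 GB Colab GPU becomes plausible for 13B where fp16 (about 24 GiB for weights alone) is not.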