Unity-Technologies / barracuda-release


[Feature Request] Memory Offloading #314

Open elephantpanda opened 1 year ago

elephantpanda commented 1 year ago

As per these references: FlexGen and Big Models.

Memory offloading is a way to run very large models by splitting the model into small pieces, keeping the weights in CPU RAM (or on disk), and placing only one piece on the GPU at a time.
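For illustration, here is a minimal sketch of the idea in PyTorch-style Python (not Barracuda code; the layer list and function name are hypothetical, and a CUDA device is assumed): the weights stay in CPU RAM, and each layer is copied to the GPU only for its own forward pass, then evicted.

```python
# Minimal sketch of layer-wise memory offloading.
# Illustrative only: `run_offloaded` and the layer stack are hypothetical,
# and a CUDA-capable GPU is assumed to be available.
import torch
import torch.nn as nn

def run_offloaded(layers: nn.ModuleList, x: torch.Tensor) -> torch.Tensor:
    """Run a model whose weights live in CPU RAM, moving one layer at a time to the GPU."""
    x = x.to("cuda")
    for layer in layers:
        layer.to("cuda")           # copy this piece of the model onto the GPU
        with torch.no_grad():
            x = layer(x)           # run only this piece
        layer.to("cpu")            # evict it so the next piece fits in GPU memory
    return x.to("cpu")

# Hypothetical usage: a stack of layers too large to fit on the GPU all at once.
layers = nn.ModuleList([nn.Linear(4096, 4096) for _ in range(64)])
out = run_offloaded(layers, torch.randn(1, 4096))
```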

This would be very useful for Barracuda to implement, especially if we want it to work on lower-end hardware.