Closed boricuapab closed 1 year ago
That story isn't bad at all. Falcon is quite interesting in what it can generate; with a larger prompt the story could even develop some twists. I'll add your link to the README until it's outdated.
Note: with the latest release you'll see a huge increase in performance for long generations like this, likely around 2x faster at 600 tokens.
This isn't an issue or enhancement request.
I just wanted to say thanks for your work on ggllm.cpp,
and to help Windows users who don't want to go the WSL route get it working with GPU offloading. After many tries and a lot of research, the only solution I found was a bit tricky to figure out, so I walk through it in this video:
https://www.youtube.com/watch?v=BALw669Qeyw
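For anyone who just wants the commands rather than the video, the general shape of a CUDA-enabled build plus an offloaded run looks roughly like this. This is a sketch based on the llama.cpp-style build that ggllm.cpp inherits: the CMake option name, binary name, and model path below are assumptions, so check the repo's README for the exact names in your version.

```shell
# Clone and build ggllm.cpp with the CUDA (cuBLAS) backend enabled.
# -DLLAMA_CUBLAS=ON is the assumed option name; verify against the README.
git clone https://github.com/cmp-nct/ggllm.cpp
cd ggllm.cpp
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release

# Run with GPU offloading: -ngl sets how many transformer layers go to the GPU.
# On an 8 GB card like the RTX 2060 Super, lower -ngl if you run out of VRAM.
# The model path here is a placeholder.
./bin/falcon_main -m /path/to/falcon-7b.ggml.bin -ngl 32 -p "Once upon a time" -n 600
```

On Windows you'd run the same CMake steps from a Developer Command Prompt (or use the CUDA-enabled release binaries if the repo ships them); the video covers the tricky parts.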
Also, these are my PC specs:
CPU: AMD Ryzen 7 3700X (8 cores)
RAM: 32 GB
GPU: RTX 2060 Super (8 GB)
Here are some of my results:
CPU only:
With GPU offloading: