Open yorickvP opened 2 weeks ago
@yorickvP this is great; making 8-bit inference work regardless of hardware is useful. That said, when I run cog predict -i prompt=<whatever> on an A40, it takes ~10 minutes to compile and then produces no output.
Can you take a look? I'm wary of pushing a broken path here.
Tested on RTX A5000.