tjake / Jlama

Jlama is a modern LLM inference engine for Java
Apache License 2.0
669 stars 62 forks

Add TornadoVM to speed up inferencing #117

Closed kevintanhongann closed 2 days ago

tjake commented 2 days ago

I've looked into this, and it just isn't a mature enough solution, especially when considering de-quantization. I'm exploring the idea in #73.