mukel / llama3.java

Practical Llama 3 inference in Java
MIT License
514 stars 61 forks

why is it so slow? #3

Closed linghushaoxia closed 4 months ago

linghushaoxia commented 4 months ago

The work is excellent! I ran it on Windows 10. Loading the model took 2131 ms, but then it seems to hang. Here are the Task Manager and cmd screenshots:

[screenshot]

[screenshot]

thanks!

mukel commented 4 months ago

Which hardware do you have (CPU, RAM...) and which java -version?

mukel commented 4 months ago

Please ensure that you run it using a vanilla OpenJDK that supports Java's Vector API.
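Since the Vector API still ships as an incubator module, a vanilla OpenJDK needs it enabled explicitly at launch. A minimal sketch of what such an invocation might look like (the launcher file name and model path below are illustrative, not taken from this thread):

```shell
# Run llama3.java as single-file source on OpenJDK 21+.
# --add-modules jdk.incubator.vector unlocks the Vector API;
# --enable-preview is assumed to be needed for other preview features.
java --enable-preview --source 21 --add-modules jdk.incubator.vector \
     Llama3.java --model Meta-Llama-3-8B-Instruct-Q4_0.gguf --prompt "Hello"
```

Without `--add-modules jdk.incubator.vector`, the code typically falls back to a scalar path and runs far slower.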

linghushaoxia commented 4 months ago

> Which hardware do you have (CPU, RAM...) and which java -version?

[screenshot] @mukel

linghushaoxia commented 4 months ago

> Which hardware do you have (CPU, RAM...) and which java -version?

CPU: Intel(R) Core(TM) i7-10510U @ 1.80GHz (2.30 GHz)
RAM: 16.0 GB
JDK: supports Java's Vector API

@mukel

mukel commented 4 months ago

Java's Vector API is still in preview and not supported yet by GraalVM (it runs but it's not as fast as it should be). Please run with a vanilla OpenJDK and report the tokens/s.
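The tokens/s figure requested here is just generated tokens divided by elapsed time. A small sketch of that arithmetic (the class and method names are hypothetical, not from llama3.java):

```java
// Hypothetical helper illustrating the tokens/s metric used in this thread.
public class TokensPerSecond {
    // tokens generated divided by elapsed wall-clock time in seconds
    static double tokensPerSecond(int tokens, long elapsedMillis) {
        return tokens * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // e.g. 146 tokens generated in 100 seconds
        System.out.println(tokensPerSecond(146, 100_000));
    }
}
```

On a laptop CPU like the i7-10510U above, low single-digit tokens/s is a plausible range for an 8B model.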

linghushaoxia commented 4 months ago

> Java's Vector API is still in preview and not supported yet by GraalVM (it runs but it's not as fast as it should be). Please run with a vanilla OpenJDK and report the tokens/s.

[screenshot]

After changing to a vanilla OpenJDK: 1.46 tokens/s. It works!