jafioti / luminal

Deep learning at the speed of light.
https://luminalai.com
Apache License 2.0

Phi model does not produce output on M3 #55

Closed: jorgeantonio21 closed this issue 3 months ago

jorgeantonio21 commented 4 months ago

Currently, I can't get any output when running the phi3 example:

 % cargo run --release --features metal

    Finished release [optimized] target(s) in 0.27s
     Running `/Users/jorgeantonio/dev/luminal/target/release/phi`
Defining graph           - 75ms
Compiling graph          - 4799ms
Loading model            - 3544ms
Processing Prompt        - 183ms (71.04 tok/s, 13 prompt tokens)
<|user|>
Please write me a python implementation of merge sort<|end|>
<|assistant|>

Average token generated in 46.66ms       - (21.43 tok/s)

jorgeantonio21 commented 4 months ago

This issue is related to #51

jafioti commented 4 months ago

Does this still happen if you pull the main branch? I believe this has been fixed for others. It may be the same M3 issue that llama is facing.

jafioti commented 4 months ago

I'm fairly certain the problem is the softmax kernel producing inf on your machine, which makes the logits come out as NaN and triggers the blank token to be output, which is why you see no output at all. I will be revisiting the softmax kernel today or tomorrow to fix this.
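
To illustrate the failure mode described above, here is a minimal CPU sketch in plain Rust. It is not luminal's Metal softmax kernel, and the function names `softmax_naive` and `softmax_stable` are hypothetical. It only shows how a softmax that exponentiates raw values can overflow `f32` to inf and then yield NaN, and how subtracting the maximum first avoids that.

```rust
/// Naive softmax (hypothetical helper, not luminal code).
/// Exponentiates the raw inputs, so any value above roughly 88.7
/// overflows f32 to +inf, and inf / inf then evaluates to NaN.
fn softmax_naive(x: &[f32]) -> Vec<f32> {
    let exps: Vec<f32> = x.iter().map(|v| v.exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Numerically stable softmax (hypothetical helper, not luminal code).
/// Subtracts the maximum before exponentiating, so every exponent is <= 0
/// and every exp() result lies in [0, 1].
fn softmax_stable(x: &[f32]) -> Vec<f32> {
    let max = x.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = x.iter().map(|v| (v - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

With an input such as `[100.0_f32, 50.0, 0.0]`, `softmax_naive` returns NaN for the first element because `exp(100.0)` is inf in `f32`, while `softmax_stable` returns finite probabilities that sum to approximately 1. Whether the Metal kernel fails for this exact reason on M3 is not confirmed in this thread.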

jorgeantonio21 commented 4 months ago

I pulled the main branch just now, and the problem persists.

Thank you so much @jafioti !

mikeseven commented 3 months ago

Yes, comment out SoftmaxCompiler in luminal_metal's lib.rs and the Phi (and Llama) examples will work on M3.

jafioti commented 3 months ago

@mikeseven Does it give proper outputs? In the other issue you mentioned it gives bad outputs

mikeseven commented 3 months ago

Sorry for the confusion. I meant to say that the output looks correct but not as good as with llama. It looks to me like a model accuracy issue.

jafioti commented 3 months ago

Ok I'll close this for now then, thanks