Open yieldthought opened 1 week ago
10/2 update:
What's next:
Llama3.2-11B-Vision bringup
Text model
Vision model
Running the new tests requires my llama-models changes, and I still need to figure out how to share them. You also have to install some new packages:
pip install -r ../llama-models/requirements.txt
No issues
Has bias, uses GELU as activation. Only two linears.
Very similar to Attention, but does not generate a KV cache! It's plain multi-head attention (MHA).
Not a great shape, though: ImageAttention: dim=1280, head_dim=80, n_heads=16
Also requires an attention mask, which means we need to support non-causal attention in SDPA.
Meta does something strange with qkvo replication that I don't understand: https://github.com/meta-llama/llama-models/blob/main/models/llama3/reference_impl/multimodal/model.py#L254
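For reference, a rough NumPy sketch of what the vision attention seems to need from SDPA: non-causal MHA with an additive mask and no cache, at the ImageAttention shape above. This is not Meta's code or the ttnn op. The name `masked_mha`, the random weights, and the 0 / -inf mask convention are assumptions for illustration, and projection biases and the qkvo replication are left out.

```python
import numpy as np

# Shape quoted above for ImageAttention.
DIM, N_HEADS, HEAD_DIM = 1280, 16, 80
assert N_HEADS * HEAD_DIM == DIM


def masked_mha(x, wq, wk, wv, wo, mask=None):
    """Non-causal multi-head attention with an optional additive mask.

    x:    (seq, DIM) activations
    w*:   (DIM, DIM) projection weights (hypothetical, bias omitted)
    mask: (seq, seq), 0.0 where attention is allowed, -inf where blocked.
          Every row must keep at least one allowed position.
    Nothing is cached: K and V are recomputed from x on each call.
    """
    seq = x.shape[0]

    def split_heads(t):
        # (seq, DIM) -> (heads, seq, head_dim)
        return t.reshape(seq, N_HEADS, HEAD_DIM).transpose(1, 0, 2)

    q, k, v = split_heads(x @ wq), split_heads(x @ wk), split_heads(x @ wv)

    # (heads, seq, seq) attention scores, no causal triangle applied.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(HEAD_DIM)
    if mask is not None:
        scores = scores + mask  # broadcasts over heads

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs = probs / probs.sum(axis=-1, keepdims=True)

    # Merge heads back to (seq, DIM) and apply the output projection.
    out = (probs @ v).transpose(1, 0, 2).reshape(seq, DIM)
    return out @ wo


rng = np.random.default_rng(0)
seq = 4
x = rng.standard_normal((seq, DIM)).astype(np.float32)
wq, wk, wv, wo = (
    rng.standard_normal((DIM, DIM)).astype(np.float32) * 0.02 for _ in range(4)
)

# Example mask: treat the last token as padding, so no query may attend to it.
mask = np.zeros((seq, seq), dtype=np.float32)
mask[:, -1] = -np.inf

out = masked_mha(x, wq, wk, wv, wo, mask)
print(out.shape)  # (4, 1280)
```

The mask here is an arbitrary (non-triangular) pattern, which is the case a causal-only SDPA kernel cannot express.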
Bring up Llama 3.2 model family on Wormhole, T3K and TG