Open zf0x00 opened 2 months ago
I am in the process of training a hypernetwork for Llama3!
Nice ❤️ Could you also share some details about training, such as how long it takes? I tried to train one myself, but most notebook environments don't support Python 3.11.
Here is the first version of a Llama3 hypernet: benjamin/zett-hypernetwork-Meta-Llama-3-8B-experimental.
It seems to underperform on code, though. I haven't found the reason yet but will look into it later, so I'm keeping this open.
Training took ~4 days on a TPUv4-32 pod.
Can this work with the regular Llama 3, or do I need to train a hypernetwork?