-
I experienced a distressing setback on January 5th, 2024, when I inadvertently clicked on an airdrop link in my Trust wallet, resulting in the theft of 10.7 Ethereum from my account. The shock and fru…
-
I am using a vLLM server to deploy an MoE model. The model has a large number of experts, but only a small number of them are activated, so it is very suitable for the expert offload…
-
Hi there, thanks for mergoo, an amazing code base for MoE model construction.
A crucial feature that may need to be implemented is that mergoo should let the user select the basic routing policy when c…
-
I like the innovative three-step approach to training the MLLMs. It intrigued me, and I went through the scripts trying to replicate the three-step training technique to train my own model. …
-
I had a brief look, seems interesting. However it looks like it only generates expert+ difficulty? So I'll have to play around with the settings to match other difficulties. I'm already training a mod…
-
This feature will optimize updates from the CDL Elements system by only processing user relations that have changed since the last update.
The new experts-client function will compare modification …
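As a rough illustration of the incremental idea described above, here is a minimal sketch that keeps only the records changed since the last update. It assumes each relation carries a modification timestamp; the field names `user` and `modified` and the helper `changed_relations` are hypothetical and are not taken from the CDL Elements system or the experts-client code.

```python
from datetime import datetime, timezone


def changed_relations(relations, last_update):
    """Return only the relations modified strictly after last_update.

    `relations` is a list of dicts with a hypothetical `modified` field
    holding a timezone-aware datetime.
    """
    return [r for r in relations if r["modified"] > last_update]


# Hypothetical sample data: one stale relation, one changed relation.
last_update = datetime(2024, 1, 1, tzinfo=timezone.utc)
relations = [
    {"user": "a", "modified": datetime(2023, 12, 31, tzinfo=timezone.utc)},
    {"user": "b", "modified": datetime(2024, 1, 2, tzinfo=timezone.utc)},
]

# Only relations changed since the last update are passed on for processing.
to_process = changed_relations(relations, last_update)
```

Filtering before processing keeps the work proportional to the number of changed relations rather than to the total number of relations.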
-
```
assert not args.model_parallel.fp16, \
    "Expert parallelism is not supported with fp16 training."
```
from https://github.com/NVIDIA/Megatron-LM/blob/db3a3f79d1cda60ea4b3db0ceffcf…
-
hi
-
Anthony Fauci