mikegao2020 closed 1 month ago
Please clarify where the aggregation happens at the "layer" level. Otherwise, this is quite similar to MoE. Thanks.
Aggregation happens in the aggregator. This is different from MoE because we treat each LLM/agent abstractly as an "expert", rather than treating the FFNs inside a single model as experts.
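To make the distinction concrete, here is a minimal sketch of one way layered aggregation over whole agents could look. It is not the project's actual code: `Agent`, `build_aggregation_prompt`, and `run_layers` are hypothetical names, the prompt wording is invented, and the assumption that each layer's agents see the previous layer's responses in their prompt is an illustrative reading, not something confirmed in this thread.

```python
from typing import Callable, List

# Assumption: an "agent" is any callable that maps a prompt string to a
# response string (for example, a thin wrapper around an LLM API call).
Agent = Callable[[str], str]


def build_aggregation_prompt(question: str, responses: List[str]) -> str:
    """Pack the question and earlier responses into one prompt (hypothetical format)."""
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(responses))
    return (
        f"Question: {question}\n"
        f"Previous responses:\n{numbered}\n"
        "Synthesize a single answer."
    )


def run_layers(question: str, layers: List[List[Agent]], aggregator: Agent) -> str:
    """Run each layer of agents in turn, then let the aggregator produce the final answer."""
    responses: List[str] = []
    for agents in layers:
        # The first layer sees only the question; later layers also see the
        # previous layer's responses. The "experts" are entire models, and
        # combination happens at the text level, not inside a network.
        prompt = question if not responses else build_aggregation_prompt(question, responses)
        responses = [agent(prompt) for agent in agents]
    # The final aggregation step is an ordinary agent call over the last layer's outputs.
    return aggregator(build_aggregation_prompt(question, responses))
```

In this sketch there is no learned gating network or token-level routing as in MoE: every agent in a layer is called, and the combination is done by prompting.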