microsoft / tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License

about compute_location and locations #201

Open adverbial03 opened 1 year ago

adverbial03 commented 1 year ago

Thanks for your excellent work on Tutel. I would like to know what the following function (in fast_dispatch.py) does:

def compute_sorted_location(x, importance_scores):
    sorted_x = x[importance_scores.argsort(dim=0)]
    sorted_cumsum = fast_cumsum_sub_one(sorted_x) * sorted_x
    return sorted_cumsum[importance_scores.argsort(dim=0).argsort(dim=0)]

I would also like to know the meaning of locations_s, which is a return value of the function extract_critical (also in fast_dispatch.py).

ghostplant commented 1 year ago

It stores the list of unique destination indices that the input tokens will be written to in the subsequent dispatch step.
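
To make the answer concrete, here is a minimal pure-Python sketch of what compute_sorted_location appears to compute. It is not Tutel's code: it assumes that x is a 0/1 token-by-expert mask, that fast_cumsum_sub_one behaves like a cumulative sum along dim 0 minus one, and that a token's entry in locations_s is the sum of its row of the result. The names cumsum_sub_one and compute_sorted_location_sketch are hypothetical.

```python
def cumsum_sub_one(rows):
    """Column-wise running sum minus one (assumed behaviour of fast_cumsum_sub_one)."""
    totals = [0] * len(rows[0])
    out = []
    for row in rows:
        totals = [t + v for t, v in zip(totals, row)]
        out.append([t - 1 for t in totals])
    return out


def compute_sorted_location_sketch(mask, importance_scores):
    """Assign each token a slot inside its expert, in ascending score order."""
    # Equivalent of importance_scores.argsort(dim=0): token ids, lowest score first.
    order = sorted(range(len(mask)), key=lambda i: importance_scores[i])
    sorted_mask = [mask[i] for i in order]

    # Running count per expert; multiplying by the mask zeroes unselected experts.
    counts = cumsum_sub_one(sorted_mask)
    slots = [[c * m for c, m in zip(c_row, m_row)]
             for c_row, m_row in zip(counts, sorted_mask)]

    # Equivalent of argsort().argsort(): rank[i] is where token i landed after sorting,
    # so indexing by rank restores the original token order.
    rank = [0] * len(order)
    for position, token in enumerate(order):
        rank[token] = position
    return [slots[rank[i]] for i in range(len(mask))]


# Four tokens, two experts; tokens 0, 2 and 3 chose expert 0, token 1 chose expert 1.
mask = [[1, 0], [0, 1], [1, 0], [1, 0]]
scores = [0.3, 0.1, -0.5, 0.0]

locations = compute_sorted_location_sketch(mask, scores)
locations_s = [sum(row) for row in locations]  # assumed reduction to one slot per token
print(locations)    # [[2, 0], [0, 0], [0, 0], [1, 0]]
print(locations_s)  # [2, 0, 0, 1]
```

In this example the three tokens routed to expert 0 receive slots 0, 1 and 2 in order of ascending score (token 2, then token 3, then token 0), and token 1 receives slot 0 of expert 1. Each (expert, slot) pair is therefore unique, which matches the description of unique destination indices above.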