microsoft / tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation
MIT License

Error: Exception: MoE JIT is designed to work on sample size = 800, while receiving sample size = 1600 (> 800) #96

Open Satan012 opened 2 years ago

ghostplant commented 2 years ago

You are expected to set an upper bound on the sample size when creating tutel.moe_layer(..). If it is not set, the upper bound defaults to the batch size seen the first time data is forwarded.
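
For context, here is a minimal sketch of how the upper bound gets pinned. The constructor arguments follow the example in the tutel README; exact names and the model/expert sizes here are illustrative assumptions and may differ across versions:

from tutel import moe as tutel_moe
import torch

# Hypothetical sizes, for illustration only.
moe = tutel_moe.moe_layer(
    gate_type={'type': 'top', 'k': 2},
    model_dim=1024,
    experts={'type': 'ffn', 'count_per_node': 2, 'hidden_size_per_expert': 2048},
)

moe(torch.randn(800, 1024))   # first forward: the upper bound is pinned to 800
moe(torch.randn(1600, 1024))  # raises the "sample size = 1600 (> 800)" exception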

To set the upper bound manually in your case, please add this line after the creation of tutel.moe_layer(..):

from torch.nn import Module
from tutel import moe as tutel_moe

class MyModel(Module):

  def __init__(self):
    super().__init__()
    ...
    self._moe_layer = tutel_moe.moe_layer(..)
    self._moe_layer.expected_sample_size = 1600  # Add this line to set the upper bound manually
    ...
ghostplant commented 2 years ago

In v0.1.5, expected_sample_size is decided automatically by the MoE layer, so you no longer need to set self._moe_layer.expected_sample_size = 1600 manually.
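
In other words, with v0.1.5 or later the earlier sketch can vary the sample size freely (same hypothetical layer as above, assuming the automatic sizing described here):

moe(torch.randn(800, 1024))
moe(torch.randn(1600, 1024))  # no longer raises; the bound is handled internally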