Open iuybw opened 1 month ago
Thank you for your question! All the experts in our model share the same architecture (including TIM); they differ only in their spatial dependency modeling methods. TIM and time-enhanced attention are proposed to improve temporal modeling, so they are used by all the experts.
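As a rough illustration of this design, here is a minimal sketch in plain Python. It is not the repository's code: `temporal_stub`, `identity_spatial`, `mean_spatial`, and the `Expert` class are hypothetical placeholders, and the running mean only stands in for the real temporal modeling (TIM and time-enhanced attention). The only point it shows is that every expert runs the same temporal step and differs only in its spatial step.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]  # layout assumed here: [timesteps][nodes]


def temporal_stub(x: Matrix) -> Matrix:
    """Hypothetical stand-in for the shared temporal modeling (e.g. TIM).

    Uses a running mean over time purely as a placeholder operation.
    """
    sums = [0.0] * len(x[0])
    out: Matrix = []
    for t, row in enumerate(x, start=1):
        sums = [s + v for s, v in zip(sums, row)]
        out.append([s / t for s in sums])
    return out


def identity_spatial(x: Matrix) -> Matrix:
    """Placeholder spatial step: no information is mixed across nodes."""
    return [row[:] for row in x]


def mean_spatial(x: Matrix) -> Matrix:
    """Placeholder spatial step: every node receives the mean over all nodes."""
    return [[sum(row) / len(row)] * len(row) for row in x]


class Expert:
    """All experts share the temporal step; only the spatial step differs."""

    def __init__(self, spatial: Callable[[Matrix], Matrix]) -> None:
        self.spatial = spatial

    def __call__(self, x: Matrix) -> Matrix:
        # The shared temporal step is applied first, for every expert.
        return self.spatial(temporal_stub(x))


# Expert names follow the thread; the spatial stubs assigned to them are
# illustrative assumptions, not the actual spatial modules.
experts: Dict[str, Expert] = {
    "identity_expert": Expert(identity_spatial),
    "adaptive_expert": Expert(mean_spatial),
    "attention_expert": Expert(mean_spatial),
}
```

Swapping the spatial callable changes an expert's behavior across nodes, while the temporal processing of the history timesteps stays the same for all of them.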
Thank you for the open-source code and the inspiring paper. I have a question that I hope you can answer: why is it not necessary for the other two experts, adaptive_expert and attention_expert, to use TIM for the history timesteps in the same way as identity_expert?