Closed: nevoliu closed this issue 1 year ago
I would say yes and no. The datasets used in this study are considerably smaller than typical NLP or CV datasets, so a larger model will not automatically give better results. In fact, if you stack more encoder layers in transformer models such as Transformer, Autoformer, or FEDformer, the results usually get worse. This is why we have not seen any time series paper with small, medium, and large model variants the way language models have. However, if you have a very large and complex time series dataset in hand, comparable to those used for language models, and are trying to train a general pre-trained model, such concerns would make sense.
Tian
On Tue, Mar 28, 2023 at 1:34 PM nevo liu @.***> wrote:
Thank you very much for your contribution, which has been a great help to me.
When I ran your code, I found that the number of parameters for FEDformer is 100,772,687, but the number of parameters for the comparison models Autoformer, Informer, and Transformer is only 7000+. Doesn't this defeat the point of the comparison?
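(Editorial note on the reply above: the following is a minimal sketch, not code from the FEDformer repo, of how the trainable parameter count of a plain Transformer encoder grows as more encoder layers are stacked. The `d_model`, `n_heads`, and `d_ff` values are assumptions chosen only for illustration.)

```python
# Minimal sketch (not the FEDformer repo code): how the trainable parameter
# count of a vanilla Transformer encoder scales with the number of stacked
# encoder layers. The dimensions below are illustrative assumptions.
import torch.nn as nn

d_model, n_heads, d_ff = 512, 8, 2048  # assumed, typical forecasting-paper sizes

for e_layers in (1, 2, 4, 8):
    layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                       dim_feedforward=d_ff)
    encoder = nn.TransformerEncoder(layer, num_layers=e_layers)
    n_params = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
    print(f"e_layers={e_layers}: {n_params:,} trainable parameters")
```

The count grows roughly linearly with the number of encoder layers, which is why the "bigger model" question ultimately comes down to whether the dataset is large enough to benefit from the extra capacity.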
Thank you very much for your reply; it solved my problem.
Thank you very much for your contribution, which has been a great help to me.
When I ran your code, I found that the number of parameters for FEDformer is 16,303,127, but the number of parameters for the comparison models Autoformer, Informer, and Transformer is only around 10,000,000+. Doesn't this defeat the point of the comparison?
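(For reference, a minimal, hedged sketch of one common way to reproduce parameter counts like the ones quoted in this thread. The `count_parameters` helper and the `Exp_Main` usage in the comments are assumptions for illustration, not part of the FEDformer repo's documented API.)

```python
# Hedged sketch: counting trainable parameters of an already-built PyTorch model.
# `count_parameters` is our own helper name; `model` stands for any model
# instantiated from this repo (e.g. FEDformer, Autoformer, Informer, Transformer).
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Return the total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Assumed usage, following the repo's experiment workflow:
# exp = Exp_Main(args)                  # args built as in run.py
# print(count_parameters(exp.model))    # compare FEDformer vs. the baselines
```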