yuqinie98 / PatchTST

An official implementation of PatchTST: "A Time Series is Worth 64 Words: Long-term Forecasting with Transformers." (ICLR 2023) https://arxiv.org/abs/2211.14730
Apache License 2.0
1.37k stars, 248 forks

Implementation Issues Regarding Channel Independence #72

Status: Closed (linbingkong closed this issue 10 months ago)

linbingkong commented 10 months ago

Based on some closed issues and my own debugging, I understand channel independence as a reshape applied to the features before they are fed into the encoder, i.e. reshaping the 4D tensor from B × M × P × N to (B · M) × P × N. Am I right? If not, could you explain in detail how channel independence is implemented at the code level?

yuqinie98 commented 10 months ago

Hi, thanks for the question. We mention this in Appendix A.1.5.

linbingkong commented 10 months ago

Thank you for your reply. Appendix A.1.5 explains that the patching operator generates a 4D tensor of size B × M × P × N, which represents a batch of $x_p^{(i)} \in \mathbb{R}^{P \times N}$ in M series. Is this what channel independence means? [image]
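To make the shapes in this question concrete, here is a minimal NumPy sketch of a patching step that turns a batch of shape B × M × L into B × M × P × N. It is an illustration only, not the repository's code: the function name `make_patches`, the look-back length `L`, and the `stride` argument are assumptions, and the end padding described in the paper is omitted for brevity.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_patches(x, patch_len, stride):
    """Split each series of x (shape B x M x L) into patches.

    Returns an array of shape B x M x P x N, where P = patch_len and
    N = (L - P) // stride + 1. End padding is omitted in this sketch.
    """
    # All length-P windows along the time axis: B x M x (L - P + 1) x P
    windows = sliding_window_view(x, window_shape=patch_len, axis=-1)
    # Keep every `stride`-th window: B x M x N x P
    windows = windows[:, :, ::stride, :]
    # Reorder to B x M x P x N to match the notation used in this thread
    return windows.transpose(0, 1, 3, 2)


B, M, L = 2, 3, 16
x = np.arange(B * M * L, dtype=np.float32).reshape(B, M, L)
patches = make_patches(x, patch_len=4, stride=2)
print(patches.shape)  # (2, 3, 4, 7)
```

At this point the M series still sit on their own axis, so patching by itself only organizes the data per series.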

yuqinie98 commented 10 months ago

Hi, thanks for asking this. Actually, here we reshape the 4D tensor of size B × M × P × N into a 3D tensor of size (B · M) × P × N. This treats each of the M series independently, as a separate sample in the batch. So (B · M) is the effective "batch size". This is how we implement channel independence.
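A minimal NumPy sketch of this reshape follows. It is an illustration under stated assumptions rather than the repository's code: `toy_encoder` and its weight matrix `W` are hypothetical stand-ins for a shared-weight backbone that processes each sample in the batch on its own. In PyTorch the same reshape can be written as `x.reshape(B * M, P, N)`.

```python
import numpy as np

B, M, P, N = 2, 3, 4, 7
rng = np.random.default_rng(0)
x = rng.standard_normal((B, M, P, N))

# Fold the channel axis into the batch axis: (B * M) x P x N.
# Series i of sample b ends up at row b * M + i.
flat = x.reshape(B * M, P, N)
assert np.array_equal(flat[1 * M + 2], x[1, 2])

# Hypothetical stand-in for the backbone: one shared weight matrix,
# applied to each row of the folded batch separately.
D = 5
W = rng.standard_normal((D, P))


def toy_encoder(z):
    # z: S x P x N  ->  S x D x N, no mixing across the S axis
    return np.einsum("dp,spn->sdn", W, z)


out = toy_encoder(flat).reshape(B, M, D, N)

# Perturb channel 0 only; outputs of the other channels are unchanged.
x2 = x.copy()
x2[:, 0] += 1.0
out2 = toy_encoder(x2.reshape(B * M, P, N)).reshape(B, M, D, N)
assert np.allclose(out[:, 1:], out2[:, 1:])
assert not np.allclose(out[:, 0], out2[:, 0])
```

Because the encoder never mixes rows of the folded batch, no information flows between channels, while all channels still share the same weights.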

linbingkong commented 10 months ago

Thank you for your answer.