Closed: linbingkong closed this issue 10 months ago
Hi, thanks for the question. We mention this in Appendix A.1.5.
Thank you for your reply. Appendix A.1.5 explains that the patching operator generates a 4D tensor of size B × M × P × N, which represents a batch of x_p^(i) ∈ ℝ^(P × N) for the M series. Is this what channel independence means?
Hi, thanks for asking this. Actually, here we reshape the 4D tensor of size B × M × P × N into a 3D tensor of size (B·M) × P × N. This treats each of the M series as an independent sample in the batch, so B·M is the effective "batch size". This is how we implement channel independence.
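For illustration, the reshape described above can be sketched in a few lines. This is a minimal NumPy sketch under stated assumptions, not the repository's actual code: the function names `merge_channels_into_batch` and `split_channels_from_batch` and the toy sizes are hypothetical, chosen only to show how the M axis is folded into the batch axis.

```python
import numpy as np


def merge_channels_into_batch(x):
    """Fold the series axis M into the batch axis: (B, M, P, N) -> (B*M, P, N).

    Hypothetical helper for illustration only.
    """
    B, M, P, N = x.shape
    return x.reshape(B * M, P, N)


def split_channels_from_batch(z, n_series):
    """Undo the fold: (B*M, ...) -> (B, M, ...). Hypothetical helper."""
    return z.reshape(-1, n_series, *z.shape[1:])


# Toy sizes (assumed): batch B, series M, patch length P, number of patches N.
B, M, P, N = 2, 3, 4, 5
x = np.arange(B * M * P * N, dtype=np.float32).reshape(B, M, P, N)

# Each of the B*M rows is now a separate "sample": a model applied to z
# sees one series at a time and cannot mix information across series.
z = merge_channels_into_batch(x)

# With row-major layout, row b*M + m of z is series m of sample b.
restored = split_channels_from_batch(z, M)
```

Because the reshape is row-major, the series of one sample stay adjacent in the merged batch, so the output can be unfolded back to (B, M, ...) after the model. A `torch.reshape` call on a tensor with the same shape follows the same row-major ordering.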
Thank you for your answer
Based on some closed issues and my own debugging, I understand channel independence as processing each feature (channel) separately before it is fed into the encoder, for example by reshaping the 4D tensor from B × M × P × N to (B·M) × P × N. Am I right? If not, could you explain in detail how channel independence is implemented at the code level?