RomGai / BrainVis

Official code repository for the paper: "BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction"
https://brainvis-projectpage.github.io/
MIT License

Some confusion about the training of TimeEncoder & FreEncoder #9

Closed ZiyiTsang closed 1 month ago

ZiyiTsang commented 1 month ago

Hi there. To begin with, your paper and code are really impressive to me. I have read your paper in full and am now running the code you provided. However, I have some confusion about the self-training of timeEncoder in your code. [image]

  1. I don't really understand the idea of the "tokenizer" part inside the timeEncoder (i.e., codebook-based semantic unit classification). Could you kindly share the paper where the idea comes from?
  2. What is the reason for using a moving average in the regression? I guess the idea comes from MoCo, which uses a momentum encoder for contrastive learning. But your paper does not seem to use contrastive learning, right? From my point of view, the momentum encoder is unnecessary in your code: since fm and fv should lie in the same representation space, I guess it would be better to use a single shared transformer block to generate both fm and fv (i.e., delete the momentum encoder).

I hope you can clarify. Thanks for your contribution to our community.

RomGai commented 1 month ago

Thanks for your interest in our work.

  1. Please refer to the last link of the Broader Information section in README.md.

  2. Since the output of the transformer blocks for masked units is used as supervision for latent masked reconstruction, we want that output to represent signal features stably, without fluctuating much after just a few update steps during training. The moving average is therefore crucial: it lets the transformer blocks for masked units be more influenced by historical gradients, leading to stable but continuously improving EEG feature representations over long pre-training periods.
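The stabilizing effect described in point 2 can be illustrated with a minimal, framework-agnostic sketch of an exponential moving average (EMA) update. This is not the repository's implementation: the function name `ema_update`, the `momentum` value of 0.99, and the toy scalar "weights" are all assumptions made for illustration. The online weight is made to jump between +1 and -1 at every step, while the EMA copy that would provide the supervision target barely moves.

```python
def ema_update(target, online, momentum=0.99):
    """Move each target parameter a small step toward its online counterpart.

    `target` and `online` are dicts of floats standing in for weight tensors
    (hypothetical toy setup, not the BrainVis code).
    """
    for name, value in online.items():
        target[name] = momentum * target[name] + (1.0 - momentum) * value


# Toy "parameters": one scalar weight per network.
online = {"w": 0.0}
target = {"w": 0.0}

history = []
for step in range(100):
    # Pretend the optimizer makes the online weight oscillate strongly.
    online["w"] = 1.0 if step % 2 == 0 else -1.0
    ema_update(target, online, momentum=0.99)
    history.append(target["w"])

# The online weight swings by 2.0 between steps; the EMA copy swings far less.
max_target_swing = max(history) - min(history)
print(f"online swing: 2.0, target swing: {max_target_swing:.4f}")
```

A larger `momentum` makes the target change more slowly; with `momentum=0` the target would simply copy the online weights, which is the case where the supervision signal fluctuates with every update.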

ZiyiTsang commented 1 month ago

That makes sense. Thanks for your answer, and I look forward to your future work!

ZiyiTsang commented 1 month ago

I would like to ask one more question. When I train the freEncoder, the test accuracy only reaches around 0.2-0.3. Although that is much better than a random classifier (0.025 for 40-class classification), it is still very low. Is this normal? Will it affect the image generation later? [image]
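For reference, the chance-level baseline for this comparison can be worked out in a few lines. The sketch below assumes the 40 classes are balanced and that a random classifier guesses uniformly; the helper name `chance_accuracy` is hypothetical, and the 0.2 and 0.3 values are the accuracy range reported in the question.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing on balanced classes."""
    return 1.0 / num_classes


baseline = chance_accuracy(40)

# Accuracy range reported in the question above.
reported_low, reported_high = 0.2, 0.3

print(f"chance level: {baseline:.3f}")
print(f"reported range relative to chance: "
      f"{reported_low / baseline:.1f}x to {reported_high / baseline:.1f}x")
```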

RomGai commented 1 month ago

Yes, this is normal. The frequency branch is just a complement to the features. The next stage is joint fine-tuning with the TimeEncoder, which produces the final EEG embedding. I expect the accuracy to be different at that stage.

BTW, we will upload a new version of the paper as soon as possible to correct some issues in the current version on arXiv and optimize the experimental content. The issues you encountered were not highlighted as a focus of ablation analysis in the arXiv version, and we apologize for any misunderstanding this may have caused.

ZiyiTsang commented 1 month ago

Thanks for your kind answer!