RomGai / BrainVis

Official code repository for the paper: "BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction"
https://brainvis-projectpage.github.io/
MIT License

RuntimeError: Tensors must have same number of dimensions: got 2 and 3 #3

Open OKWELLHELLO opened 7 months ago

OKWELLHELLO commented 7 months ago
[screenshot of the error traceback]

So, what should I do?

RomGai commented 7 months ago

Thanks for your interest in our work. I checked the code and retrained the freq_encoder. With a batch_size of 1, I get a vector of size (1, 128) instead of (1, 220, 128). Please confirm that the code in "BrainVisModels.py" you are using is still consistent with what is provided on GitHub. The second output of the freq model, "freq_feature", should match the result of "F.relu(self.output(x))" at line 268 of the original "BrainVisModels.py".

Please consider replacing the "FreqEncoder" in your current "BrainVisModels.py" with the original code from the GitHub repository and trying again. I think there may be no need to retrain; it could just be an issue with the output.
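If it helps, here is a minimal shape check you could run after swapping the class back in. This is only a sketch: it assumes you already have a constructed freq_encoder and a sample EEG batch, and that "freq_feature" is the second output, as described above.

```python
# Minimal sketch: verify that the second output of the freq model ("freq_feature")
# is a 2-D (batch, 128) tensor rather than (batch, 220, 128).
import torch

def check_freq_feature(freq_encoder, sample_eeg):
    freq_encoder.eval()
    with torch.no_grad():
        outputs = freq_encoder(sample_eeg)
    freq_feature = outputs[1]  # second output of the freq model
    print("freq_feature shape:", tuple(freq_feature.shape))
    if freq_feature.dim() != 2:
        print("Still 3-D: the FreqEncoder likely does not match the GitHub version.")
```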

If the problem still cannot be resolved, please send the current "BrainVisModels.py" you are using to me via email.

OKWELLHELLO commented 7 months ago

Thank you for your solution. That problem has been solved, but when I tried to generate images, I found another problem: the generated images are very strange. I don't know what's going on. Do I need to retrain all the models?

Like this: [screenshots of the generated images attached]

RomGai commented 7 months ago

This seems to be related to the training of the model, and I offer several possibilities as follows:

1. The training batch_size is too small, or other hyperparameters are not set correctly, which prevents the model from learning the EEG features or aligning with CLIP well. Check that the configs in your "args.py" are consistent with the code on GitHub.

2. Since the pipeline is trained in stages, please check that every step was executed correctly, i.e. that no module was left untrained or stopped early (for example, the Alignment Module). Refer to steps 4, 5, and 6 of the "Train the model" section in "README.md", and make sure that both "trainer.finetune_timefreq()" and "trainer.finetune_CLIP()" were executed and that the model was successfully optimized (see the sketch after this list).

3. There may have been changes to your local model; please make sure it is consistent with "BrainVisModels.py" on GitHub.
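For point 2, a quick way to verify that each stage actually finished and saved its weights is to list the checkpoints before moving on to generation. This is a rough sketch: the directory and filename patterns are assumptions based on the model names mentioned in this thread, so adjust them to wherever your "main.py" saves models.

```python
# Sanity check: confirm each training stage left a checkpoint on disk.
import glob
import os

CKPT_DIR = "checkpoints"  # placeholder; use your actual save directory
STAGES = {
    "time/frequency alignment": "*timefreq*",
    "CLIP fine-tuning": "*clipfinetune*",
}

for stage, pattern in STAGES.items():
    matches = sorted(glob.glob(os.path.join(CKPT_DIR, pattern)))
    if matches:
        print(f"{stage}: latest checkpoint -> {matches[-1]}")
    else:
        print(f"{stage}: no checkpoint found, rerun this stage")
```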

OKWELLHELLO commented 7 months ago

I downloaded the new code and retrained everything according to the README, but the results were the same as before.

OKWELLHELLO commented 7 months ago

I don't know at which step I went wrong. [screenshot attached]

RomGai commented 7 months ago

Try using the models named “clipfinetune_model_epoch_xxxxxxxx" when you load the trained model at line 193 of "cascade_diffusion.py".
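For reference, the swap might look roughly like this. It is only an illustrative sketch: whether the checkpoint is a state dict or a full pickled model depends on how your copy of the code saves it, and "clipfinetune_model_epoch200.pkl" is just one example of an epoch-tagged file.

```python
import torch

def load_clipfinetune_checkpoint(model, ckpt_path="clipfinetune_model_epoch200.pkl"):
    """Load an epoch-tagged CLIP-finetuned checkpoint into the model that
    cascade_diffusion.py builds. The default path is an example only."""
    ckpt = torch.load(ckpt_path, map_location="cpu")
    if isinstance(ckpt, dict):      # saved as a state dict
        model.load_state_dict(ckpt)
        return model
    return ckpt                     # saved as a full pickled model object
```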

RomGai commented 7 months ago

If you find it inconvenient to make the changes yourself, I've modified the default settings in "process.py" and "cascade_diffusion.py". Please download them again. Execute finetune_CLIP() for 200 epochs in "main.py", and then run "cascade_diffusion.py" once more.

I'm sorry for the differences between our code's default settings and those described in the paper. The default settings save a wide variety of models, including those used in the paper, so generating images may require some manual configuration after you've read the paper. Please understand that the broader model saving is intended to facilitate broader exploration.

OKWELLHELLO commented 7 months ago

Thanks for your reply. I will download the CLIP and BLIP2 models again, and then retrain all the BrainVis models to see if that solves the problem.

OKWELLHELLO commented 7 months ago

If the result is still wrong, I will download the new code and try again.

RomGai commented 7 months ago

I think the issue may not be with CLIP or BLIP2, so I deleted my previous reply. You can try the new code first; it is much easier, and I think it will work.

RomGai commented 7 months ago

You just need to replace the old "process.py" and "cascade_diffusion.py", execute finetune_CLIP() for 200 epochs in "main.py", and then run "cascade_diffusion.py" once more.

OKWELLHELLO commented 7 months ago

[screenshots of the newly generated images]

The situation is a bit better than before, but I still feel that the generated images are very strange.

OKWELLHELLO commented 7 months ago

The images generated by the model also have significant differences from the real images of the corresponding category.

RomGai commented 7 months ago

I've uploaded the new “cascade_diffusion.py”. Indeed, there were numerical bugs in the previous version; thank you for your feedback. The current version is OK.

Additionally, category errors are possible: as described in the paper regarding classification accuracy, your results should roughly match your classification accuracy. This needs to be evaluated across the entire test set, so you can generate more images and observe them; correct and incorrect results may not be uniformly distributed.
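If you want to quantify this, something along these lines could compare predicted categories against ground-truth labels over the whole test set. It is a sketch only: the classifier, loader, and label names are placeholders, not the repository's exact API.

```python
# Rough sketch: measure how often the predicted category matches the ground truth
# across the full test set. All names here are placeholders for your own objects.
import torch

def test_set_accuracy(classifier, test_loader, device="cpu"):
    classifier.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for eeg, labels in test_loader:
            logits = classifier(eeg.to(device))
            preds = logits.argmax(dim=-1).cpu()
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / max(total, 1)
```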

OKWELLHELLO commented 7 months ago

Thank you for providing the solution. The model is now running normally, but I believe that using the 'clipfinetune_model.pkl' model achieves better results than using 'clipfinetune_model_epoch200.pkl'. In addition, I hope the author can put the complete virtual environment, including package versions, into env.yaml, because I ran into various version mismatch issues while installing packages and had to try many versions of a package to see which was compatible with the others. This is really troublesome.

OKWELLHELLO commented 7 months ago

Your research is very meaningful; I'm looking forward to your next work.

RomGai commented 7 months ago

Thank you for your advice and recognition of our work. Everyone in the community makes it better. I will confirm and update the available package version information as soon as possible.