The research papers mentioned above^1 use the following two-stage generative scheme to reconstruct images:
- **Low-level (perceptual):** maps brain signals to the embedding space of Stable Diffusion's VAE. The predicted embedding is then passed through the VAE decoder to produce a "low-level" reconstruction of the image (see the sketch after this list).
- **High-level (semantic):** maps brain signals to the CLIP image space. The predicted embedding, together with the "low-level" reconstruction, is then fed through a pretrained img2img model to produce the final image.
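To make the low-level stage concrete, here is a minimal sketch assuming Stable Diffusion v1.5 loaded through Hugging Face `diffusers`; `pred_latents` is a hypothetical stand-in for the output of your brain-to-latent regressor, and the checkpoint name is an assumption, not prescribed by the task:

```python
import torch
from diffusers import AutoencoderKL

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed checkpoint; use whichever SD version the task specifies
vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
).to(device)

# Hypothetical brain-decoded latents; replace with your model's predictions
pred_latents = torch.randn(1, 4, 64, 64, device=device)

with torch.no_grad():
    # If your latents live in the scaled diffusion space, undo the scaling first
    low_level = vae.decode(pred_latents / vae.config.scaling_factor).sample
# low_level: a (1, 3, 512, 512) tensor in [-1, 1], the "low-level" reconstruction
```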
In this task, you will need to load the SD VAE and the img2img model and run inference with them in a Jupyter notebook.
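For the high-level stage, one reasonable option (though not necessarily the papers' exact setup) is `diffusers`' `StableDiffusionImg2ImgPipeline`: pass the low-level reconstruction as the init image and the brain-predicted embeddings via `prompt_embeds`. Note that the SD v1.x UNet expects conditioning of shape `(batch, 77, 768)`, so your CLIP-space predictions must be mapped to that layout. Continuing from the snippet above:

```python
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
).to(device)

# Hypothetical brain-decoded CLIP embeddings in the layout the UNet expects
pred_embeds = torch.randn(1, 77, 768, device=device)

result = pipe(
    prompt_embeds=pred_embeds,  # replaces the usual text conditioning
    image=low_level,            # "low-level" reconstruction as the init image
    strength=0.8,               # how far img2img may deviate from the init image
    guidance_scale=7.5,
).images[0]
result.save("reconstruction.png")
```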
> [!IMPORTANT]
> Pay attention to the dimensions of the VAE latents and the CLIP embeddings: knowing them is essential for implementing the method correctly.
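For orientation, here is one way to check those dimensions directly; the shapes shown assume SD v1.x at 512×512 resolution and CLIP ViT-L/14, so verify them against the checkpoints you actually use:

```python
import torch
from diffusers import AutoencoderKL
from transformers import CLIPVisionModel

vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
)
with torch.no_grad():
    latents = vae.encode(torch.randn(1, 3, 512, 512)).latent_dist.mean
print(latents.shape)  # torch.Size([1, 4, 64, 64]); the VAE downsamples by a factor of 8

clip = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")
with torch.no_grad():
    out = clip(pixel_values=torch.randn(1, 3, 224, 224))
print(out.last_hidden_state.shape)  # torch.Size([1, 257, 1024]): 256 patch tokens + CLS
print(out.pooler_output.shape)      # torch.Size([1, 1024]), before CLIP's 768-d projection
```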