Encoder - Githubissues

DarthReca commented 2 years ago

Link to a survey of various models: https://arxiv.org/abs/2205.10587

giuliadascenzi commented 2 years ago

Analyze the performance when the classic encoder-decoder component is replaced with a U-Net model (i.e., adding skip connections between layers).

From: Image-to-Image Translation with Conditional Adversarial Networks:

Many previous solutions to problems in this area have used an encoder-decoder network. In such a network, the input is passed through a series of layers that progressively downsample, until a bottleneck layer, at which point the process is reversed (Figure 3). Such a network requires that all information flow pass through all the layers, including the bottleneck. For many image translation problems, there is a great deal of low-level information shared between the input and output, and it would be desirable to shuttle this information directly across the net. For example, in the case of image colorization, the input and output share the location of prominent edges. To give the generator a means to circumvent the bottleneck for information like this, we add skip connections, following the general shape of a “U-Net” [34] (Figure 3). Specifically, we add skip connections between each layer i and layer n − i, where n is the total number of layers. Each skip connection simply concatenates all channels at layer i with those at layer n − i.
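To make the skip-connection idea in the quote concrete, here is a minimal pure-Python sketch on toy 1-D "feature maps". Everything in it (`downsample`, `upsample`, `unet_forward`, and the averaging/repeating used in place of learned convolutions) is a hypothetical stand-in chosen for illustration, not code from the paper or from this repo:

```python
# Toy sketch: a feature map is a list of channels, each channel a list of floats.
# Averaging and repeating stand in for learned (transposed) convolutions.

def downsample(fmap):
    """Halve the spatial length by averaging adjacent pairs."""
    return [[(ch[i] + ch[i + 1]) / 2 for i in range(0, len(ch) - 1, 2)]
            for ch in fmap]

def upsample(fmap):
    """Double the spatial length by repeating each value."""
    return [[v for v in ch for _ in range(2)] for ch in fmap]

def unet_forward(x, depth):
    """Run `depth` down steps, then `depth` up steps with skip connections."""
    skips = []
    h = x
    for _ in range(depth):
        skips.append(h)       # keep the encoder activation at this resolution
        h = downsample(h)
    # h is now the bottleneck representation
    for _ in range(depth):
        h = upsample(h)
        # Skip connection: concatenate the channels of encoder layer i
        # with those of the matching decoder layer n - i.
        h = skips.pop() + h
    return h

x = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]]
out = unet_forward(x, depth=2)
print(len(out), len(out[0]))  # number of channels, spatial length
```

The first output channel is the untouched input (it bypassed the bottleneck), while the last one only carries what survived two rounds of downsampling. In a real U-Net a convolution would follow each concatenation to mix the channels and bring their number back down.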

Note also a later implementation that follows AttGAN (with a similar architecture) and modifies the encoder-decoder structure by adding connections between layers: Adversarially Regularized U-Net-based GANs for Facial Attribute Modification and Generation

It is also mentioned in this survey: Survey Paper on Person Attribute Changes using GANS

This model also replaces the encoder-decoder with a U-Net: MU-GAN: Facial Attribute Editing based on Multi-attention mechanism

giuliadascenzi commented 2 years ago

Possible implementations:

patriziodegirolamo commented 2 years ago

We could analyze how attention improves performance. In particular, using attention in the encoder + decoder architecture gives better results, because the edit is focused only on the details we want to change. Their paper seems to me the clearest one I have read, and it also includes images that help you understand the whole process. PA-GAN: https://arxiv.org/abs/2007.05892 PDF: https://arxiv.org/pdf/2007.05892.pdf code: https://github.com/LynnHo/PA-GAN-Tensorflow
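As a rough illustration of "focusing only on the details we want to change", here is a minimal sketch of a generic attention-mask blend. This is an assumption made for illustration, not the actual PA-GAN formulation (see the linked paper and code for that); `attention_blend` and the toy values are hypothetical:

```python
def attention_blend(original, edited, mask):
    """Blend per element: mask 1.0 takes the edited value, mask 0.0 keeps the original."""
    if not (len(original) == len(edited) == len(mask)):
        raise ValueError("original, edited and mask must have the same length")
    return [m * e + (1.0 - m) * o for o, e, m in zip(original, edited, mask)]

original = [0.2, 0.4, 0.6, 0.8]   # toy feature values (or pixels)
edited = [0.9, 0.9, 0.9, 0.9]     # what the attribute editor proposes
mask = [0.0, 0.0, 1.0, 1.0]       # attention: edit only the last two positions

blended = attention_blend(original, edited, mask)
print(blended)
```

Positions where the mask is 0 come out identical to the input, so attribute-irrelevant regions are preserved by construction. In a trained model the mask would be predicted by the network and take soft values between 0 and 1.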

DarthReca / mlinapp-project

Encoder #7