Closed · Kafka2122 closed this issue 3 months ago
Hi, since joint training is specifically for sparse-to-dense generation, we only do unconditional generation here.
OK, so if unconditional generation works, shouldn't your model also be able to do conditional generation? The paper mentions that for conditional generation we only need to provide partial codeblock indices, and the transformer will predict the full scene.
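For reference, here is a minimal sketch of what that conditional sampling loop could look like, assuming an autoregressive prior over codebook indices. Everything here is hypothetical, not taken from this repo: `conditional_generate`, the `known` dict of fixed positions, and the uniform stand-in prior (a real run would use the trained transformer's logits).

```python
import numpy as np

def conditional_generate(prior_logits_fn, known, seq_len, vocab_size, rng):
    """Fill unknown codebook positions autoregressively while keeping
    the given partial codeblock indices fixed (hypothetical sketch)."""
    codes = np.full(seq_len, -1)           # -1 marks positions to be predicted
    for pos, idx in known.items():
        codes[pos] = idx                   # pin the conditioning indices
    for pos in range(seq_len):
        if codes[pos] != -1:
            continue                       # conditioning index stays fixed
        logits = prior_logits_fn(codes[:pos])   # logits over the codebook
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        codes[pos] = rng.choice(vocab_size, p=probs)  # sample the missing index
    return codes

# Stand-in prior: uniform logits over a 16-entry codebook.
uniform_prior = lambda prefix: np.zeros(16)
scene = conditional_generate(uniform_prior, known={0: 3, 5: 7},
                             seq_len=8, vocab_size=16,
                             rng=np.random.default_rng(0))
```

With a trained prior, the sampled `codes` would then be decoded by the VQ-VAE decoder into the full scene.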
The paper emphasizes joint training of the sparse encoder and the dense VQ-VAE to optimize the codebook and improve generalization. But joint training hasn't been done in this code, right? Is there a reason for that?