[CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades the conventional diffusion model with an additional semantic prior.
Hello! I have a question: since the diffusion model generates sentences starting from random noise, the generated sentences should be diverse. Have you conducted any experiments on diversity?