nicetomeetu21 / CA-LDM

Official implementation of the paper "Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models".

multi-slice decoder output #1

Open wuyouliaoxi opened 3 weeks ago

wuyouliaoxi commented 3 weeks ago

Dear respected author,

Thank you for releasing this wonderful code. I have one point of confusion. The paper says to "employ a multi-slice decoder that takes k consecutive latent slices as input and outputs a center high-resolution image slice of size (H, W)". But in the code at https://github.com/nicetomeetu21/CA-LDM/blob/91e011153b8e79841f9c7388b56155d3b176084a/train_NHVQVAE.py#L222 the decoder seems to output a tensor of shape (k, 1, H, W) rather than a center slice of shape (1, 1, H, W), and the same holds for the 3D adaptor. Am I misunderstanding something? Could you explain this?

wuyouliaoxi commented 3 weeks ago

Following up: I found that generation is done slice by slice during inference.
https://github.com/nicetomeetu21/CA-LDM/blob/91e011153b8e79841f9c7388b56155d3b176084a/test_decodebyMSDecoder.py#L68C1-L81C47 However, why is the resulting shape (400, 600, 400) instead of (512, 512, 512)?

nicetomeetu21 commented 2 weeks ago

Thanks for your comment. We train the NHVAE using a 2D decoder, and separately train the multi-slice decoder with trainable adaptors. At inference, we use only the MS decoder. The resulting shape is a mistake, left over from the numerous model versions we trained during design. We think that this does not affect the logic of the algorithm and is easy to correct. We are happy to resolve any other questions or errors.
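
For readers following the thread, here is a minimal sketch of slice-by-slice decoding as described above: each output slice is produced from a window of k consecutive latent slices centered on it. It is not the repository's code. The names `window_indices`, `decode_volume`, and `decode_window` are hypothetical. Clamping the window at the volume edges, and having one latent slice per output slice, are assumptions made only for illustration.

```python
def window_indices(center, depth, k):
    """Return the k consecutive slice indices centered on `center`.

    Assumption: indices falling outside [0, depth - 1] are clamped to the
    nearest valid slice (edge replication). The real code may pad differently.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a center slice exists")
    half = k // 2
    return [min(max(center + offset, 0), depth - 1)
            for offset in range(-half, half + 1)]


def decode_volume(latent_slices, decode_window, k=3):
    """Decode a volume one slice at a time.

    `decode_window` is a hypothetical callable standing in for the
    multi-slice decoder: it takes a list of k latent slices and returns
    the single decoded center slice.
    """
    depth = len(latent_slices)
    volume = []
    for center in range(depth):
        window = [latent_slices[i] for i in window_indices(center, depth, k)]
        volume.append(decode_window(window))
    return volume


# Toy stand-in for the decoder: it simply returns the center element.
def toy_decoder(window):
    return window[len(window) // 2]


decoded = decode_volume(list(range(5)), toy_decoder, k=3)
```

With this layout the decoder is called once per output slice, so the depth of the result equals the number of latent slices passed in, whatever k is.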