mlpc-ucsd / Patch-DM

Code Release for Patch-DM (ICLR 2024)

Border artifacts #3

Open Tianyangg opened 10 months ago

Tianyangg commented 10 months ago

Hello, thank you for presenting your work. I have a question about border artifacts during training.

  • My understanding is that the grid-like artifact gradually gets better over training iterations but remains obvious in the early training stage. Was that the case in your experiments?
  • And do you think it would help to constrain the consistency between the feature collage output (output shift) and the image collage output (output nonshift)?

zh-ding commented 10 months ago

Hi, thanks for the questions.

  1. Yes, that's correct. During training, the border artifacts will gradually decrease.
  2. Sorry, I may not fully understand your question. By the image collage output (output nonshift), do you mean the output that does not use the feature collage mechanism?

Tianyangg commented 10 months ago

Thanks for your reply XD So the return contains two outputs: pred1 is the output with feature collage, and pred2 is the direct output without feature collage (from my understanding). So ideally, should the pred2 patches, stitched in image space (taking the center 64 × 64), be similar to the pred1 output?
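A rough sketch of the comparison I have in mind (the names pred2_patches / pred1_center and the 3×3 neighborhood are just from my own setup, not variables in this repo):

```python
import torch
import torch.nn.functional as F

# Hypothetical check: stitch the direct (non-collage) per-patch outputs in
# image space, then compare the central 64x64 region with the collaged
# output for the same patch location.

PATCH = 64  # assumed patch size

def stitch_patches(patches, grid_h, grid_w):
    """Arrange a list of (C, PATCH, PATCH) tensors into a (C, grid_h*PATCH, grid_w*PATCH) image."""
    rows = []
    for i in range(grid_h):
        row = torch.cat(patches[i * grid_w:(i + 1) * grid_w], dim=-1)  # concat along width
        rows.append(row)
    return torch.cat(rows, dim=-2)  # stack rows along height

# pred2_patches: a 3x3 neighborhood of direct outputs (dummy tensors here)
pred2_patches = [torch.randn(3, PATCH, PATCH) for _ in range(9)]
stitched = stitch_patches(pred2_patches, 3, 3)          # (3, 192, 192)
center = stitched[:, PATCH:2 * PATCH, PATCH:2 * PATCH]  # central 64x64 crop

pred1_center = torch.randn(3, PATCH, PATCH)             # collaged output at the same location
print(F.mse_loss(center, pred1_center).item())          # how close are the two?
```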

zh-ding commented 10 months ago

Thanks for the clarification. Yes, we don't have explicit constraints between these two outputs, but since both are used to compute the loss against the ground truth, there are indirect constraints on them. The main point here is to let the decoder fuse the collaged features from the encoder to alleviate the border artifacts.
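For concreteness, the kind of indirect constraint described above could look like the sketch below (a simplified training step; pred_collage / pred_direct and the plain MSE losses are illustrative, not the exact code in this repo):

```python
import torch.nn.functional as F

def training_step(model, x_noisy, t, target):
    """Hypothetical step: both outputs are supervised against the same target,
    so they are pulled toward each other indirectly, without an explicit
    consistency term between them."""
    pred_collage, pred_direct = model(x_noisy, t)  # output with / without feature collage
    loss = F.mse_loss(pred_collage, target) + F.mse_loss(pred_direct, target)
    return loss
```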

oscarwooberry commented 7 months ago

Hello, thank you for presenting your work. I have a question about border artifacts during training.

  • My understanding is that the grid-like artifact gradually gets better over training iterations but remains obvious in the early training stage. Was that the case in your experiments?
  • And do you think it would help to constrain the consistency between the feature collage output (output shift) and the image collage output (output nonshift)?

May I ask how long it took you to remove the artifacts? I have been training for days and the grid artifact still remains.

Tianyangg commented 7 months ago

Hello, thank you for presenting your work. I have a question about border artifacts during training.

  • My understanding is that the grid-like artifact gradually gets better over training iterations but remains obvious in the early training stage. Was that the case in your experiments?
  • And do you think it would help to constrain the consistency between the feature collage output (output shift) and the image collage output (output nonshift)?

May I ask how long it took you to remove the artifacts? I have been training for days and the grid artifact still remains.

Personally, I'm working with 3D images, which take many days of training (always more than a week) to reach good quality. From my experiments, I would say the global condition is key to removing the border artifacts. Do you have that in your experiments?

oscarwooberry commented 7 months ago

Hello, thank you for presenting your work. I have a question about border artifacts during training.

  • My understanding is that the grid-like artifact gradually gets better over training iterations but remains obvious in the early training stage. Was that the case in your experiments?
  • And do you think it would help to constrain the consistency between the feature collage output (output shift) and the image collage output (output nonshift)?

May I ask how long it took you to remove the artifacts? I have been training for days and the grid artifact still remains.

Personally, I'm working with 3D images, which take many days of training (always more than a week) to reach good quality. From my experiments, I would say the global condition is key to removing the border artifacts. Do you have that in your experiments?

It took me more than a week, but it still doesn't produce meaningful results: the grid still exists and the output images are hard to recognize. And yes, I used CLIP as the global condition as the instructions suggested (initialize.py); with that, we can set semantic_enc to false, right? Another potential problem is that my training images are high resolution (1k+) and the dataset is relatively small.
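For reference, my global-condition precomputation looks roughly like the sketch below (using the openai clip package; the ViT-B/32 checkpoint and per-image setup are my own choices, not the repo's initialize.py):

```python
import clip
import torch
from PIL import Image

# Sketch: precompute one CLIP image embedding per training image and use it
# as the global condition. Illustrative only; not the repo's initialize.py.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

@torch.no_grad()
def global_condition(image_path):
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    emb = model.encode_image(image)              # (1, 512) for ViT-B/32
    return emb / emb.norm(dim=-1, keepdim=True)  # L2-normalized embedding
```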

zh-ding commented 7 months ago

I'm not sure what exact dataset you're using. In our experiments on natural-image datasets (around 20,000 images), we train for around 2 weeks on an 8xA6000 machine. The CLIP embeddings might not be a good initialization if your dataset is out of distribution for CLIP. If your resolution is too high now, you could try decreasing the image resolution a bit or increasing the patch resolution.
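A quick back-of-the-envelope illustration of why that helps (my own numbers, not from the paper): the patch grid, and hence the number of borders the decoder has to fuse, shrinks quickly when the image resolution drops or the patch size grows.

```python
def patch_grid(image_res, patch_size):
    """Patches per side and in total, assuming image_res is divisible by patch_size."""
    per_side = image_res // patch_size
    return per_side, per_side * per_side

print(patch_grid(1024, 64))   # (16, 256) patches at 1k resolution with 64px patches
print(patch_grid(512, 64))    # (8, 64)   halve the resolution
print(patch_grid(1024, 128))  # (8, 64)   or double the patch size
```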