Open GoldenFishes opened 1 year ago
Never mind, I got it.
Could you please show me how Figures 2 and 3 in the paper were created, i.e. "The denoising process" (Figure 2) and "Relative log amplitudes of Fourier for diffusion intermediate steps" (Figure 3)?
I have the same question. How do you draw these figures?
Well, since you've asked, I'd like to know your research direction first.
Right now I am interested in image synthesis. I only know how to plot amplitude against sampling frequency, as follows:
So, you're not involved in research related to diffusion models?
import cv2
import numpy as np
import matplotlib.pyplot as plt


def calculate_relative_log_frequency(image):
    # 2-D Fourier transform of the grayscale image
    fourier_transform = np.fft.fft2(image)
    # Calculate the amplitude of the Fourier transform
    amplitude = np.abs(fourier_transform)
    # Log amplitude relative to the zero-frequency (DC) component
    relative_log_frequency = np.log(amplitude + 1) - np.log(amplitude[0, 0] + 1)
    # Get the width of the image
    image_width = image.shape[1]
    # Horizontal frequency of each FFT column, in cycles per pixel (-0.5 to 0.5)
    frequency_values = np.fft.fftfreq(image_width)
    # Keep only the non-negative frequencies up to 1/pi cycles per pixel
    valid_indices = (frequency_values >= 0.0) & (frequency_values <= 1.0 / np.pi)
    frequency_values = frequency_values[valid_indices]
    relative_log_frequency = relative_log_frequency[:, valid_indices]
    # Take every 4th value to reduce oscillation in the curve
    frequency_values = frequency_values[::4]
    relative_log_frequency = relative_log_frequency[:, ::4]
    # Rescale frequency values to the range 0.0 to 1.0
    frequency_values = (frequency_values - frequency_values.min()) / (
        frequency_values.max() - frequency_values.min()
    )
    # Return the first spectrum row (zero vertical frequency) and its frequencies
    return relative_log_frequency[0], frequency_values


def plot_multiple_relative_log_frequencies(image_filenames, alpha=0.8):
    plt.figure(figsize=(12, 6))
    # plt.get_cmap replaces matplotlib.cm.get_cmap, which newer Matplotlib removed
    cmap = plt.get_cmap('Blues')
    new_labels = ["step1", "step100", "step200",
                  "step300", "step400", "step500",
                  "step600", "step700", "step800",
                  "step900", "step1000"]
    for i, filename in enumerate(image_filenames):
        image = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
        # cv2.imread returns None instead of raising when the file cannot be read
        if image is None:
            raise FileNotFoundError(f"Could not read image: {filename}")
        relative_log_frequency, frequency_values = calculate_relative_log_frequency(image)
        # Later files get a darker shade of blue
        k = i / len(image_filenames)
        color = cmap(k + 0.2)
        # Use the new labels
        label = new_labels[i]
        # Adjust the alpha parameter to control the transparency of the curve
        plt.plot(frequency_values, relative_log_frequency, color=color,
                 label=label, linewidth=2.5, alpha=alpha)
    plt.xlabel('Frequency')
    plt.ylabel('Relative Log Amplitude')
    plt.xlim(0.0, 1.0)
    plt.grid(True)
    plt.legend()
    # Label the x-axis ticks in units of pi
    x_ticks = np.linspace(0, 1, 6)
    x_tick_labels = [f'{x:.1f}π' for x in x_ticks]
    plt.xticks(x_ticks, x_tick_labels)
    plt.show()


image_filenames = ["0-eps1.0.png", "0-eps0.9.png", "0-eps0.8.png", "0-eps0.7.png",
                   "0-eps0.6.png", "0-eps0.5.png", "0-eps0.4.png", "0-eps0.3.png",
                   "0-eps0.2.png", "0-eps0.1.png", "0-eps0.0.png"]
plot_multiple_relative_log_frequencies(image_filenames)
I apologize for the formatting issue in the code. I hope it can help you.
Thanks for your reply!
Is the code you posted above for drawing Fig. 3? Do you know how to draw Fig. 2? I hope you can help me.
I can reproduce Fig. 2 with other models, such as the U-ViT in UniDiffuser, but I haven't tried it with the Stable Diffusion U-Net.
Could you share the code for drawing Fig. 2 with U-ViT? Does it use low-pass or high-pass filters?
The method for generating intermediate results in the denoising process can be found in my open-source code. Once you obtain these intermediate results, separating the high and low frequencies becomes much easier. https://github.com/GoldenFishes/FreeV/issues/1
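For anyone following along: once the intermediate images are saved, one simple way to separate high and low frequencies is a circular mask on the centered Fourier spectrum. This is only a minimal sketch and not necessarily how Figure 2 in the paper was produced. The function name `split_frequencies` and the `radius_ratio` cutoff are hypothetical, chosen for illustration, and the sketch assumes a single-channel (grayscale) image.

```python
import numpy as np


def split_frequencies(image, radius_ratio=0.1):
    """Split a 2-D image into low- and high-frequency parts.

    radius_ratio is a hypothetical cutoff: the mask radius expressed as a
    fraction of the shorter image side.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    # Centered spectrum: after fftshift the DC term sits at (h // 2, w // 2)
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    # Frequencies inside the circle count as "low", everything else as "high"
    low_mask = dist <= radius_ratio * min(h, w)
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high
```

Because the two masks partition the spectrum, `low + high` reconstructs the input, so the two parts can be shown side by side for each saved denoising step.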
Does this mean we can just take the intermediate noise map, decode it back to a noisy image, and then analyse it with an FFT?
Exactly! That's how it works. See https://github.com/GoldenFishes/FreeV/issues/1 for how I visualized the intermediate denoising results.
Thank you for your answer! This question has been bothering me for a long time. However, why did the author decode the latents to pixel space before doing the Fourier analysis, instead of analysing the latents directly? I find this question very interesting. Could you answer it for me? Thanks a lot!
I don't quite understand your question. As far as I know, the author did not do this. The author's only purpose is to study how the high- and low-frequency content of the features changes during the denoising process, viewed in the Fourier domain. To do that, Fourier analysis is performed on the features at every time step of the denoising process. These features are most directly obtained in the latent space (because the diffusion model is trained in that space). In my understanding, the author performs Fourier analysis on these features directly, without any additional conversion. A feature map is essentially another form of image, so there is no reason Fourier analysis would require mapping it to pixel space first.
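To make that concrete, here is a rough sketch of running the Fourier analysis directly on a latent feature map. It is my own illustration, not the paper's code. The function name `latent_relative_log_amplitude`, the `n_bins` parameter, the channel averaging, and the radial binning are all assumptions. It expects an array of shape (channels, height, width), for example a latent already converted to NumPy.

```python
import numpy as np


def latent_relative_log_amplitude(features, n_bins=16):
    """Radially averaged log amplitude, relative to the lowest-frequency bin.

    Hypothetical helper: features has shape (C, H, W) or (H, W).
    """
    features = np.asarray(features, dtype=np.float64)
    if features.ndim == 2:
        features = features[None]
    _, h, w = features.shape
    # Amplitude spectrum per channel, then averaged over channels (assumption)
    amplitude = np.abs(np.fft.fft2(features, axes=(-2, -1))).mean(axis=0)
    # Radial frequency of every FFT cell, in cycles per sample
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2).ravel()
    # Bin radii from 0 to 0.5 cycles per sample (0 to pi radians per sample)
    keep = radius <= 0.5
    bin_index = np.minimum((radius[keep] / 0.5 * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bin_index, weights=amplitude.ravel()[keep], minlength=n_bins)
    counts = np.bincount(bin_index, minlength=n_bins)
    mean_amp = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
    # Same log(amplitude + 1) convention as the plotting script in this thread
    relative = np.log(mean_amp + 1) - np.log(mean_amp[0] + 1)
    # Bin centers as a fraction of pi, ready to use as the x-axis
    frequencies = (np.arange(n_bins) + 0.5) / n_bins
    return frequencies, relative
```

Calling it once per saved step and plotting `relative` against `frequencies` should give one curve per step, similar in spirit to Fig. 3. Bins that contain no FFT cell come back as NaN, which Matplotlib simply skips.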
Oh, I see. I just found this sentence a bit odd.