Closed: 5uperpalo closed this issue 2 years ago

Hi Team, I liked the ideas in your paper, but from reading the paper and the provided code it sounds like the provided FDS and LDS code can be applied to any dataset/model? Is it really true?

Another issue: by dividing the kernel window by its max value, e.g. here: https://github.com/YyzHarry/imbalanced-regression/blob/055a7b3804bbaf903ed25a55c11ab8acc6e142e1/agedb-dir/fds.py#L44, you are changing the mean and variance values and not only smoothing them along the axis (either the feature statistics in FDS or the label density in LDS). Do I understand it correctly?

Note: I like the ideas in the paper, but due to the lack of documentation/explanation I am currently spending a lot of time generalizing the code and trying to figure out why you made some of the operations (e.g., the clippings).
Hi - thanks for your interest.
It looks like you are using only integers (as you are predicting age) to make a dictionary of histogram bins in both FDS and LDS.
This is not true. The code you referred to is for age prediction, which is a specific case where the minimum resolution we care about is 1, thus the bin size is set to 1. However, if you read the paper carefully, there is no constraint on the bin size --- in fact, for some of the datasets we experimented on, e.g., STS-B-DIR or NYUD2-DIR, the labels are float numbers (e.g., in [0, 5] for STS-B-DIR), and the bin size for these datasets is 0.1 in the experiments. You might want to refer to the code for sts-b-dir and nyud2-dir.
That being said, FDS and LDS could be applied to any dataset / deep model, as long as you define the minimum bin size you care about.
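To make this concrete, here is a minimal sketch (not the exact code in this repo; the range and bin size are the STS-B-DIR values mentioned above, everything else is illustrative) of how a float label maps to a bin index once a bin size is chosen:

bin_size = 0.1
label_lo, label_hi = 0.0, 5.0
num_bins = int(round((label_hi - label_lo) / bin_size))   # 50 bins for labels in [0, 5]

def get_bin_idx(label):
    # clamp so that label == label_hi still falls into the last bin
    return min(int((label - label_lo) / bin_size), num_bins - 1)

print([get_bin_idx(y) for y in [0.03, 1.27, 4.99, 5.0]])  # [0, 12, 49, 49]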
I did not find an explanation for this clipping
The reason we clip the weights here is that after inverse re-weighting, some weights might be very large (e.g., in age estimation, consider 5,000 images for age 30 and only 1 image for age 100; after inverse re-weighting, the weight ratio between the two could be extremely high). This could cause optimization problems.
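A small numeric sketch of the point (the counts, the clip bound, and the variable names are made up for illustration, not taken from the repo):

import numpy as np

# per-bin sample counts, e.g. 5,000 images for one age vs. a single image for a rare age
num_per_bin = np.array([5000., 2000., 300., 10., 1.])

raw_weights = 1.0 / num_per_bin                        # plain inverse re-weighting
print(raw_weights.max() / raw_weights.min())           # 5000.0 -- the rare bin dominates

# clipping the effective counts (equivalently, the weights) keeps the ratio bounded,
# so a handful of rare samples cannot destabilize optimization
clipped = np.clip(num_per_bin, a_min=10., a_max=None)  # illustrative lower bound
weights = 1.0 / clipped
weights = weights / weights.sum() * len(weights)       # renormalize to mean 1
print(weights)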
there is also another clipping here
Similarly, the clipping here is for numerical stability of FDS. If some bins contain a very small number of samples, the variance estimation may not be stable. To avoid optimization problems, we simply use clipping here.
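For instance, a toy version of that kind of clipping (bounds and names are purely illustrative, not the repo's actual values):

import numpy as np

# toy per-bin running variances of one feature dimension; bins with only a handful of
# samples can produce near-zero or exploding estimates
running_var = np.array([0.8, 1.1, 1e-6, 0.9, 25.0])

# clamp before using the statistics downstream, so calibration never divides by
# (or multiplies with) an extreme value
stable_var = np.clip(running_var, 0.1, 10.0)
print(stable_var)   # extremes clamped into [0.1, 10.0]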
By dividing the kernel window by its max value, you are changing the mean and variance values and not only smoothing them along the axis (either the feature statistics in FDS or the label density in LDS).
I do not quite understand your question here. This is just one implementation choice. You could also use gaussian_filter1d in the implementation to simulate a kernel window.
Thank you for your answer regarding the clipping! Let me rephrase the other questions:
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import convolve1d, gaussian_filter1d

x = np.random.rand(10)
ks = 5
sigma = 2
half_ks = (ks - 1) // 2
# delta kernel; gaussian_filter1d turns it into a gaussian window of size ks
base_kernel = [0.] * half_ks + [1.] + [0.] * half_ks
kernel_window_withmax = gaussian_filter1d(base_kernel, sigma=sigma) / max(gaussian_filter1d(base_kernel, sigma=sigma))
kernel_window = gaussian_filter1d(base_kernel, sigma=sigma)
x_k_withmax = convolve1d(x, kernel_window_withmax)
x_k = convolve1d(x, kernel_window)
plt.plot(x_k_withmax, label="x_k_withmax")
plt.plot(x_k, label="x_k")
plt.plot(x, label="x")
plt.legend()
plt.show()
Thanks for your explanation, we now understand your questions better.
- The ideas are generally applicable, but the provided code is specific to each use case. In order to use it as a general approach or with other datasets, I have to put it together myself, am I correct? E.g., I was able to find that the LDS bin size for this dataset is specified here: https://github.com/YyzHarry/imbalanced-regression/blob/055a7b3804bbaf903ed25a55c11ab8acc6e142e1/sts-b-dir/tasks.py#L51
Yes, your understanding is correct. For every dataset, we implemented a get_bin_idx() function to return the bin index of a regression label. For this function, we assume the label range of each dataset is known and define the number of bins we want to use; the bin size is then naturally defined (yes, you can also infer the bin size from this function). Besides, we also list those settings (e.g., label range, bin size) in detail in Table 7 (Appendix B) of our paper.
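As a sketch of how those settings fit together (made-up numbers, not the repo's code): the per-dataset settings reduce to a label range and a number of bins, from which the bin size follows, and the empirical label distribution that LDS smooths is just a histogram over bin indices.

label_min, label_max, num_bins = 0.0, 5.0, 50      # illustrative dataset settings
bin_size = (label_max - label_min) / num_bins      # bin size follows: 0.1

labels = [0.03, 0.07, 2.55, 4.82, 4.99]            # toy training labels
bin_counts = [0] * num_bins
for y in labels:
    idx = min(int((y - label_min) / bin_size), num_bins - 1)
    bin_counts[idx] += 1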
- If you use gaussian_filter1d/max(gaussian_filter1d), as you do in FDS, then after convolution the feature values are not only smoothed but their mean value also increases. Is there any reason for this? E.g., see the code example above.
In LDS, we use gaussian_filter1d / max(gaussian_filter1d), but in FDS, we use gaussian_filter1d / sum(gaussian_filter1d) to ensure that the scale of the mean value is not changed, which is effectively equal to plain gaussian_filter1d. That said, to align with other kernels (e.g., the triangle kernel) that do require sum normalization, we also add the /sum to the gaussian kernel.
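A quick self-contained check of this point (our toy example, not code from the repo): with a constant input, a sum-normalized kernel leaves the mean untouched, while a max-normalized kernel scales it up by sum/max.

import numpy as np
from scipy.ndimage import convolve1d, gaussian_filter1d

x = np.full(20, 3.0)                      # constant signal with mean 3
base_kernel = [0., 0., 1., 0., 0.]        # delta; gaussian_filter1d turns it into a gaussian window
raw = gaussian_filter1d(base_kernel, sigma=2)

sum_norm = raw / raw.sum()                # weights sum to 1 -> scale preserved
max_norm = raw / raw.max()                # peak is 1, weights sum to raw.sum() / raw.max() > 1

print(convolve1d(x, sum_norm).mean())     # ~3.0
print(convolve1d(x, max_norm).mean())     # ~3.0 * raw.sum() / raw.max()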
Thank you both @kaiwenzha and @YyzHarry! Closing the issue as everything is clear to me now :)