QUVA-Lab / COMBO

Adjacency matrix for Centroid and PestControl task #6

Open KamilDre opened 2 years ago

KamilDre commented 2 years ago

Hi,

I have noticed that in multiple_categorical.py you define the adjacency matrices for both the Centroid and PestControl tasks as if the variables were ordinal rather than categorical. Is there any particular reason for this? Also, I have noticed that if I change the adjacency matrix in lines 39 and 137 to adjmat = torch.ones((n_v, n_v)).fill_diagonal_(0), the output of the diffusion kernel is full of ones irrespective of the input. Would you mind shedding some light on this?
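For context, the ordinal-vs-categorical distinction in the adjacency matrix can be sketched as follows (a minimal illustration of my own, not the repository's actual code; all variable names here are hypothetical):

```python
import torch

n_v = 5  # number of values the variable can take

# Ordinal variables: a path graph, where only neighbouring values are
# adjacent (the structure the issue suggests is currently being built).
path_adj = torch.zeros((n_v, n_v))
for i in range(n_v - 1):
    path_adj[i, i + 1] = 1
    path_adj[i + 1, i] = 1

# Categorical variables: a complete graph, where every pair of distinct
# categories is adjacent, since categories have no inherent order.
complete_adj = torch.ones((n_v, n_v)).fill_diagonal_(0)

print(path_adj)
print(complete_adj)
```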

ChangYong-Oh commented 2 years ago

Hi, about the adj. mat.: it was a typo. Although the adj. mat. can be chosen by the user, as you pointed out, using the complete adj. mat. seems better.

About the degenerate diffusion kernel, I am not so sure why this is happening; maybe you can check how numerical stability comes into play here.


KamilDre commented 2 years ago

I have put together this small example, which better illustrates what I was referring to in the previous message. In the code below, n_v is the number of categories of the specific categorical variable. The factor_gram is essentially the Gram matrix over all vertices.

import torch

if __name__ == '__main__':
    n_v = 10  # number of categories
    dtype = torch.float64

    # Eigen decomposition of the graph laplacian
    adjmat = torch.ones((n_v, n_v), dtype=dtype).fill_diagonal_(0)
    laplacian = torch.diag(torch.sum(adjmat, dim=0)) - adjmat
    fourier_freq, fourier_basis = torch.linalg.eigh(laplacian, UPLO='U')

    # Calculate the factor_gram
    freq_transform = torch.exp(- fourier_freq)
    factor_gram = torch.matmul(fourier_basis * freq_transform.unsqueeze(0), fourier_basis.t())

Now, irrespective of the dtype, the maximum difference between any two elements of factor_gram decreases very sharply to 0 as n_v increases. The graph below illustrates this (note the log scale on the y-axis).

[image: maximum pairwise difference between factor_gram elements vs. n_v, log-scale y-axis]

Consequently, as you can see, factor_gram is approximately full of ones irrespective of the inputs for categorical variables with more than a handful of categories. Do you have any advice on how to overcome this and apply COMBO to problems whose categorical variables have more than a handful of categories? From my investigation, the primary cause is that as n_v increases, the gap between the smallest eigenvalue of the Laplacian and the next-smallest eigenvalue grows. Hence, after freq_transform = torch.exp(- fourier_freq), freq_transform essentially has a 1 in the first entry and approximately 0 in the remaining entries. I also found that scaling the adjacency matrix does not necessarily help, as it simply scales all of the eigenvalues.
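The eigenvalue-gap observation can be checked directly (a small sketch of my own): for the complete graph on n_v vertices, the Laplacian has one eigenvalue 0 and the rest equal to n_v, so all entries of torch.exp(-fourier_freq) except the first collapse to exp(-n_v).

```python
import torch

for n_v in [5, 10, 25, 50]:
    adjmat = torch.ones((n_v, n_v), dtype=torch.float64).fill_diagonal_(0)
    laplacian = torch.diag(adjmat.sum(dim=0)) - adjmat
    fourier_freq = torch.linalg.eigvalsh(laplacian)  # ascending order
    freq_transform = torch.exp(-fourier_freq)
    # First entry corresponds to eigenvalue 0, so it stays near 1;
    # every other entry corresponds to eigenvalue n_v, i.e. exp(-n_v).
    print(n_v, float(freq_transform[0]), float(freq_transform[1]))
```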

ChangYong-Oh commented 2 years ago

In the case of the complete graph, the Laplacian eigenvalues are known, as shown in https://saadquader.wordpress.com/2013/04/25/eigenvalues-of-the-laplacian-matrix-of-the-complete-graph/

When there are 'n' vertices, the Laplacian has the eigenvalue 'n' with multiplicity n-1 and the eigenvalue 0 with multiplicity 1. As 'n' increases, due to the exponential in the diffusion kernel, every entry of freq_transform except the one corresponding to the zero eigenvalue becomes essentially zero. To prevent this, you can normalize the adjacency matrix by dividing it by 'n', or equivalently normalize fourier_freq. Normalizing fourier_freq amounts to adjusting your prior, because it is multiplied by 'beta'.

I checked the code snippet you provided with some modifications, shown below; it seems numerically stable.

import torch

if __name__ == '__main__':
    n_v = 100  # number of categories
    dtype = torch.float64

    # Eigen decomposition of the graph laplacian
    adjmat = torch.ones((n_v, n_v), dtype=dtype).fill_diagonal_(0) / n_v
    laplacian = torch.diag(torch.sum(adjmat, dim=0)) - adjmat
    fourier_freq, fourier_basis = torch.linalg.eigh(laplacian, UPLO='U')

    # Calculate the factor_gram
    print(torch.unique(fourier_freq))
    freq_transform = torch.exp(- fourier_freq)
    factor_gram = torch.matmul(fourier_basis * freq_transform.unsqueeze(0), fourier_basis.t()) / torch.mean(freq_transform)
    print(float(torch.max(factor_gram)), float(torch.min(factor_gram)))

This was not a big issue in the experiments in COMBO where each categorical variable has a relatively small number of categories.

If you want to apply COMBO to categorical variables that each have a relatively large number of categories, normalizing the adjacency matrix (equivalently, normalizing the frequencies, i.e., adjusting the prior on beta) seems necessary for numerical stability.
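A quick sanity check of that equivalence (my own sketch): dividing the adjacency matrix by n_v scales the Laplacian, and therefore fourier_freq, by exactly 1/n_v, while leaving the eigenvectors unchanged.

```python
import torch

n_v = 20
adjmat = torch.ones((n_v, n_v), dtype=torch.float64).fill_diagonal_(0)
laplacian = torch.diag(torch.sum(adjmat, dim=0)) - adjmat

freq_raw = torch.linalg.eigvalsh(laplacian)
freq_scaled = torch.linalg.eigvalsh(laplacian / n_v)

# The spectra agree up to the factor 1/n_v, so normalizing adjmat is the
# same as normalizing fourier_freq (i.e. rescaling the prior on beta).
print(torch.allclose(freq_raw / n_v, freq_scaled))
```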

I am not sure how much prior knowledge you have about the relationships among the categories of a categorical variable. If you have some, you can utilize that information in the adjacency matrix (with appropriate normalization for numerical stability). For an example, you can refer to D.2 in the supplementary material of https://arxiv.org/pdf/2102.12792.pdf
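As a sketch of that idea (my own illustration, not the paper's exact construction): given hypothetical pairwise similarity scores between categories, one could use them as edge weights and normalize so the Laplacian eigenvalues stay O(1).

```python
import torch

# Hypothetical pairwise similarities between 4 categories (symmetric,
# zero diagonal); in practice these would come from domain knowledge.
sim = torch.tensor([[0.0, 0.9, 0.1, 0.2],
                    [0.9, 0.0, 0.3, 0.1],
                    [0.1, 0.3, 0.0, 0.8],
                    [0.2, 0.1, 0.8, 0.0]], dtype=torch.float64)

# Divide by the largest weighted degree so the maximum degree is 1,
# bounding the Laplacian spectrum and keeping exp(-fourier_freq) usable.
adjmat = sim / torch.sum(sim, dim=0).max()
laplacian = torch.diag(torch.sum(adjmat, dim=0)) - adjmat
fourier_freq = torch.linalg.eigvalsh(laplacian)
print(fourier_freq)
```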


KamilDre commented 2 years ago

Thanks a lot for your detailed answer. Indeed with your modifications the code is more numerically stable and the problem disappears.