Coder-Yu / SELFRec

An open-source framework for self-supervised recommender systems.

Question about how the distribution plots in SimGCL were drawn #14

Closed sunshinelium closed 1 year ago

sunshinelium commented 2 years ago

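Sorry to bother you. I recently came across your SimGCL work published at SIGIR '22, and I am very interested in the distribution plots in the paper. Would it be possible to get the code used to draw them? Thanks in advance.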

Coder-Yu commented 1 year ago

The core code is as follows:

f, axs = plt.subplots(2, len(embs), figsize=(12,3.5),gridspec_kw={'height_ratios': [3, 1]})
kwargs = {'levels': np.arange(0, 5.5, 0.5)}
for i,name in enumerate(models):
    sns.kdeplot(data=data[name], bw=0.05, shade=True, cmap="GnBu", legend=True, ax=axs[0][i], **kwargs)
    axs[0][i].set_title(name, fontsize=9, fontweight="bold")
    x = [p[0] for p in data[name]]
    y = [p[1] for p in data[name]]
    angles = np.arctan2(y, x)
    sns.kdeplot(data=angles, bw=0.15, shade=True, legend=True, ax=axs[1][i], color='green')

data[name] holds the 2D vectors after normalization.

The seaborn version needs to be lower than 0.10.1.

Coder-Yu commented 1 year ago

The code for reducing the embeddings to 2D is:

from sklearn import manifold
from sklearn.preprocessing import normalize

tsne = manifold.TSNE(n_components=2, init='pca', random_state=501)
user_emb_2d = tsne.fit_transform(user_emb)
user_emb_2d = normalize(user_emb_2d, axis=1, norm='l2')

sunshinelium commented 1 year ago

Thank you very much!

Gaozzzz commented 1 year ago

Hello, I would like the complete code for the t-SNE distribution plot here. I really cannot reproduce it. I hope you can send it to 527043469@qq.com. Thank you!

Coder-Yu commented 1 year ago

import pickle
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
from sklearn import manifold
from sklearn.preprocessing import normalize
import seaborn as sns
from math import pi
matplotlib.rcParams['pdf.fonttype'] = 42
matplotlib.rcParams['ps.fonttype'] = 42

# 'user' is a pickled dict; only its length (the number of users) is used here.
with open('user', 'rb') as fh:
    user = pickle.load(fh)
# Sample 2000 user indices (np.random.choice samples with replacement by default).
udx = np.random.choice(len(user), 2000)
embs = ['user_lgcn.emb', 'user_sgl.emb', 'user_simgcl.emb']
models = ['LightGCN','SGL-ED','SimGCL']
data = {}

for emb,name in zip(embs,models):
    with open(emb, 'rb') as fh:
        user_emb = pickle.load(fh)
    selected_user_emb = user_emb[udx]  # np.concatenate([item_emb[idx],user_emb[udx]],axis=0)
    # Project to 2D with t-SNE, then L2-normalize each point onto the unit circle.
    tsne = manifold.TSNE(n_components=2, init='pca', random_state=501)
    user_emb_2d = tsne.fit_transform(selected_user_emb)
    user_emb_2d = normalize(user_emb_2d, axis=1, norm='l2')
    data[name]=user_emb_2d

f, axs = plt.subplots(2, len(embs), figsize=(12,3.5),gridspec_kw={'height_ratios': [3, 1]})
kwargs = {'levels': np.arange(0, 5.5, 0.5)}
for i,name in enumerate(models):
    sns.kdeplot(data=data[name], bw=0.05, shade=True, cmap="GnBu", legend=True, ax=axs[0][i], **kwargs)
    axs[0][i].set_title(name, fontsize=9, fontweight="bold")
    x = [p[0] for p in data[name]]
    y = [p[1] for p in data[name]]
    angles = np.arctan2(y, x)
    sns.kdeplot(data=angles, bw=0.15, shade=True, legend=True, ax=axs[1][i], color='green')

for ax in axs[0]:
    ax.tick_params(axis='x', labelsize=8)
    ax.tick_params(axis='y', labelsize=8)
    ax.patch.set_facecolor('white')
    ax.collections[0].set_alpha(0)
    ax.set_xlim(-1.2, 1.2)
    ax.set_ylim(-1.2, 1.2)
    ax.set_xlabel('Features', fontsize=9)
axs[0][0].set_ylabel('Features', fontsize=9)

for ax in axs[1]:
    ax.tick_params(axis='x', labelsize=8)
    ax.tick_params(axis='y', labelsize=8)
    ax.set_xlabel('Angles', fontsize=9)
    ax.set_ylim(0, 0.5)
    ax.set_xlim(-pi, pi)
axs[1][0].set_ylabel('Density', fontsize=9)

plt.show()

Coder-Yu commented 1 year ago

The seaborn version is 0.10.

Gaozzzz commented 1 year ago

Thanks!

Brownchen commented 10 months ago

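Hello, the 2D features after dimensionality reduction are distributed in a ring here. Is that caused by the Gaussian kernel, or is some special processing applied to the features? Normally the 2D features should be scattered, shouldn't they?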

Coder-Yu commented 10 months ago

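> Hello, the 2D features after dimensionality reduction are distributed in a ring here. Is that caused by the Gaussian kernel, or is some special processing applied to the features? Normally the 2D features should be scattered, shouldn't they?

Those are the features after normalization.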
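
To illustrate the answer above: the script L2-normalizes every 2D point after t-SNE, so each point is rescaled to length 1 and lands on the unit circle, whatever the original scatter looked like. The following is a minimal sketch using NumPy only; the helper `l2_normalize_rows` and the random point cloud are assumptions made for illustration (they stand in for `sklearn.preprocessing.normalize(..., axis=1, norm='l2')` and for a real t-SNE output).

```python
import numpy as np


def l2_normalize_rows(points):
    """Scale each row to unit L2 norm (hypothetical helper, not part of SELFRec)."""
    points = np.asarray(points, dtype=float)
    norms = np.linalg.norm(points, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged instead of dividing by zero
    return points / norms


rng = np.random.default_rng(0)
# Assumed stand-in for a t-SNE output: an ordinary scattered 2D point cloud.
scattered = rng.normal(scale=30.0, size=(2000, 2))
on_circle = l2_normalize_rows(scattered)

# After normalization every point has radius 1, so only its angle carries information.
radii = np.linalg.norm(on_circle, axis=1)
angles = np.arctan2(on_circle[:, 1], on_circle[:, 0])

print("radius range:", radii.min(), radii.max())
print("angle range:", angles.min(), angles.max())
```

Because the radius is constant, the upper panel shows a ring and the lower panel plots the density of `angles`, which `np.arctan2` returns in the interval from -pi to pi.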

Panamera2333 commented 5 months ago

Hello, may I ask whether the user file is a CSV file?

Coder-Yu commented 5 months ago

It is a dict. It does not matter here; it is only used to get the number of users.
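
Since only the number of users matters, the sampling step can be tried out without the real file. The sketch below uses a made-up dict in place of the pickled `user` object (an assumption for illustration) and contrasts the script's default sampling, which draws with replacement and can repeat users, with `replace=False`, which draws distinct users. Whether duplicates matter for the figure is not stated in the thread.

```python
import numpy as np

# Hypothetical stand-in for the pickled `user` dict; only len(user) is used.
user = {f"u{i}": i for i in range(5000)}

rng = np.random.default_rng(501)  # seed chosen arbitrarily for reproducibility

# Same behavior as np.random.choice(len(user), 2000): sampling with replacement.
idx_with_repeats = rng.choice(len(user), size=2000)

# Alternative: sample 2000 distinct user indices.
idx_distinct = rng.choice(len(user), size=2000, replace=False)

print("distinct users (with replacement):", len(np.unique(idx_with_repeats)))
print("distinct users (replace=False):   ", len(np.unique(idx_distinct)))
```

Either index array can be used as `udx` to select rows from an embedding matrix, provided the matrix has `len(user)` rows.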

Fuzzytariy commented 5 months ago

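If the distribution plot you draw has a light-green background, the matplotlib version may be too high. After setting seaborn to 0.10.0, you also need to downgrade matplotlib; I pinned matplotlib to 3.5.0.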

Rah1m2 commented 4 months ago

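> If the distribution plot you draw has a light-green background, the matplotlib version may be too high. After setting seaborn to 0.10.0, you also need to downgrade matplotlib; I pinned matplotlib to 3.5.0.

Thanks!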
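
For anyone reproducing the figure, a quick environment check may save time. The sketch below compares the installed seaborn and matplotlib versions with the ones reported to work in this thread (seaborn 0.10.0, matplotlib 3.5.0). Those pins come from the comments above and are not an official compatibility matrix; the helper names `version_tuple`, `is_at_most`, and `check_environment` are hypothetical.

```python
from importlib import metadata
from itertools import takewhile

# Versions reported to work by commenters in this thread (an assumption, not a guarantee).
REPORTED_VERSIONS = {"seaborn": "0.10.0", "matplotlib": "3.5.0"}


def version_tuple(version):
    """Turn '0.10.1' into (0, 10, 1); suffixes such as 'rc1' are ignored."""
    numbers = []
    for piece in version.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)  # pad '0.10' to (0, 10, 0)
    return tuple(numbers)


def is_at_most(found, wanted):
    """True if the installed version is not newer than the reported one."""
    return version_tuple(found) <= version_tuple(wanted)


def check_environment():
    """Return {package: (installed_version_or_None, matches_reported_pin)}."""
    report = {}
    for package, wanted in REPORTED_VERSIONS.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = (None, False)
        else:
            report[package] = (found, is_at_most(found, wanted))
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "differs from the version reported in the thread"
        print(f"{package}: installed={found}, {status}")
```

Comparing integer tuples rather than raw strings matters here: as strings, "3.10.0" sorts before "3.5.0", which would wrongly pass a newer matplotlib.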