zjunlp / MKGformer

[SIGIR 2022] Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion
MIT License
162 stars · 27 forks

About link prediction #4

Closed ZihaoZheng98 closed 2 years ago

ZihaoZheng98 commented 2 years ago

Hi, regarding the link prediction part: the paper seems to say that BERT's vocabulary is extended. Shouldn't the corresponding vocab or config file be uploaded as well? The snippet below is the only place in the code I found that seems related; if I missed something, please let me know. Thanks!

from transformers import BertConfig, BertModel, CLIPConfig

# Load the CLIP vision config and the BERT text config/weights from local paths (as in the repo's script)
vision_config = CLIPConfig.from_pretrained('/home/lilei/package/clip-vit-base-patch32').vision_config
text_config = BertConfig.from_pretrained('/home/lilei/package/bert-base-uncased')
bert = BertModel.from_pretrained('/home/lilei/package/bert-base-uncased')
flow3rdown commented 2 years ago

Hi, the entities are added at runtime, so there is no need to explicitly modify the vocab or config files.

For example, extending the tokenizer:

num_added_tokens = self.tokenizer.add_special_tokens({'additional_special_tokens': entity_list})

Then resize the model's embeddings accordingly:

# resize the word embedding layer
self.model.resize_token_embeddings(len(self.tokenizer))
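
Putting the two steps together, here is a minimal self-contained sketch of this runtime vocabulary extension (the entity token names below are placeholders, not the repo's actual entity list):

from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Hypothetical entity tokens; in practice these come from the KG's entity list.
entity_list = ['[ENTITY_0]', '[ENTITY_1]']

# Register the entities as special tokens so the tokenizer never splits them.
num_added_tokens = tokenizer.add_special_tokens({'additional_special_tokens': entity_list})

# Grow the word embedding matrix to cover the newly added tokens.
model.resize_token_embeddings(len(tokenizer))
print(num_added_tokens, len(tokenizer))
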
ZihaoZheng98 commented 2 years ago

Great, thanks! One more suggestion: in the README, the command seems to be written as bash **.py; that should probably be corrected.

flow3rdown commented 2 years ago

> Great, thanks! One more suggestion: in the README, the command seems to be written as bash **.py; that should probably be corrected.

Fixed, thanks for pointing it out.

liang-ry commented 2 years ago

What format should the WN18-images dataset for the MKG task be in? I downloaded it following the RSME instructions, but I cannot get the model to train on it.

flow3rdown commented 2 years ago

The WN18-images layout is as follows:

[screenshot: WN18-images directory structure]

Each subfolder is named by prefixing the entity id with an 'n'.
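
A minimal sketch for checking that layout (the root path and entity ids below are placeholders; only the 'n' + entity_id naming comes from this thread):

# Expected layout:
# WN18-images/
#   n00004475/   <- 'n' + entity id
#     ...image files for this entity...
import os

def missing_image_folders(image_root, entity_ids):
    # Return entity ids that lack an 'n'-prefixed image subfolder.
    return [eid for eid in entity_ids
            if not os.path.isdir(os.path.join(image_root, 'n' + eid))]

print(missing_image_folders('WN18-images', ['00004475']))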

liang-ry commented 2 years ago

I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

flow3rdown commented 2 years ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

The results on my side are normal. How does your model perform in the entity-embedding pretraining stage?

liang-ry commented 2 years ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

> The results on my side are normal. How does your model perform in the entity-embedding pretraining stage?

The entity embeddings look like this: [screenshot]. I am not sure whether my dataset is the problem; I used the WN18 data from KG-BERT directly. Could you provide a copy of the original WN18 dataset? Thank you very much.

flow3rdown commented 2 years ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

> The results on my side are normal. How does your model perform in the entity-embedding pretraining stage?

> The entity embeddings look like this: [screenshot]. I am not sure whether my dataset is the problem; I used the WN18 data from KG-BERT directly. Could you provide a copy of the original WN18 dataset? Thank you very much.

The WN18 entity and relation data have been uploaded under the MKG/dataset/WN18 path; due to license issues, the rest is not convenient to release publicly.

Maigewm commented 1 year ago

Hi, I ran into the same problem. Where exactly is the error coming from, and how did you resolve it?

flow3rdown commented 1 year ago

Hi, do you mean that the training results on WN18 are very poor?

Maigewm commented 1 year ago

Right. Here are my training results, which look very strange. [screenshot] This is WN18; the hit rates are surprisingly high. [screenshot] This is fbk; hit@1 is 0. I have retrained many times and always get these results; where could the problem be? The datasets were downloaded and extracted from the links you provided. Does the pretraining dataset need to be modified?

Maigewm commented 1 year ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

Hi, I am running into the same situation as you. Did you manage to solve it, and how?

flow3rdown commented 1 year ago

> Right. Here are my training results, which look very strange. [screenshot] This is WN18; the hit rates are surprisingly high. [screenshot] This is fbk; hit@1 is 0. I have retrained many times and always get these results; where could the problem be? The datasets were downloaded and extracted from the links you provided. Does the pretraining dataset need to be modified?

Did you download the data from the Baidu Cloud link given in the README? Did you change the run script in any way? And how do the results look in the pretraining stage?

Maigewm commented 1 year ago

> Right. Here are my training results, which look very strange. [screenshot] This is WN18; the hit rates are surprisingly high. [screenshot] This is fbk; hit@1 is 0. I have retrained many times and always get these results; where could the problem be? The datasets were downloaded and extracted from the links you provided. Does the pretraining dataset need to be modified?

> Did you download the data from the Baidu Cloud link given in the README? Did you change the run script in any way? And how do the results look in the pretraining stage?

That's right, I used the data from the Baidu Cloud link and did not change the run script. The pretraining-stage results are also very poor, as shown below:

max number of filter entities : 4364 954
convert text to examples: 100%|██████████| 14951/14951 [00:00<00:00, 213264.86it/s]
100%|██████████| 14951/14951 [00:07<00:00, 1951.95it/s]
delete entities without text name.: 100%|██████████| 20466/20466 [00:00<00:00, 1050025.39it/s]
total entity not in text : 0
max number of filter entities : 4364 954
convert text to examples: 100%|██████████| 14951/14951 [00:00<00:00, 40823.01it/s]
100%|██████████| 14951/14951 [00:08<00:00, 1835.56it/s]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]
Testing: 0it [00:00, ?it/s]
/home/wangmeng/miniconda3/envs/MKG/lib/python3.7/site-packages/transformers/feature_extraction_utils.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484809535/work/torch/csrc/utils/tensor_new.cpp:201.)
  tensor = as_tensor(value)
(the warning above is emitted once per process, four times in total)
Testing DataLoader 0: 100%|██████████| 234/234 [03:00<00:00, 1.30it/s]

Test metric      DataLoader 0
Test/hits1       0.0
Test/hits10      0.0006019664236505919
Test/hits20      0.0011370476891177847
Test/hits3       0.0002006554745501973
Test/mean_rank   7485.40171226005
Test/mrr         0.0005941319530979928

KiteYN commented 1 year ago

Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

zxlzr commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

Hi, please see https://github.com/zjunlp/MKGformer/issues/17#issue-1429593813 and https://github.com/zjunlp/MKGformer/issues/16

KiteYN commented 1 year ago

Hello! Your email has been received! Thank you!

flow3rdown commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

KiteYN commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

> If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

Got it. However, after changing the pytorch and pytorch_lightning versions, I got the following error:

import pytorch_lightning as pl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/torchmetrics/utilities/data.py)

It seems to be a torchmetrics problem. What torchmetrics version are you using on your side?

flow3rdown commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

> If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

> Got it. However, after changing the pytorch and pytorch_lightning versions, I got the following error:
> ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (see the full traceback in the previous comment)
>
> It seems to be a torchmetrics problem. What torchmetrics version are you using on your side?

torchmetrics==0.7.3
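
For anyone double-checking their environment, a quick way to dump the installed versions and compare them against requirements.txt (the package names below are the usual PyPI distribution names, assumed here):

import pkg_resources  # ships with setuptools, works on Python 3.7

for pkg in ['torch', 'pytorch-lightning', 'transformers', 'torchmetrics']:
    try:
        print(pkg, pkg_resources.get_distribution(pkg).version)
    except pkg_resources.DistributionNotFound:
        print(pkg, 'not installed')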

KiteYN commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

> If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

Thanks! The pretraining performance is normal now! ^_^