zjunlp / MKGformer

[SIGIR 2022] Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion
MIT License
162 stars · 27 forks

About link prediction #4

Closed ZihaoZheng98 closed 2 years ago

ZihaoZheng98 commented 2 years ago

Hi, regarding the link prediction part: the paper seems to say that BERT's vocabulary is extended. Shouldn't the corresponding vocab or config file be uploaded as well? The snippet below is the only place in the code I found that seems related; if I missed something, please let me know. Thanks!

from transformers import BertConfig, BertModel, CLIPConfig

# Load the CLIP vision config and the BERT text config/weights from local paths (as in the repo's script)
vision_config = CLIPConfig.from_pretrained('/home/lilei/package/clip-vit-base-patch32').vision_config
text_config = BertConfig.from_pretrained('/home/lilei/package/bert-base-uncased')
bert = BertModel.from_pretrained('/home/lilei/package/bert-base-uncased')
flow3rdown commented 2 years ago

Hi, the entities are added at runtime, so there is no need to explicitly modify the vocab or config files.

For example, extending the tokenizer:

num_added_tokens = self.tokenizer.add_special_tokens({'additional_special_tokens': entity_list})

Then resize the model's embeddings accordingly:

# resize the word embedding layer
self.model.resize_token_embeddings(len(self.tokenizer))
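
Putting the two steps together, here is a minimal self-contained sketch of this runtime vocabulary extension (the entity token names below are placeholders, not the repo's actual entity list):

from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Hypothetical entity tokens; in practice these come from the KG's entity list.
entity_list = ['[ENTITY_0]', '[ENTITY_1]']

# Register the entities as special tokens so the tokenizer never splits them.
num_added_tokens = tokenizer.add_special_tokens({'additional_special_tokens': entity_list})

# Grow the word embedding matrix to cover the newly added tokens.
model.resize_token_embeddings(len(tokenizer))
print(num_added_tokens, len(tokenizer))
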
ZihaoZheng98 commented 2 years ago

Great, thanks! One more suggestion: in the README, the command seems to be written as bash **.py; that should probably be corrected.

flow3rdown commented 2 years ago

> Great, thanks! One more suggestion: in the README, the command seems to be written as bash **.py; that should probably be corrected.

Fixed, thanks for pointing it out.

liang-ry commented 2 years ago

What format should the WN18-images dataset for the MKG task be in? I downloaded it following the RSME instructions, but I cannot get the model to train on it.

flow3rdown commented 2 years ago

The WN18-images layout is as follows:

[screenshot: WN18-images directory structure]

Each subfolder is named by prefixing the entity id with an 'n'.
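
A minimal sketch for checking that layout (the root path and entity ids below are placeholders; only the 'n' + entity_id naming comes from this thread):

# Expected layout:
# WN18-images/
#   n00004475/   <- 'n' + entity id
#     ...image files for this entity...
import os

def missing_image_folders(image_root, entity_ids):
    # Return entity ids that lack an 'n'-prefixed image subfolder.
    return [eid for eid in entity_ids
            if not os.path.isdir(os.path.join(image_root, 'n' + eid))]

print(missing_image_folders('WN18-images', ['00004475']))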

liang-ry commented 2 years ago

I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

flow3rdown commented 2 years ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

The results on my side are normal. How does your model perform in the entity-embedding pretraining stage?

liang-ry commented 2 years ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

> The results on my side are normal. How does your model perform in the entity-embedding pretraining stage?

The entity embeddings look like this: [screenshot]. I am not sure whether my dataset is the problem; I used the WN18 data from KG-BERT directly. Could you provide a copy of the original WN18 dataset? Thank you very much.

flow3rdown commented 2 years ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

> The results on my side are normal. How does your model perform in the entity-embedding pretraining stage?

> The entity embeddings look like this: [screenshot]. I am not sure whether my dataset is the problem; I used the WN18 data from KG-BERT directly. Could you provide a copy of the original WN18 dataset? Thank you very much.

The WN18 entity and relation data have been uploaded under the MKG/dataset/WN18 path; due to license issues, the rest is not convenient to release publicly.

Maigewm commented 1 year ago

Hi, I ran into the same problem. Where exactly is the error coming from, and how did you resolve it?

flow3rdown commented 1 year ago

Hi, do you mean that the training results on WN18 are very poor?

Maigewm commented 1 year ago

Right. Here are my training results, which look very strange. [screenshot] This is WN18; the hit rates are surprisingly high. [screenshot] This is fbk; hit@1 is 0. I have retrained many times and always get these results; where could the problem be? The datasets were downloaded and extracted from the links you provided. Does the pretraining dataset need to be modified?

Maigewm commented 1 year ago

> I am using the same format, and the images inside the subfolders are named like 'n00004475_0.JPEG'. But the training results always come out like this: [screenshot of metrics]. Have you run into this problem?

Hi, I am running into the same situation as you. Did you manage to solve it, and how?

flow3rdown commented 1 year ago

> Right. Here are my training results, which look very strange. [screenshot] This is WN18; the hit rates are surprisingly high. [screenshot] This is fbk; hit@1 is 0. I have retrained many times and always get these results; where could the problem be? The datasets were downloaded and extracted from the links you provided. Does the pretraining dataset need to be modified?

Did you download the data from the Baidu Cloud link given in the README? Did you change the run script in any way? And how do the results look in the pretraining stage?

Maigewm commented 1 year ago

> Right. Here are my training results, which look very strange. [screenshot] This is WN18; the hit rates are surprisingly high. [screenshot] This is fbk; hit@1 is 0. I have retrained many times and always get these results; where could the problem be? The datasets were downloaded and extracted from the links you provided. Does the pretraining dataset need to be modified?

> Did you download the data from the Baidu Cloud link given in the README? Did you change the run script in any way? And how do the results look in the pretraining stage?

That's right, I used the data from the Baidu Cloud link and did not change the run script. The pretraining-stage results are also very poor, as shown below:

max number of filter entities : 4364 954
convert text to examples: 100%|██████████| 14951/14951 [00:00<00:00, 213264.86it/s]
100%|██████████| 14951/14951 [00:07<00:00, 1951.95it/s]
delete entities without text name.: 100%|██████████| 20466/20466 [00:00<00:00, 1050025.39it/s]
total entity not in text : 0
max number of filter entities : 4364 954
convert text to examples: 100%|██████████| 14951/14951 [00:00<00:00, 40823.01it/s]
100%|██████████| 14951/14951 [00:08<00:00, 1835.56it/s]
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3]
Testing: 0it [00:00, ?it/s]
/home/wangmeng/miniconda3/envs/MKG/lib/python3.7/site-packages/transformers/feature_extraction_utils.py:158: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484809535/work/torch/csrc/utils/tensor_new.cpp:201.)
  tensor = as_tensor(value)
(the warning above is emitted once per process, four times in total)
Testing DataLoader 0: 100%|██████████| 234/234 [03:00<00:00, 1.30it/s]

Test metric      DataLoader 0
Test/hits1       0.0
Test/hits10      0.0006019664236505919
Test/hits20      0.0011370476891177847
Test/hits3       0.0002006554745501973
Test/mean_rank   7485.40171226005
Test/mrr         0.0005941319530979928

KiteYN commented 1 year ago

Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

zxlzr commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

Hi, please see https://github.com/zjunlp/MKGformer/issues/17#issue-1429593813 and https://github.com/zjunlp/MKGformer/issues/16

KiteYN commented 1 year ago

Hello! Your email has been received! Thank you!

flow3rdown commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

KiteYN commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

> If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

Got it. However, after changing the pytorch and pytorch_lightning versions, I got the following error:

import pytorch_lightning as pl
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics
  File "/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/slyang/.conda/envs/nlp/lib/python3.7/site-packages/torchmetrics/utilities/data.py)

It seems to be a torchmetrics problem. What torchmetrics version are you using on your side?

flow3rdown commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

> If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

> Got it. However, after changing the pytorch and pytorch_lightning versions, I got the following error:
> ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (see the full traceback in the previous comment)
>
> It seems to be a torchmetrics problem. What torchmetrics version are you using on your side?

torchmetrics==0.7.3
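
For anyone double-checking their environment, a quick way to dump the installed versions and compare them against requirements.txt (the package names below are the usual PyPI distribution names, assumed here):

import pkg_resources  # ships with setuptools, works on Python 3.7

for pkg in ['torch', 'pytorch-lightning', 'transformers', 'torchmetrics']:
    try:
        print(pkg, pkg_resources.get_distribution(pkg).version)
    except pkg_resources.DistributionNotFound:
        print(pkg, 'not installed')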

KiteYN commented 1 year ago

> Hi, I also ran into the problem that, with the original scripts unchanged, the pretraining results on fb15k-237 are very poor, which in turn makes the results on the MKGC task very poor as well. What could be causing this?

> If the results are very poor, it is most likely an environment version issue; please keep pytorch, pytorch_lightning, and transformers consistent with requirements.txt.

Thanks! The pretraining performance is normal now! ^_^