lihan97 / KPGT

codes for KPGT (Knowledge-guided Pre-training of Graph Transformer)
Apache License 2.0

Question about a missing data file in the code for the paper "A knowledge-guided pre-training framework for improving molecular representation learning" #13

Closed: geng007 closed this issue 1 month ago

geng007 commented 2 months ago

While trying to reproduce your experiments, I ran into a problem with the code and would like to ask for your help. When I run the extract_features.py script, it fails because it cannot find a data file named rdkfp1-7_512.npz. The full error message is:

    Traceback (most recent call last):
      File "extract_features.py", line 61, in <module>
        extract_features(args)
      File "extract_features.py", line 30, in extract_features
        mol_dataset = MoleculeDataset(root_path=args.data_path, dataset=args.dataset, dataset_type=None)
      File "../src/data/finetune_dataset.py", line 24, in __init__
        fps = torch.from_numpy(sps.load_npz(ecfp_path).todense().astype(np.float32))
      File "/home/hugeng/miniconda3/envs/pytorch/lib/python3.7/site-packages/scipy/sparse/_matrix_io.py", line 123, in load_npz
        with np.load(file, **PICKLE_KWARGS) as loaded:
      File "/home/hugeng/miniconda3/envs/pytorch/lib/python3.7/site-packages/numpy/lib/npyio.py", line 417, in load
        fid = stack.enter_context(open(os_fspath(file), "rb"))
    FileNotFoundError: [Errno 2] No such file or directory: '../datasets/bace/rdkfp1-7_512.npz'

I could not find a download link for this file, or a way to generate it, in either the paper or the code repository. Could you provide the file, or explain how to generate it?

Thank you very much for taking the time to read this message. I look forward to your reply, and please feel free to contact me if anything needs further clarification.

lihan97 commented 1 month ago

You can generate the rdkfp1-7_512.npz file with a command like the following:

    python preprocess_downstream_dataset.py --data_path ../datasets/ --dataset bace

We have updated the README to include this instruction.

Thanks for pointing this out!
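
For anyone who hits the same FileNotFoundError, a small pre-flight check run before extract_features.py can make the missing step obvious. The following is only a sketch and is not part of the KPGT repository: the helper names missing_feature_files and preprocess_command are hypothetical, and the list of required files contains only the one file name mentioned in this thread (the repository may expect other cached files as well). The command it prints is the one given in the reply above.

```python
from pathlib import Path

# Assumption: rdkfp1-7_512.npz is the only file name taken from this thread;
# the repository may require additional cached feature files.
REQUIRED_FILES = ("rdkfp1-7_512.npz",)


def missing_feature_files(data_path, dataset, required=REQUIRED_FILES):
    """Return the required files that are absent from <data_path>/<dataset>/."""
    dataset_dir = Path(data_path) / dataset
    return [name for name in required if not (dataset_dir / name).is_file()]


def preprocess_command(data_path, dataset):
    """Build the preprocessing command quoted in the maintainer's reply."""
    return [
        "python", "preprocess_downstream_dataset.py",
        "--data_path", str(data_path),
        "--dataset", dataset,
    ]


if __name__ == "__main__":
    # Paths and dataset name match the example in this thread.
    missing = missing_feature_files("../datasets/", "bace")
    if missing:
        print("Missing:", ", ".join(missing))
        print("Run first:", " ".join(preprocess_command("../datasets/", "bace")))
    else:
        print("All expected feature files are present.")
```

The check only looks at whether the file exists. It does not verify that the contents are a valid sparse matrix that sps.load_npz can read.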