Closed: xingjian-zhang closed this 1 year ago
This is an automatic reminder to paste the local test results of wiki as a comment in this PR, in case you haven't done so. The aforementioned datasets are too large to be tested with the GitHub Actions workflow here. The local test result for each dataset can be obtained by running `make pytest DATASET=<dataset name>`. For more details, please refer to the dataset submission guide.
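For example, for the wiki dataset flagged above, the invocation would be:

```bash
make pytest DATASET=wiki
```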
This is expected, as we are modifying all datasets by removing their urls.json files.
Pytests failed: log. The failures consist of two parts; one is `KeyError: 'predict_tail'`, raised for all `KGEntityPrediction` tasks. I think the test failures are triggered by code from before this PR.
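For context, a `KeyError` of this shape typically means the task loader indexes its config dict directly for a field the task file does not define. A purely illustrative sketch (the `task` dict below is hypothetical, not the actual GLI task schema):

```python
# Hypothetical illustration only, not GLI's actual loader code: a
# direct dict lookup turns a missing task field into a bare KeyError.
task = {"type": "KGEntityPrediction", "predict_head": [0, 1, 2]}
try:
    targets = task["predict_tail"]  # raises KeyError: 'predict_tail'
except KeyError as err:
    print("missing task field:", err)
```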
I tried to run the following code and successfully got the URL for snap_patents.npz:

```python
from gli.utils import _get_url_from_server

print(_get_url_from_server('snap_patents.npz'))
```

The result is 'https://www.dropbox.com/s/yplq00csa3vyogp/snap_patents.npz?dl=0'. Maybe the HTTPS server is unstable?
```
In [1]: from gli.utils import _get_url_from_server

In [2]: from gli import get_gli_graph

In [3]: for i in range(5):
   ...:     print(_get_url_from_server('snap_patents.npz'))
   ...:
https://www.dropbox.com/s/yplq00csa3vyogp/snap_patents.npz?dl=0
https://www.dropbox.com/s/yplq00csa3vyogp/snap_patents.npz?dl=0
https://www.dropbox.com/s/yplq00csa3vyogp/snap_patents.npz?dl=0
https://www.dropbox.com/s/yplq00csa3vyogp/snap_patents.npz?dl=0
https://www.dropbox.com/s/yplq00csa3vyogp/snap_patents.npz?dl=0

In [4]: get_gli_graph('snap-patents')
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In [4], line 1
----> 1 get_gli_graph('snap-patents')

File ~/Projects/Private/gli/gli/dataloading.py:139, in get_gli_graph(dataset, device, verbose)
    137 if not os.path.exists(metadata_path):
    138     raise FileNotFoundError(f"{metadata_path} not found.")
--> 139 download_data(dataset, verbose=verbose)
    141 return read_gli_graph(metadata_path, device=device, verbose=verbose)

File ~/Projects/Private/gli/gli/utils.py:367, in download_data(dataset, verbose)
    365     data_file_url_dict[data_file] = url_dict[data_file]
    366 else:
--> 367     raise FileNotFoundError(f"cannot find url for {data_file}.")
    369 for data_file_name, url in data_file_url_dict.items():
    370     data_file_path = os.path.join(data_dir, data_file_name)

FileNotFoundError: cannot find url for snap-patents.npz.
```
I can fetch the URL directly by calling `_get_url_from_server`, but the fetch fails when it is called inside `get_gli_graph()`. This is unexpected. Let me take a closer look at this issue.
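One way to narrow this down is to probe both spellings of the file name directly against the URL server. A quick hedged sketch; what `_get_url_from_server` does for a missing file (raise vs. return a falsy value) is an assumption here, so both outcomes are printed:

```python
# Hedged probe: check both spellings against the URL server. The
# session above succeeds with the underscored name but get_gli_graph
# fails on the hyphenated one, so compare them side by side.
from gli.utils import _get_url_from_server

for name in ("snap_patents.npz", "snap-patents.npz"):
    try:
        print(name, "->", _get_url_from_server(name))
    except Exception as err:
        print(name, "-> failed:", type(err).__name__, err)
```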
Fixed the `predict_tail` error via #468.

Found the bug: `snap_patents.npz` exists in remote storage, but `metadata.json` uses `snap-patents.npz`, which does not exist there. `twitch-gamers` and `arxiv-year` share the same issue. I have temporarily fixed them by modifying the corresponding `metadata.json` files manually. This vulnerability will be resolved in the future once we enforce a function-based interface for contributing datasets.
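For reference, a minimal sketch of the kind of manual fix described above, rewriting the stale file name in place; the per-dataset path `datasets/snap-patents/metadata.json` is an assumption about the repo layout:

```python
# Hedged sketch of the manual metadata fix: point the stale hyphenated
# file name at the underscored file that actually exists remotely.
from pathlib import Path

meta = Path("datasets/snap-patents/metadata.json")  # assumed path
meta.write_text(meta.read_text().replace("snap-patents.npz", "snap_patents.npz"))
```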
Description
Related Issue
This PR attempts to fix #462, #425, and #398.
Motivation and Context
How Has This Been Tested?
This change does not involve source code changes.