TrustedLLM / LLMDet

LLMDet is a text detection tool that identifies the source of a given text (e.g. a large language model or a human writer).
MIT License

ConnectionError: Couldn't reach 'xsum' on the Hub (ProxyError) #1

Open QiShanZhang opened 1 year ago

QiShanZhang commented 1 year ago

I am using datasets 2.13.1 (build pyhd8ed1ab_0, from conda-forge). I can't run dataset.py. Please help me. Thank you very much.

cansee5 commented 1 year ago

If you only need the detection feature of "llmdet" and don't need the dataset, you can simply follow the usage instructions in README.md.

To resolve your issue: if datasets.load_dataset() fails with a "cannot connect" error on the server even though the server has internet access, you can download the dataset on your personal computer and save it locally with the following code:

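from datasets import load_dataset

# Download XSum from the Hub on a machine that can reach it.
dataset = load_dataset('xsum')
# Write the dataset to a local directory that can be copied to another machine.
dataset.save_to_disk('./xsum')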
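Before copying the saved directory to the server, it may be convenient to pack it into a single archive, since a saved dataset is a folder containing several files. The following is a minimal sketch using only the Python standard library. The helper names `pack_dataset` and `unpack_dataset` and the archive name are assumptions made for illustration; they are not part of LLMDet or of the datasets library.

```python
import shutil
from pathlib import Path


def pack_dataset(dataset_dir="./xsum", archive_base="xsum"):
    """Pack a saved dataset directory into <archive_base>.tar.gz and return the archive path."""
    src = Path(dataset_dir).resolve()
    # root_dir/base_dir keep the top-level folder name (e.g. "xsum") inside the archive.
    return shutil.make_archive(
        archive_base, "gztar", root_dir=str(src.parent), base_dir=src.name
    )


def unpack_dataset(archive_path, target_dir="."):
    """Unpack the archive; this recreates the dataset folder under target_dir."""
    shutil.unpack_archive(archive_path, target_dir)
```

Run `pack_dataset()` on the personal computer, transfer the resulting archive with whatever tool you normally use, and run `unpack_dataset("xsum.tar.gz")` on the server so that `./xsum` exists there again.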

Afterward, upload the saved directory to the server. You can then load the dataset there from disk, which avoids problems caused by unstable or unresponsive network connections during online downloads. The dataset is loaded on the server as follows:

from datasets import load_from_disk

# Load the directory written by save_to_disk(); this reads from local disk only.
raw_dataset = load_from_disk("./xsum")
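If the same script has to run both on a machine with Hub access and on the server, the two snippets above can be combined into one helper that prefers the local copy. This is only a sketch: `load_xsum` and its `local_dir` parameter are hypothetical names, not part of LLMDet, and the function simply wraps the `load_dataset`, `save_to_disk`, and `load_from_disk` calls already shown in this thread.

```python
import os


def load_xsum(local_dir="./xsum"):
    """Load XSum from a local copy if one exists, otherwise download and save it."""
    # Imported inside the function so a missing 'datasets' install only
    # raises when the helper is actually called.
    from datasets import load_dataset, load_from_disk

    if os.path.isdir(local_dir):
        # Offline path: read the directory written earlier by save_to_disk().
        return load_from_disk(local_dir)

    # Online path: this is the call that can raise ConnectionError
    # when the Hub is unreachable.
    dataset = load_dataset("xsum")
    dataset.save_to_disk(local_dir)
    return dataset
```

On the server, calling `raw_dataset = load_xsum("./xsum")` after uploading the directory takes the offline path and never contacts the Hub.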