X-PLUG / Youku-mPLUG

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
Apache License 2.0

oss2.exceptions.NoSuchKey when downloading Youku-mPLUG via ModelScope #16

Open ricardo-young-ui opened 11 months ago

ricardo-young-ui commented 11 months ago

When I download the Youku-mPLUG dataset with the modelscope library, an oss2.exceptions.NoSuchKey exception is raised. Here is the error output:

oss2.exceptions.NoSuchKey: {'status': 404, 'x-oss-request-id': '652D2F6BEEC74232305394E8', 'details': {'Code': 'NoSuchKey', 'Message': 'The specified key does not exist.', 'RequestId': '652D2F6BEEC74232305394E8', 'HostId': 'dataset-hub.oss-cn-hangzhou.aliyuncs.com', 'Key': 'public-zip/modelscope/Youku-AliceMind/master/videos/pretrain/14111B1211bJ4551C43BJJbb23Y---3A4Y1b17aE3C5a5JJ-aBY81aA-JE4838YbAF.mp4', 'EC': '0026-00000001', 'RecommendDoc': 'https://help.aliyun.com/zh/oss/support/0026-00000001'}}
Downloading data files:   0%|                                                                                                                  | 0/1 [16:05<?, ?it/s]

This is the Python script I used:

from modelscope.hub.api import HubApi
from modelscope.msdatasets import MsDataset
from modelscope.utils.constant import DownloadMode
api = HubApi()
sdk_token = ""  # required; get it from your personal center on the ModelScope website
api.login(sdk_token)  # online
data = MsDataset.load(
    'Youku-AliceMind',
    # download_mode=DownloadMode.FORCE_REDOWNLOAD,    # if you need to clean the cache , please use it
    subset_name="pretrain",
    cache_dir="./data")

print(next(iter(data)))

# Slicing
len(data)
data_new = data[10:15]
for item in data_new:
    print(item)

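(1) How can I get the download to complete normally? (2) Is resuming an interrupted download supported? I have already downloaded part of the data. (3) Can files that cannot be found be skipped automatically, so that the rest of the data keeps downloading?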
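
For reference, the behavior asked about in (2) and (3) can be sketched as below. This is a minimal sketch and not how MsDataset.load works internally: `download_with_skip`, `fetch` and `missing_exc` are hypothetical names introduced here for illustration, and the caller is assumed to already have the list of object keys. A file that already exists in the destination directory is treated as downloaded (resume), and a key whose fetch raises the "missing" exception is recorded and skipped instead of aborting the run.

```python
import os
from typing import Callable, Iterable, List, Tuple, Type


def download_with_skip(
    keys: Iterable[str],
    fetch: Callable[[str, str], None],
    dest_dir: str,
    missing_exc: Type[BaseException] = KeyError,
) -> Tuple[List[str], List[str]]:
    """Fetch every key into dest_dir and return (downloaded, missing).

    `fetch(key, path)` is a caller-supplied (hypothetical) function that
    writes the object for `key` to `path` and raises `missing_exc` when
    the object does not exist on the server.
    """
    os.makedirs(dest_dir, exist_ok=True)
    downloaded: List[str] = []
    missing: List[str] = []
    for key in keys:
        target = os.path.join(dest_dir, os.path.basename(key))
        if os.path.exists(target):
            # Resume: this file was completed in an earlier run.
            continue
        tmp = target + ".part"
        try:
            # Write to a temporary name so a crash never leaves a
            # half-written file that looks complete.
            fetch(key, tmp)
        except missing_exc:
            # Skip keys that do not exist and keep going.
            missing.append(key)
            if os.path.exists(tmp):
                os.remove(tmp)
            continue
        os.replace(tmp, target)
        downloaded.append(key)
    return downloaded, missing
```

With the oss2 SDK, one could presumably pass `fetch=lambda key, path: bucket.get_object_to_file(key, path)` and `missing_exc=oss2.exceptions.NoSuchKey`, given an `oss2.Bucket` with read access. Whether the dataset hub bucket can be accessed this way is an assumption that has not been checked here.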

Xujianzhong commented 11 months ago

Have you managed to solve this problem?

lucasjinreal commented 3 months ago

@xhyandwyy X-P This issue from 2023 still hasn't been resolved?