X-PLUG / Youku-mPLUG

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
Apache License 2.0

How to download the videos of the pretrain dataset? #6

Open flash-barry opened 1 year ago

flash-barry commented 1 year ago

When I use the download code below, it returns the git-lfs entry rather than the videos.

from modelscope.msdatasets import MsDataset
from modelscope.utils.constant import DownloadMode

data = MsDataset.load(
    'Youku-AliceMind',
    namespace='modelscope',
    download_mode=DownloadMode.FORCE_REDOWNLOAD,  # use this if you need to clear the cache
    subset_name='pretrain',
    split='train',  # options: train, test, validation
    use_streaming=True)

# Print the first streamed record
print(next(iter(data)))

How can I download all 36 TB of video from git-lfs?
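
To check whether what comes back is a Git LFS pointer rather than actual video bytes, the first bytes of the returned content can be inspected. The sketch below assumes the standard Git LFS pointer layout: a small text file whose first line is `version https://git-lfs.github.com/spec/v1`, followed by `oid` and `size` lines. The helper name `parse_lfs_pointer` is hypothetical and not part of ModelScope, and whether the dataset actually returns such a pointer is an assumption.

```python
from typing import Optional

# First line of every Git LFS pointer file (per the Git LFS pointer layout).
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(data: bytes) -> Optional[dict]:
    """Return {'oid': ..., 'size': ...} if data looks like a Git LFS pointer, else None.

    Hypothetical helper for illustration; not a ModelScope API.
    """
    if not data.startswith(LFS_PREFIX):
        return None

    # Pointer files are "key value" lines; collect them into a dict.
    fields = {}
    for line in data.decode("utf-8", errors="replace").splitlines():
        key, _, value = line.partition(" ")
        if key and value:
            fields[key] = value

    if "oid" not in fields or "size" not in fields:
        return None
    try:
        size = int(fields["size"])
    except ValueError:
        return None
    return {"oid": fields["oid"], "size": size}


if __name__ == "__main__":
    sample = b"version https://git-lfs.github.com/spec/v1\noid sha256:abc123\nsize 42\n"
    print(parse_lfs_pointer(sample))
```

If the helper returns a dict, the content is only a pointer: it records the hash and byte size of the real file, which is stored separately from the pointer itself.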

fanbooo commented 12 months ago

+1