alibaba / clusterdata

cluster data collected from production clusters in Alibaba for cluster management research

Problem downloading .tar.gz files #180

Closed by kapedalex 1 year ago

kapedalex commented 1 year ago

After a simple copy, the .tar.gz files are only 1 kilobyte in size, have the wrong SHA-256 checksum, and are unreadable. Cloning the repository produces the following output:

Cloning into 'E:...\clusterdata'...
remote: Enumerating objects: 440, done.
remote: Counting objects: 100% (119/119), done.
remote: Compressing objects: 100% (66/66), done.
remote: Total 440 (delta 84), reused 62 (delta 53), pack-reused 321
Receiving objects: 100% (440/440), 22.80 MiB | 72.00 KiB/s, done.
Resolving deltas: 100% (195/195), done.
Updating files: 100% (79/79), done.
Downloading cluster-trace-gpu-v2020/data/pai_group_tag_table.tar.gz (55 MB)
Error downloading object: cluster-trace-gpu-v2020/data/pai_group_tag_table.tar.gz (722fef3): Smudge error: Error downloading cluster-trace-gpu-v2020/data/pai_group_tag_table.tar.gz (722fef30b7fb7aa50dabd79155614b5423a9d65cf45a9b26c590d57725423a14): batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.

Errors logged to 'E:...\clusterdata.git\lfs\logs\20230305T175916.1522626.log'.
Use git lfs logs last to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: cluster-trace-gpu-v2020/data/pai_group_tag_table.tar.gz: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

As I understand it, the problem is not on my side. Can the data be obtained now or in the future? P.S.: If it is my fault, could you please explain what the problem is?
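The 1 kilobyte files described above are consistent with Git LFS pointer files: small text stubs that stay in the working tree when the real object cannot be fetched. The following is a minimal Python sketch, not part of the repository, for checking whether a downloaded file is such a stub and for computing a file's SHA-256. The function names `read_lfs_pointer` and `sha256_of` are hypothetical helpers introduced for illustration; the pointer layout (a `version` line, an `oid sha256:` line, and a `size` line) follows the Git LFS pointer format.

```python
import hashlib
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_SPEC_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def read_lfs_pointer(path):
    """Return {'oid': ..., 'size': ...} if `path` is an LFS pointer, else None.

    Hypothetical helper: it only inspects the file on disk and never
    contacts a server.
    """
    p = Path(path)
    # Pointer files are tiny text files (under 1024 bytes).
    if p.stat().st_size >= 1024:
        return None
    data = p.read_bytes()
    if not data.startswith(LFS_SPEC_PREFIX):
        return None
    fields = {}
    for line in data.decode("utf-8").splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    oid = fields.get("oid", "")
    if not oid.startswith("sha256:") or "size" not in fields:
        return None
    return {"oid": oid[len("sha256:"):], "size": int(fields["size"])}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If `read_lfs_pointer` returns a dictionary, the file is only a stub, which would explain both the small size and the checksum mismatch. Once the real archive has been obtained, `sha256_of` on it should match the `oid` recorded in the pointer.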

anoopV12 commented 1 year ago

I am also facing the same issue. Can anyone help with this dataset?

blackbaba980 commented 1 year ago

Same here

arashasg commented 1 year ago

same here

ofircohen205 commented 1 year ago

same here

qzweng commented 1 year ago

Due to the recent surge in pulls, this repository is over the LFS quota. We have tried to bypass the problem by providing the traces via Aliyun OSS. The files can be downloaded as follows:

(Alternative: data repo on GitHub)

Thank you for your support and understanding!

qzweng commented 1 year ago

Fixed by commit https://github.com/alibaba/clusterdata/commit/02a1be14f86d8221b53d3cbbde2c2b60ea68b95a