awsm-research / A3Test


Dataset - Git LFS Issue #1

Open boraelci opened 1 year ago

boraelci commented 1 year ago

I am not able to access the dataset files because when I do git lfs pull, I run into the following error:

batch response: This repository is over its data quota.
Account responsible for LFS bandwidth should purchase more data packs to restore access.
Failed to fetch some objects from 'https://github.com/awsm-research/A3Test.git/info/lfs'

How can I go about resolving it? Or, is there another link I can use to download the dataset?

itssrinath commented 1 year ago

Judging by the SHA256 hashes of the files, they were originally taken from here: https://github.com/microsoft/methods2test/tree/main/corpus/json

However, I do not know how these were actually used to create the "training.csv" file that is used as the input file in the commands shown in the main README. If you do figure it out, please let me know. I think it should be quite easy to modify the script to use the JSONs as they are, though.
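For anyone who wants to repeat the hash comparison mentioned above, here is a minimal sketch. It assumes the files left in the clone are standard Git LFS pointer files (small text files with `version`, `oid sha256:...` and `size` lines) and that you have separately downloaded a candidate file from methods2test. The function names are hypothetical and are not part of the A3Test scripts.

```python
import hashlib


def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    return fields


def pointer_sha256(text):
    """Return the hex SHA256 digest recorded in the pointer's oid line."""
    algo, _, digest = parse_lfs_pointer(text).get("oid", "").partition(":")
    if algo != "sha256" or not digest:
        raise ValueError("no 'oid sha256:<hex>' line found in pointer text")
    return digest


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_pointer(pointer_text, candidate_path):
    """True if the candidate file's hash equals the hash stored in the pointer."""
    return file_sha256(candidate_path) == pointer_sha256(pointer_text)
```

Usage would look like `matches_pointer(open("path/to/pointer").read(), "path/to/downloaded/file")`, where both paths are placeholders. A match only shows that the two files are byte-identical; it does not explain how "training.csv" was built from them.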

raed19 commented 1 year ago

I am running into the same issue. I do not know how to download the training and evaluation datasets mentioned in the README.

livmortis commented 8 months ago

The LFS storage is full, so the author needs to purchase more or clear the repository.

poojabiradar1 commented 3 months ago

Hello, has anyone been able to download the dataset yet? I do not understand how the files are extracted from the .tar file. When I try to extract them, I am told the files are corrupted. Can someone who is also working on this please help me?

Xhcgks211 commented 2 months ago

I have the same problem. When I try to extract the JSON content and convert it to train.csv with extractContentDataInCsv.py, I find no JSON files in the dataset.
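One possible explanation for the "corrupted" archives and the missing JSON files, offered as an assumption rather than a confirmed diagnosis: if the LFS download failed because of the quota error, the files in the working tree may still be small Git LFS pointer stubs rather than real archives. The sketch below checks which case applies to a given file. The function name `classify` and its return labels are hypothetical.

```python
import tarfile

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def classify(path):
    """Return 'lfs-pointer', 'tar', or 'other' for the file at path."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    if head == LFS_POINTER_PREFIX:
        # The real content was never downloaded; only the stub is present.
        return "lfs-pointer"
    if tarfile.is_tarfile(path):
        # Readable tar archive (plain or compressed with gzip, bzip2, or xz).
        return "tar"
    return "other"
```

If a file is reported as `lfs-pointer`, extracting it cannot work, and the data would have to come from another source, such as the methods2test link given earlier in this thread.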