princeton-nlp / SWE-bench

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
https://www.swebench.com
MIT License
1.81k stars, 311 forks

Repository not found while running python3 create_text_dataset.py #98

Closed hsm1997 closed 3 months ago

hsm1997 commented 5 months ago

GITHUB_TOKEN=xxxxx python3 create_text_dataset.py --dataset_name_or_path /data/to/my/swe_bench_orcale --output_dir ./base_datasets --prompt_style style-2 --file_source oracle --split train

I get a "Repository not found" error (screenshot, 2024-04-18 19:14:01).

And I see that the repo indeed does not exist here: https://github.com/orgs/swe-bench/repositories?type=all

How can I make the code reproducible?

john-b-yang commented 5 months ago

Hi @hsm1997. Your command above should now work, although I'm only confident it works for the Qiskit__qiskit repository. Many other repositories under the github.com/swe-bench organization may still be private.

I think a better alternative might be to change this line to:

repo_url = (
    f"https://{token}@github.com/"
    + instance["repo"]
    + ".git"
)
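As a minimal, self-contained sketch of that change: the helper name `build_repo_url` is hypothetical (it is not part of create_text_dataset.py), and it assumes `instance["repo"]` holds an `owner/name` string for the upstream repository, so the clone no longer depends on a mirror under the swe-bench organization.

```python
def build_repo_url(instance: dict, token: str) -> str:
    """Build a token-authenticated clone URL for the upstream repository.

    Hypothetical helper for illustration only; assumes instance["repo"]
    is an "owner/name" string such as "Qiskit/qiskit".
    """
    return f"https://{token}@github.com/" + instance["repo"] + ".git"


# Illustrative instance; real dataset rows carry many more fields.
example_instance = {"repo": "Qiskit/qiskit"}
repo_url = build_repo_url(example_instance, token="xxxxx")
```

Note that the token ends up embedded in the URL, so it is worth avoiding printing or logging `repo_url` as-is.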

Hope this helps.

john-b-yang commented 3 months ago

Closing this issue due to inactivity.