paperswithcode / paperswithcode-data

The full dataset behind paperswithcode.com

Paper with multiple repositories #2

Closed nickspell closed 5 years ago

nickspell commented 5 years ago

Hello, I have seen that in the .json you make available there is only one GitHub link associated with each paper, and most of the time it is not even one of the most relevant repositories. However, on the website multiple repos are listed and ordered by stars. Do you know how I can access the full list of repositories (or at least the best ones)? Thanks for your help.

rstojnic commented 5 years ago

Hi @nickspell, can you give a few more details, with examples? All links between papers and code should be in this JSON:

https://paperswithcode.com/media/about/links-between-papers-and-code.json.gz

Granted, it doesn't have the current number of stars, but we could add that if useful.

Cheers, Robert

nickspell commented 5 years ago

Hi, thanks for the quick answer. I'll close the issue since I made a mistake when analyzing your JSON. I have just seen that there can be multiple entries for the same paper, each linking to a different repository, e.g.:

{
    "paper_title" : "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset",
    "paper_arxiv_id" : "1705.07750",
    "paper_url_abs" : "http://arxiv.org/abs/1705.07750v3",
    "paper_url_pdf" : "http://arxiv.org/pdf/1705.07750v3.pdf",
    "repo_url" : "https://github.com/ahsaniqbal/Kinetics-FeatureExtractor",
    "mentioned_in_paper" : false,
    "mentioned_in_github" : true
},
{
    "paper_title" : "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset",
    "paper_arxiv_id" : "1705.07750",
    "paper_url_abs" : "http://arxiv.org/abs/1705.07750v3",
    "paper_url_pdf" : "http://arxiv.org/pdf/1705.07750v3.pdf",
    "repo_url" : "https://github.com/coderSkyChen/Action_Recognition_Zoo",
    "mentioned_in_paper" : false,
    "mentioned_in_github" : true
}

Maybe it is not super-efficient (everything except repo_url and the mentioned_in_* fields is duplicated), but it does the job. I hadn't spotted the extra entries because they are not consecutive in the array. Everything is perfectly fine.
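For anyone who runs into the same confusion, here is a minimal sketch of how the repeated entries could be collapsed into one list of repositories per paper. It assumes the file has been downloaded and is a gzipped JSON array of objects shaped like the two above. The helper names `load_links` and `group_repos_by_paper` are made up for illustration, and falling back to `paper_url_abs` when `paper_arxiv_id` is missing is my own assumption.

```python
import gzip
import json
from collections import defaultdict


def load_links(path):
    """Read the gzipped JSON array of paper-to-repository links from a local file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


def group_repos_by_paper(entries):
    """Map each paper key to the list of distinct repository URLs linked to it."""
    repos = defaultdict(list)
    for entry in entries:
        # Assumption: fall back to the abstract URL if there is no arXiv ID.
        key = entry.get("paper_arxiv_id") or entry.get("paper_url_abs")
        url = entry.get("repo_url")
        if key and url and url not in repos[key]:
            repos[key].append(url)
    return dict(repos)


# Two entries for the same paper, trimmed to the fields used here.
sample = [
    {"paper_arxiv_id": "1705.07750",
     "repo_url": "https://github.com/ahsaniqbal/Kinetics-FeatureExtractor"},
    {"paper_arxiv_id": "1705.07750",
     "repo_url": "https://github.com/coderSkyChen/Action_Recognition_Zoo"},
]
grouped = group_repos_by_paper(sample)
```

With the real file, `group_repos_by_paper(load_links("links-between-papers-and-code.json.gz"))` should give the full list for each paper, regardless of where the entries sit in the array.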

As for the number of stars, I can get it from the GitHub API directly, so there is no real need for that. Thanks again.
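A rough sketch of the star lookup, in case it is useful to others: it assumes every `repo_url` has the form `https://github.com/<owner>/<repo>` and reads the `stargazers_count` field from the GitHub REST endpoint `/repos/{owner}/{repo}`. The function names `repo_api_url` and `star_count` are hypothetical, and the optional `token` argument is my own addition, since unauthenticated requests are rate-limited.

```python
import json
import urllib.request
from urllib.parse import urlparse


def repo_api_url(repo_url):
    """Turn https://github.com/<owner>/<repo> into the matching REST API URL."""
    parts = urlparse(repo_url).path.strip("/").split("/")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"not a repository URL: {repo_url}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"https://api.github.com/repos/{owner}/{repo}"


def star_count(repo_url, token=None):
    """Fetch the current star count for one repository (makes a network request)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Assumption: a personal access token is passed in to raise the rate limit.
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(repo_api_url(repo_url), headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["stargazers_count"]
```

Sorting one paper's repositories by `star_count` would then roughly reproduce the ordering shown on the website, though the counts will reflect the moment of the request rather than whatever the site has cached.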

Nicola