Open himanshuragtah1 opened 10 months ago
Thanks for the suggestion! I'm still on vacation, so will check this out later. Can you add an example URL that this would link to? And maybe a brief explanation for someone unfamiliar with CatalyzeX what this provides that isn't covered by our existing Papers with Code integration yet?
Hope you had a wonderful vacation, and a great start to the new year! :)
Here is an example CatalyzeX url
that corresponds to this ACL paper:
Regarding Papers with Code: the functionality is similar (surfacing open-source implementations for a paper), but CatalyzeX has a larger, fast-growing collection of code implementations (approaching a million) that can complement what's currently surfaced for papers on ACL Anthology.
We already do this through live integrations on arXiv and OpenReview.
We continually crawl GitHub, Bitbucket, GitLab, SourceForge, and various personal, academic, and professional webpages, and we also receive code submissions via our website and browser extensions.
Hope this helps clarify, and please let us know if you have any questions. Looking forward to discussing next steps.
@mbollmann @akoehn — just following up here. Any next steps here or anything we can help with to move this forward? :)
@mjpost Do you have an opinion on this feature? I didn’t get around to taking a closer look at this yet, but @himanshuragtah1 says (via e-mail) that they can have a PR ready very quickly if we wanted to integrate this.
There is one question I have: our Papers with Code integration only shows a code link when we actually have code. I think this is good practice, and we also use this information in publication lists: see the [|||] symbol.
I am not sure how we should handle two data sources here.
Regarding the type of the integration: would you plan to use the same kind of integration (i.e. sending pull requests to add the links), or do you want to add a general JavaScript widget to the pages?
[chatgpt please insert sorry for late reply boilerplate]
Hi @himanshuragtah1—thanks for submitting the request, and I'm sorry that I've only now been able to look at this.
First, a few questions:
I'm open to this, but it would largely depend on how easy you can make the integration, since we are volunteer-run. This includes:
Hi @mjpost — Sorry for the late reply. Thanks for taking the time to have a look at this code integration proposal :)
As suggested, we would actually prefer to have a JS widget that is capable of performing real-time requests to our own server for checking code availability, and then modifying the DOM accordingly from there. With this, we see a couple of advantages:
And of course, we will compactly handle both CX and PWC buttons, by introducing a dropdown like the one we shared in the screenshot above.
@akoehn: Regarding handling two data sources in the publication list: to keep it simple, we'll just add another icon there. When only one source has code, there will be just one code icon. Either way, the end user benefits from having access to some code to work with and build upon.
Regarding the type of the integration: we would like to make as few changes as possible to your XML files and to your codebase in general. In our integrations with other providers, like arXiv, we use our own JavaScript widget that fetches code information on the client side. This simplifies the integration and means the results shown are always up to date.
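To make the client-side approach concrete, here is a minimal sketch of the idea, not the actual CatalyzeX widget. The endpoint URL, the `codeUrl` response field, and every function name are assumptions made up for illustration; nothing in this thread specifies them. It asks a lookup service whether code exists for a paper, treats any failure as "no code", and renders either nothing, a single link, or a dropdown when several sources have code.

```typescript
// All names below (endpoint, response field, functions) are hypothetical.
interface CodeLink {
  source: string; // e.g. "Papers with Code" or "CatalyzeX"
  url: string;
}

// Minimal shape of a fetch-like function, so the sketch can be
// exercised without a browser.
type Fetcher = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Placeholder endpoint; the real service URL is not given in this thread.
const LOOKUP_ENDPOINT = "https://example.org/api/code";

// Ask the (assumed) lookup service whether code exists for a paper.
// Any failure is treated as "no code" so the page itself never breaks.
async function lookupCode(paperId: string, fetcher: Fetcher): Promise<CodeLink | null> {
  try {
    const res = await fetcher(`${LOOKUP_ENDPOINT}?paper=${encodeURIComponent(paperId)}`);
    if (!res.ok) return null;
    const data = (await res.json()) as { codeUrl?: unknown };
    return typeof data.codeUrl === "string" ? { source: "CatalyzeX", url: data.codeUrl } : null;
  } catch {
    return null;
  }
}

// Escape text before placing it into HTML attributes or element content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// No links: empty string. One link: a plain anchor. Several: a dropdown.
function renderCodeLinks(links: CodeLink[]): string {
  const items = links.map(
    (l) => `<a href="${escapeHtml(l.url)}">${escapeHtml(l.source)}</a>`
  );
  if (items.length <= 1) return items.join("");
  return `<details><summary>Code</summary>${items.join("")}</details>`;
}
```

In a browser, one would pass `(url) => fetch(url)` as the fetcher, combine the result with any existing Papers with Code link, and assign the output of `renderCodeLinks` to the `innerHTML` of a container element on the paper page. Rendering nothing when no source has code keeps the existing behavior of showing a code link only when code is actually available.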
Let me know if all this sounds good, and we can open a PR shortly for your review. :)
Added this issue here as instructed by @akoehn
-- I propose an integration with CatalyzeX that finds and links to code implementations for papers. This would be a great enhancement to ACL Anthology's current coverage of code.
We can open a pull request against your repo shortly for your review.
Here's what it would look like:
In case other sources have code, it can be shown in the dropdown as well.