zotero / translators

Zotero Translators
http://www.zotero.org/support/dev/translators

Update arXiv translator to use recommended Atom API instead of OAI #3366

Closed: thebluepotato closed this 1 month ago

thebluepotato commented 1 month ago

Based on various tests, the oai2 endpoint currently in use is very slow (up to 20 s for a single query), whereas the Atom API endpoint documented by arXiv is much faster. This is currently a WIP.

This seems to be a similar idea to #3168.
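For readers unfamiliar with the Atom endpoint, here is a minimal Python sketch of what a single-ID lookup could look like. It is not the translator's code (translators are written in JavaScript); the function names `build_query_url` and `parse_entry` are hypothetical, and the sketch only assumes the documented `export.arxiv.org/api/query` endpoint with its `id_list` parameter and the standard Atom and arXiv XML namespaces.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces used in arXiv Atom responses.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}
API_BASE = "http://export.arxiv.org/api/query"


def build_query_url(arxiv_id: str) -> str:
    """Build an Atom API URL that looks up one arXiv ID (hypothetical helper)."""
    return f"{API_BASE}?{urlencode({'id_list': arxiv_id, 'max_results': 1})}"


def parse_entry(feed_xml) -> dict:
    """Extract a few fields from the first <entry> of an Atom feed (str or bytes)."""
    root = ET.fromstring(feed_xml)
    entry = root.find("atom:entry", NS)
    if entry is None:
        raise ValueError("feed contains no <entry>")
    # Titles may be wrapped across lines in the feed, so collapse whitespace.
    title = " ".join(entry.findtext("atom:title", default="", namespaces=NS).split())
    authors = [
        author.findtext("atom:name", default="", namespaces=NS)
        for author in entry.findall("atom:author", NS)
    ]
    return {
        "id": entry.findtext("atom:id", default="", namespaces=NS),
        "title": title,
        "authors": authors,
        # <arxiv:doi> is optional, so this may be None.
        "doi": entry.findtext("arxiv:doi", default=None, namespaces=NS),
    }
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response body to `parse_entry` would complete the round trip; the network call is left out so the sketch stays testable offline.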

AbeJellinek commented 1 month ago

Thanks! I'd hold off on any further changes here for a sec because I do think we want to get #3168 merged. I'll try to do that this week. (The oai2 endpoint is really, really slow right now, but I don't remember that always being the case...)

AbeJellinek commented 1 month ago

That said, if you could rebase on #3168, we could just do everything here.

adam3smith commented 1 month ago

I've found the oai2 endpoint the least reliable arXiv API option for quite some time, so it'd be nice to switch away from it. Last time I looked, data quality wasn't exactly the same, but that was quite some time back.

thebluepotato commented 1 month ago

> I've found the oai2 endpoint the least reliable arXiv API option for quite some time, so it'd be nice to switch away from it. Last time I looked, data quality wasn't exactly the same, but that was quite some time back.

In terms of data quality, it seems that for at least one of the test cases, the OAI endpoint returned a "published" DOI whereas the Atom endpoint did not.
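One way to spot this kind of gap systematically is to diff the fields produced from each endpoint for the same test case. The Python sketch below is only an illustration: `missing_in_atom` is a hypothetical helper, the records are assumed to be plain dictionaries of already-parsed metadata, and the values in the usage example are placeholders rather than real test data.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def missing_in_atom(oai_record: Record, atom_record: Record) -> List[str]:
    """Return the fields that have a value in the OAI record but not in the Atom one."""
    return sorted(
        field
        for field, value in oai_record.items()
        if value and not atom_record.get(field)
    )


# Placeholder records (not real metadata) showing a DOI present only in the OAI data.
oai = {"title": "Example title", "DOI": "10.0000/example"}
atom = {"title": "Example title", "DOI": None}
print(missing_in_atom(oai, atom))
```

Running the same comparison over every test case would show whether the missing DOI is a one-off or a consistent difference between the two endpoints.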

AbeJellinek commented 1 month ago

This is looking great. I'm getting more and more timeouts from the old export endpoint, so I'd love to get it merged.

@adam3smith, what do you think?

thebluepotato commented 1 month ago

Note that https://github.com/zotero/utilities/blob/e00d98d3a11f6233651a052c108117cf44873edc/utilities.js#L435 should be updated after this PR is merged, since the new endpoint explicitly supports versions.
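To illustrate what version support means for ID handling, here is a small Python sketch that splits an arXiv identifier into its base ID and optional version suffix. It does not reproduce the linked `utilities.js` code; `split_arxiv_id` and the regular expression are assumptions written for this example, covering the new-style (`1706.03762v5`) and old-style (`hep-th/9901001`) ID forms.

```python
import re
from typing import Optional, Tuple

# New-style IDs (YYMM.NNNNN) or old-style IDs (archive[.SC]/YYMMNNN),
# each optionally followed by a version suffix such as "v2".
_ARXIV_ID_RE = re.compile(
    r"^(?P<base>\d{4}\.\d{4,5}|[a-z\-]+(?:\.[A-Z]{2})?/\d{7})"
    r"(?:v(?P<version>\d+))?$"
)


def split_arxiv_id(arxiv_id: str) -> Tuple[str, Optional[int]]:
    """Split an arXiv ID into (base_id, version); version is None if absent."""
    match = _ARXIV_ID_RE.match(arxiv_id.strip())
    if match is None:
        raise ValueError(f"not a recognized arXiv ID: {arxiv_id!r}")
    version = match.group("version")
    return match.group("base"), int(version) if version is not None else None
```

With an endpoint that accepts versioned IDs, the caller can pass the full identifier through when a version is present instead of stripping the suffix first.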

AbeJellinek commented 1 month ago

OK, I think this is ready. @dstillman or @adam3smith, would appreciate a third opinion before we merge.

AbeJellinek commented 1 month ago

Thank you so, so much! This is a huge improvement.