zotero / zotero

Zotero is a free, easy-to-use tool to help you collect, organize, annotate, cite, and share your research sources.
https://www.zotero.org
Other
10.26k stars 756 forks source link

Attachment URLs are proxied even when they shouldn't be #1612

Closed dstillman closed 5 years ago

dstillman commented 5 years ago

Thanks to some very helpful debugging, I think I've figured out what's breaking PDF downloads for some EBSCOhost users.

We appear to be converting all non-link attachment URLs to proxied URLs:

https://github.com/zotero/zotero/blob/4072d444e74036b74c507696192c18647aa0b156/chrome/content/zotero/xpcom/translation/translate.js#L774-L780

We've done this for many years, presumably because, while some translators extract the download URL from the page, others build it from available metadata, and in the latter case the translator doesn't know whether it needs to be proxied and just builds a normal (or relative) URL. This usually works fine, either because the URL should in fact be proxied like the parent page or because the proxy handles an unnecessarily proxied URL. But on EBSCO, PDF URLs don't need to be proxied and for some — but not all — users the unnecessarily proxied PDF URLs (e.g., content.ebscohost.com.ezproxy2.library.drexel.edu) return 404s.

Two possible fixes:

1) Translators could specify in the attachment object that a URL came from the page and shouldn't be proxied further.

2) Translators could specify in the attachment object that a URL was manually built and should be proxied, and we would otherwise stop automatically proxying URLs.

(2) feels a bit cleaner, since it makes changing the URL an explicit action, but it would break translators that rely on the current behavior. Also, some translators return relative links, and proxying those by default — i.e., matching the current domain — seems like the more natural and expected option. So I think we have to go with (1).

For the flag, some options:

I guess I'm inclined to go with proxy: false unless someone has a better idea.

adam3smith commented 5 years ago

Yes, please go with 1) -- having to include the proxy is very commonly required and having to do this (and especially: fix this) manually for all instances would be quite tedious.

I like the proxy: false flag, but generally don't think it matters a ton as long as its (as you say) well documented.

dstillman commented 5 years ago

Done, and documented: https://www.zotero.org/support/dev/translators/coding#attachments