Open rhigman opened 2 years ago
This should be prioritised soon, to help reduce the number of auto dissemination errors
Prioritising this might be the simplest mitigation to the issue described here.
Note that more than one `Content-Type` may be valid for any given Publication Type, e.g. `application/octet-stream` is also permissible for PDFs. This may require a change to the thoth-dissemination check if we start hitting issues with it (we haven't so far).
Work on https://github.com/thoth-pub/thoth/issues/405 required downloading PDF publication files directly from Location fullTextUrls, including checking that the URL returned `Content-Type: application/pdf`. This uncovered many user-entered fullTextUrls which instead returned `Content-Type: text/html`. These were fortunately simple to fix via a bulk database update, but would have been onerous for the user to change individually.

Perhaps we could/should do a `Content-Type` check when the user tries to save a fullTextUrl, to prevent similar issues in future.
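
For illustration, a rough sketch of what such a check might look like. It is written in Python purely for brevity and is not the Thoth or thoth-dissemination code. The names `ACCEPTED_CONTENT_TYPES`, `media_type`, `is_acceptable` and `check_full_text_url` are hypothetical, and only the PDF entries in the mapping come from this thread. It also assumes the host answers `HEAD` requests, which not every server does.

```python
from urllib.request import Request, urlopen

# Hypothetical mapping of Publication Type to acceptable media types.
# Only the PDF entries are taken from the discussion in this issue.
ACCEPTED_CONTENT_TYPES = {
    "PDF": {"application/pdf", "application/octet-stream"},
}


def media_type(header_value):
    """Drop parameters such as '; charset=utf-8' and normalise case."""
    return header_value.split(";", 1)[0].strip().lower()


def is_acceptable(publication_type, header_value):
    """Return True if the Content-Type header suits the Publication Type."""
    accepted = ACCEPTED_CONTENT_TYPES.get(publication_type)
    if accepted is None:
        # No rule defined for this Publication Type, so do not block the save.
        return True
    return media_type(header_value) in accepted


def check_full_text_url(publication_type, url, timeout=10):
    """Send a HEAD request to the URL and check the returned Content-Type.

    Assumption: the server supports HEAD. A real implementation would
    probably need a GET fallback and error handling for unreachable hosts.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        header_value = response.headers.get("Content-Type", "")
    return is_acceptable(publication_type, header_value)
```

With this shape, `is_acceptable("PDF", "text/html; charset=utf-8")` is rejected while both `application/pdf` and `application/octet-stream` pass. Keeping the header comparison separate from the network call means the rule itself can be tested without hitting any URL.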