Closed marcospri closed 2 weeks ago
@robertknight @acelaya
Made some changes based on the suggestions:
We check response.content instead of response.text. I don't think it makes a big difference, but we are looking for bytes, so that's more explicit.
For the present, I agree. My comment was mostly about doing the "proper" thing to avoid nasty surprises in the future, because our code is improperly treating binary data as text and the handling of this can change (see e.g. https://github.com/hypothesis/h/pull/8953)
In Moodle we can't make an API call to check if a file exists, so we have to rely on the URL the file will be downloaded from.
When a problem occurs we get a 200 response with a JSON document describing the reason for the error.
We don't want to always download the document, as that would download the full PDF file in the success case, which is the most common one.
Until now we have been relying on the headers of a HEAD HTTP request, interpreting JSON responses as the file being missing.
We have found at least one school that doesn't include the content-type header in the response, so we are switching the approach to inspect the first bytes of the response and check whether we are getting a PDF back from Moodle.
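As a rough illustration of the idea (not the code in this PR), the check could look something like the sketch below. The names `starts_like_pdf` and `moodle_file_exists` are hypothetical, and `http_get` is assumed to behave like `requests.get`, so that `stream=True` lets us read only the first few bytes instead of the whole file. It also assumes the file starts directly with the `%PDF-` header.

```python
from typing import Iterable

# PDF files begin with the header "%PDF-" followed by a version number.
PDF_MAGIC = b"%PDF-"


def starts_like_pdf(chunks: Iterable[bytes]) -> bool:
    """Return True if the byte stream begins with the PDF header."""
    prefix = b""
    for chunk in chunks:
        prefix += chunk
        # Stop as soon as we have enough bytes to decide.
        if len(prefix) >= len(PDF_MAGIC):
            break
    return prefix.startswith(PDF_MAGIC)


def moodle_file_exists(http_get, url: str) -> bool:
    """Hypothetical helper: treat anything that isn't a PDF as a missing file.

    `http_get` is assumed to have the same interface as `requests.get`.
    With stream=True the body is not downloaded up front, so only the
    first chunk(s) are read before the connection is released.
    """
    with http_get(url, stream=True, timeout=10) as response:
        return starts_like_pdf(response.iter_content(chunk_size=len(PDF_MAGIC)))
```

With this shape, a JSON error body such as `{"error": ...}` fails the prefix check regardless of which headers the Moodle instance sends.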
Here are the headers from the file in the Moodle instance that doesn't include content-type:
Testing
Open: localhost - Moodle File
localhost - Deleted Moodle file
You should get a file for the first and an error for the second.