Closed JohnMcPMS closed 1 month ago
I see that you're logging the content type. Would there be any value in additional telemetry when the content type is an HTML page? For example, I've noticed that some publishers return the HTML contents of a 403 Forbidden page on a download attempt, while others return the HTML contents of a 404 page. I've also seen some edge cases where the download request results in a 204 No Content response.
I don't know if any of this would be useful, but if it is a zero-byte file, perhaps the HTTP status code could be recorded? If the file is larger than zero bytes and is HTML content, I don't know if there would be any value in checking the meta tags or searching for 403/404. I'm just thinking that additional information about the contents of the webpages could help improve handling in the future. I know that the actual content of the page probably can't be sent, but information about it, such as whether or not it is canonical, likely wouldn't violate any privacy laws?
Issue
From looking at our hash mismatch data, there were two main modes:
My suspicion is that in some cases a successful HTTP response is returned, but the content is actually a webpage or similar. This could happen with captive portals or potentially firewalls.
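As a rough illustration of how the content type could flag this case, here is a minimal sketch that checks whether a `Content-Type` header value names an HTML media type. The helper name `LooksLikeWebPage` and the choice of media types are assumptions made for this example; they are not taken from the actual change.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the actual change): returns true when a
// Content-Type header value names an HTML media type, which would suggest
// that a web page was served instead of the expected installer.
inline bool LooksLikeWebPage(std::string_view contentType)
{
    // Drop parameters such as "; charset=utf-8".
    std::string mediaType{ contentType.substr(0, contentType.find(';')) };

    // Trim leading and trailing whitespace.
    auto isSpace = [](unsigned char c) { return std::isspace(c) != 0; };
    mediaType.erase(mediaType.begin(),
        std::find_if_not(mediaType.begin(), mediaType.end(), isSpace));
    mediaType.erase(
        std::find_if_not(mediaType.rbegin(), mediaType.rend(), isSpace).base(),
        mediaType.end());

    // Media types are case-insensitive, so compare in lower case.
    std::transform(mediaType.begin(), mediaType.end(), mediaType.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    return mediaType == "text/html" || mediaType == "application/xhtml+xml";
}
```

A check like this only inspects the header, so it would miss a page served with a generic type such as `application/octet-stream`.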
Change
Collect the total size and content type of downloads and add them to our existing hash mismatch telemetry events. Retry the download when we get a zero-byte file. Report a different error when a zero-byte file is the final result of downloading (this also requires the hash to not match, so you can still have a zero-byte installer, for what that is worth...).
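The retry and error-reporting behavior described above could be sketched roughly as follows. All names here (`DownloadInfo`, `DownloadOutcome`, `DownloadWithZeroByteRetry`, `maxAttempts`) are hypothetical and chosen for illustration. The sketch also assumes that a zero-byte file whose hash matches is accepted without a retry, which the description implies but does not spell out.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical types for illustration; not the project's actual API.
struct DownloadInfo
{
    std::uint64_t sizeInBytes = 0;   // total size of the downloaded file
    std::string contentType;         // Content-Type reported by the server
    bool hashMatches = false;        // whether the file hash matched the expected hash
};

enum class DownloadOutcome { Success, HashMismatch, ZeroByteFile };

// Runs downloadOnce up to maxAttempts times (at least once), retrying only
// while the result is a zero-byte file whose hash does not match.
inline DownloadOutcome DownloadWithZeroByteRetry(
    const std::function<DownloadInfo()>& downloadOnce, int maxAttempts)
{
    const int attempts = std::max(1, maxAttempts);
    for (int attempt = 0; attempt < attempts; ++attempt)
    {
        const DownloadInfo info = downloadOnce();

        // Assumption: a matching hash is accepted even for a zero-byte file.
        if (info.hashMatches)
        {
            return DownloadOutcome::Success;
        }

        // A non-empty file with the wrong hash is an ordinary mismatch; no retry.
        if (info.sizeInBytes != 0)
        {
            return DownloadOutcome::HashMismatch;
        }
        // Zero-byte file with a mismatched hash: fall through and retry.
    }

    // Every attempt produced a zero-byte file, so report the distinct error.
    return DownloadOutcome::ZeroByteFile;
}
```

Keeping the zero-byte outcome separate from the ordinary mismatch lets telemetry tell an empty response apart from corrupted or substituted content.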
Validation
Updates the downloader tests for new fields. Adds a new pair of tests for zero-byte file downloads.