Closed craig65535 closed 3 years ago
This is really excellent effort. I have a question, though: if Apple updates a file, how will we ever download it? If when a (possibly partial) file exists, we download using If-Unmodified-Since, it seems to me that once we have a complete download, if Apple changes/updates the file server-side, we'll never retrieve it until we remove the downloaded file. Am I missing something?
Hello @gregneagle,
"If when a (possibly partial) file exists, we download using If-Unmodified-Since, it seems to me that once we have a complete download, if Apple changes/updates the file server-side, we'll never retrieve it until we remove the downloaded file"
Once Apple changes the file server-side, subsequent requests for that file will fail with HTTP error 412 (Precondition Failed). The user would then have to re-run the script with --ignore-cache to refresh the file.
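To make the two outcomes concrete, here is a minimal Python sketch of how a server typically evaluates the two conditional headers for a GET request. The function name conditional_status and the example dates are illustrative assumptions; they are not part of the script or of Apple's servers, and 200 stands in for any successful response (a resumed, ranged transfer would normally be 206).

```python
from datetime import datetime, timezone

def conditional_status(last_modified, header, since):
    """Return the status a server would typically send for a conditional GET.

    header is "If-Modified-Since" or "If-Unmodified-Since"; since is the
    timestamp the client sent (here, the local file's modification time).
    """
    if header == "If-Modified-Since":
        # The body is only sent if the resource changed after `since`.
        return 200 if last_modified > since else 304
    if header == "If-Unmodified-Since":
        # The request only proceeds if the resource did not change after `since`.
        return 412 if last_modified > since else 200
    raise ValueError("unsupported header: " + header)

# Hypothetical timestamps: the local file was written on 10 January.
local_mtime = datetime(2021, 1, 10, tzinfo=timezone.utc)
unchanged_on_server = datetime(2021, 1, 1, tzinfo=timezone.utc)
updated_on_server = datetime(2021, 2, 1, tzinfo=timezone.utc)
```

With If-Modified-Since, an unchanged server file yields 304 and no body, which is why the resume never completes. With If-Unmodified-Since, the same file is served normally, and only a server-side update produces 412.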
We could handle that programmatically, and remove the file/retry on error 412, but felt that was too big of a change for this PR. I thought fixing resume might be enough of an improvement on its own.
Ugh. Feels like trading one undesired result for a different undesired result.
(IOW, right now resume is broken, but you'll get the current version of the file if/when Apple updates it. With your proposed change, resume works, but now you might have out-of-date files and not know it)
I think this is better as it never results in a silently partially-truncated download.
"now you might have out-of-date files and not know it"
No, the script will fail and the user will see error 412 if any files are out-of-date.
"No, the script will fail and the user will see error 412 if any files are out-of-date." which will result in an avalanche of support questions for me. :-(
I think if you get error 412, you should remove the existing file (partial or complete), and restart the download.
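A minimal Python sketch of that remove-and-retry behavior might look like the following. The names download_with_refresh and fetch are hypothetical: fetch stands for whatever callable performs the transfer (for example by running curl) and returns the final HTTP status code as an int. This is not the code that was merged, only an illustration of the idea.

```python
import os

def download_with_refresh(fetch, path, max_attempts=2):
    """Call fetch(path) and, on HTTP 412, delete the stale file and retry.

    fetch is a hypothetical callable that performs the transfer and
    returns the final HTTP status code as an int.
    """
    for _ in range(max_attempts):
        status = fetch(path)
        if status != 412:
            return status
        # The server copy changed, so the local file (partial or complete)
        # can no longer be resumed or trusted: remove it and start over.
        if os.path.exists(path):
            os.remove(path)
    # Every attempt returned 412; report that to the caller.
    return status
```

Capping the number of attempts keeps the loop from spinning if the server keeps answering 412 for some other reason.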
@gregneagle PTAL
Apologies for taking so long; I was caught up in important projects at work and forgot this was waiting for me to look at. It looks reasonable to me, and I'm hoping my delay with no follow-up changes from you means you haven't discovered any other issues.
If you start the script, select a release, and interrupt the download, the transfer of the partially-downloaded file is not resumed when the script is restarted.
Here, I'm stopping a download with ^C:
And, on restart:
We can see that curl didn't actually complete the download.
It looks like the script uses a command like
/usr/bin/curl -fL --create-dirs -o ./content/downloads/26/37/001-68446/r1dbqtmf3mtpikjnd04cq31p4jk91dceh8/BaseSystem.dmg --compressed -z ./content/downloads/26/37/001-68446/r1dbqtmf3mtpikjnd04cq31p4jk91dceh8/BaseSystem.dmg -C - http://swcdn.apple.com/content/downloads/26/37/001-68446/r1dbqtmf3mtpikjnd04cq31p4jk91dceh8/BaseSystem.dmg
to download the file. Because -z is specified, the If-Modified-Since header is added. Because the file on the server side is older than the date in that header, the web server returns 304 and no file contents.
I tried running the curl command above with -v to illustrate this:
The fix is to use If-Unmodified-Since instead of If-Modified-Since. This allows the resume to complete while ensuring the same copy of the file is being downloaded. If the server side file is newer, the download will fail with HTTP 412 (Precondition Failed), which is good because we don't want to resume an out-of-date download with new content at the end of the file.
All that needed to change was prefixing the filename specified in the -z option with a -. The curl man page says:
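For illustration, the one-character change can be sketched in Python as a function that assembles the curl arguments. The function name build_curl_cmd and its resume parameter are hypothetical; the flags are the ones from the command quoted earlier, with the -z argument prefixed by a dash. This is a sketch under those assumptions, not the script's actual code.

```python
def build_curl_cmd(url, path, resume=True):
    """Build a curl argument list modeled on the command quoted above."""
    cmd = ["/usr/bin/curl", "-fL", "--create-dirs", "-o", path, "--compressed"]
    if resume:
        # "-z -<file>" makes curl send If-Unmodified-Since based on the
        # file's modification time; "-C -" resumes from the existing size.
        cmd += ["-z", "-" + path, "-C", "-"]
    cmd.append(url)
    return cmd
```

The resulting list could be passed to subprocess.run; building it as a list rather than a single string avoids shell quoting problems with the long download paths.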