DiagonalArg closed 4 years ago
Here is how I'm picking up messages that failed to download on a first pass. I do the initial run, and send the terminal output to "$listname".output. When it's done, I run:
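#! /bin/bash
# Assumption: the group name is passed as the first argument, and "$listname".output holds the first run's output.
listname="$1"
TCOOKIE="..."
YCOOKIE="..."
# Collect the IDs of messages that logged an ERROR (field 11 of those lines); sort -u also drops non-adjacent duplicates.
retrylist=$(grep -E 'ERROR.*message' "$listname".output | awk '{ print $11 }' | sort -u | tr '\n' ' ' | sed 's/ $//')
# $retrylist is deliberately unquoted so that each ID is passed as a separate argument.
python3 yahoo.py -ct "$TCOOKIE" -cy "$YCOOKIE" --ids $retrylist -e "$listname"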
Of course if the run stopped due to a network error, for instance, then this script won't help. It needs every message to have been either downloaded or to have produced an ERROR in the terminal output.
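For anyone who would rather do the extraction step in Python, here is a rough sketch of the same idea. It assumes, as the awk call above does, that the message ID is the 11th whitespace-separated field of each matching ERROR line. The function name `failed_message_ids` is made up for illustration and is not part of yahoo.py.

```python
import re


def failed_message_ids(output_text, field=11):
    """Return message IDs from ERROR lines, de-duplicated, in first-seen order.

    Assumption: the ID is the `field`-th whitespace-separated token of the
    line (1-based), mirroring `awk '{ print $11 }'`.
    """
    pattern = re.compile(r"ERROR.*message")
    seen = {}  # dict keys keep insertion order, so this de-duplicates stably
    for line in output_text.splitlines():
        if not pattern.search(line):
            continue
        fields = line.split()
        if len(fields) >= field:  # skip ERROR lines too short to carry an ID
            seen.setdefault(fields[field - 1], None)
    return list(seen)
```

The returned list could then be joined with spaces and passed to --ids, just as the shell script does.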
The code in the latest master version of the tree should, for the most part, download only content that is missing from an existing download. (I say "for the most part" because I think a few of the archive types still lack this functionality.) Please reopen if this is not the case.
Note that there are various reasons that downloads can fail. Here are a couple of smaller groups, for example:
Full output is at these links, but note that it does not include standard error. (One of them crashed at the end when trying to get the calendar. I'll report that separately.)
https://framadrop.org/r/sHt00OBEu2#RvarM7guD7nS0ewHqTXG+jRHJWjeMadCspsW6LkASlM= https://framadrop.org/r/sHQBU7bVMk#4mkc7im3l7FjGXPCCdKtQFohRs2S7P1o1/Zs7oSF4IA= [Expires in 7 days.]
For the smaller of these two groups, the problem was a network error on my end; the other group I haven't looked at yet.
Right now I have to dig through the output and then retry the failed messages with --ids. I really suggest that when the script is rerun for a group that has already been downloaded, it should not redownload messages, files, etc. that are already there. There is no point in clobbering what is already there, and this would give an easy way to retry on failures.
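To make the suggestion concrete, here is a minimal sketch of the skip-if-present behaviour I have in mind. It is not how yahoo.py is actually written: `download_if_missing` and the `fetch` callback are hypothetical names, and treating any non-empty file as "already downloaded" is an assumption.

```python
import os
import tempfile


def download_if_missing(path, fetch):
    """Write fetch() to `path` unless a non-empty file is already there.

    `fetch` is a hypothetical zero-argument callable returning bytes.
    Returns True if the item was downloaded, False if it was skipped.
    """
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return False  # already downloaded on an earlier run; do not clobber

    data = fetch()

    # Write to a temporary file in the same directory and rename it into
    # place, so an interrupted run never leaves a partial file that a later
    # run would mistake for a finished download.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

With something like this, a rerun after a network error would only fetch what is missing, so retrying becomes a matter of running the same command again.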