Closed simonw closed 2 years ago
https://www.python-httpx.org/exceptions/#the-exception-hierarchy shows the exception hierarchy:
I want to retry on any form of TransportError, I think. There's no point retrying a DecodingError or a TooManyRedirects error.
Sadly it looks like httpx itself has decided not to implement retry logic, so I need to build this myself:
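A minimal sketch of what hand-rolled retry logic could look like. This is not the code from the original comment; `call_with_retries`, its parameters, and the backoff schedule are all assumptions for illustration. It is written against a generic callable so it runs without httpx installed. With httpx, `retry_on` would be `(httpx.TransportError,)`, which leaves DecodingError and TooManyRedirects un-retried because they sit beside TransportError in the hierarchy rather than under it.

```python
import time


def call_with_retries(fn, retry_on, retries=3, delay=1.0, sleep=time.sleep):
    """Call fn(); if it raises one of retry_on, try again up to `retries` more times.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except retry_on:
            attempt += 1
            if attempt > retries:
                # Out of retries: re-raise the last error unchanged
                raise
            # Simple linear backoff: delay, 2*delay, 3*delay, ...
            sleep(delay * attempt)


# Usage with httpx (assumes httpx is installed; not executed here):
#   import httpx
#   response = call_with_retries(
#       lambda: httpx.get(url, timeout=30.0),
#       retry_on=(httpx.TransportError,),
#   )
```

Passing `sleep` as a parameter is only there so the backoff can be tested without real waiting.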
While testing this I'm going to want to see if any transport errors have occurred, so I think I'll add a -v/--verbose flag to the google-drive-to-sqlite files command.
I'm only going to retry GET, I won't retry POST.
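One way the GET-only rule and the verbose reporting could fit together, as a sketch. `should_retry`, `report_retry` and `RETRYABLE_METHODS` are hypothetical names, not part of google-drive-to-sqlite or httpx. The reasoning is that a GET is a read, so repeating it is safe, whereas a repeated POST could apply a change twice.

```python
import sys

# Hypothetical policy: only read requests are retried
RETRYABLE_METHODS = {"GET"}


def should_retry(method, attempt, max_retries=3):
    """Return True if a failed request with this HTTP method should be retried."""
    return method.upper() in RETRYABLE_METHODS and attempt < max_retries


def report_retry(method, url, error, attempt, verbose=False, stream=None):
    """When verbose is on, write a one-line note about the failure."""
    if not verbose:
        return
    # Default to stderr so diagnostics stay out of redirected stdout
    stream = stream if stream is not None else sys.stderr
    print(f"Retry {attempt} for {method} {url}: {error!r}", file=stream)
```

Writing to stderr matters for a command whose stdout is redirected to a file, since the retry notes would otherwise end up mixed into the data.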
Now manually testing this by running:
google-drive-to-sqlite files --folder 1E6Zg2X2bjjtPzVfX8YqdXZDCoB3AVA7i --nl --verbose > all-files.json-nl.txt
And keeping an eye on it while it runs with:
watch 'wc -l all-files.json-nl.txt && ls -lah all-files.json-nl.txt'
Started it running at 4:31pm.
It's at 37223 lines in all-files.json-nl.txt and 49MB now, 25 minutes after starting.
That actually worked! 162M file resulted, with no errors.
Now running this to see what happens:
time google-drive-to-sqlite files all-files.db --import-nl all-files.json-nl.txt
43.24s user 94.07s system 71% cpu 3:13.06 total
Produced an 80MB SQLite file, presumably thanks to the owners data being de-duplicated.
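To illustrate why de-duplication would shrink the data: the newline-delimited JSON presumably repeats the full owner record on every file, whereas a database can store each owner once and reference it by key. A minimal sketch with Python's built-in sqlite3 module follows. The table layout, column names and sample records are hypothetical and much simpler than what the tool actually creates.

```python
import sqlite3

# Hypothetical sample records: two files share the same owner
files = [
    {"id": "f1", "name": "a.txt", "owner": {"permissionId": "p1", "displayName": "Ann"}},
    {"id": "f2", "name": "b.txt", "owner": {"permissionId": "p1", "displayName": "Ann"}},
    {"id": "f3", "name": "c.txt", "owner": {"permissionId": "p2", "displayName": "Bob"}},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table drive_users (permissionId text primary key, displayName text)"
)
conn.execute(
    "create table drive_files ("
    "id text primary key, name text, "
    "owner text references drive_users(permissionId))"
)

for f in files:
    owner = f["owner"]
    # insert or ignore: each owner is stored once per permissionId
    conn.execute(
        "insert or ignore into drive_users values (?, ?)",
        (owner["permissionId"], owner["displayName"]),
    )
    # The file row keeps only the key, not the whole owner record
    conn.execute(
        "insert into drive_files values (?, ?, ?)",
        (f["id"], f["name"], owner["permissionId"]),
    )

user_count = conn.execute("select count(*) from drive_users").fetchone()[0]
file_count = conn.execute("select count(*) from drive_files").fetchone()[0]
print(user_count, file_count)
```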
I'm suspicious of the 14,100 rows in the drive_users table.
Confirmed, something went very wrong there:
88 rows where permissionId is not null, 14,012 rows where permissionId is null.
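A split like this can be checked with a query along the following lines. This is a sketch using Python's built-in sqlite3 module against a tiny in-memory stand-in for the drive_users table; the real table has more columns, and `count_by_permission_id` is a hypothetical helper.

```python
import sqlite3


def count_by_permission_id(conn):
    """Return (rows with a permissionId, rows without one) for drive_users."""
    with_id, without_id = conn.execute(
        """
        select
          count(permissionId),            -- count(col) skips nulls
          count(*) - count(permissionId)  -- so the remainder is the null rows
        from drive_users
        """
    ).fetchone()
    return with_id, without_id


# Stand-in data: an in-memory table with only the columns needed here
conn = sqlite3.connect(":memory:")
conn.execute("create table drive_users (permissionId text, displayName text)")
conn.executemany(
    "insert into drive_users values (?, ?)",
    [("p1", "A"), (None, "B"), (None, "C")],
)
print(count_by_permission_id(conn))
```

Against the real database, the connection would instead be opened with `sqlite3.connect("all-files.db")`.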
Fixed that bug:
Got this exception:
Would be good to retry once if this happens.