lanec / zoom-batch-downloader

Download all your zoom cloud recordings
GNU General Public License v3.0

"Failed to download file" errors #18

Closed mingomongo closed 10 months ago

mingomongo commented 1 year ago

When downloading, I'm occasionally getting an error appearing after the download of a video finishes. The first few times I got around it just by running the script again a few times, though now I'm stuck on a video that won't download. The video's only 4MB big. The videos that it got stuck on earlier also seemed to fail if they had just downloaded a text file? Not sure if coincidence or not.

The error I get is shown below; verbose output didn't give any more info:

Error: Failed to download file at https://us06web.zoom.us/rec/download/p1RoWZeC4htlv54eHXbBIKD8NYtVJNDnLwbqc8D4BEwQjfQifgncaFU5SpuCRuAhgqGK9ti3PquUH0bg.vXAgKXWjZW4taqH_?access_token=eyJzdiI6IjAwM (token cut short).

I was able to download the video it got stuck on by clicking the link that error provides, so something's seemingly up with the code.

AnessZurba commented 1 year ago

Interesting. This happens when the size of the downloaded file doesn't match the size declared by Zoom's servers before the download. I didn't know whether this was possible, so I added the link to the message so that users can get unstuck by downloading the file directly.

I'll try to investigate it. Thanks for reporting.

AnessZurba commented 1 year ago

I added more info to the error message. I'd appreciate it if you could reproduce the error and paste it here (after removing the token, of course).

mingomongo commented 1 year ago

Here's the error, I tried a different day and it's getting caught on one video here too. The loading bar finishes then gives this error:

`C:\Users\Xxxxxx Studio 10\Documents\zoom-batch-downloader>python zoom_batch_downloader.py Users filter is active ['Xxx.Xxx@Xxxxxx.com', 'Xxx@Xxxxxx.com']

Downloading videos from user Xxx.Xxx@Xxxxxx.com - Starting at 2023-07-28 and up to (inclusive) 2023-07-28.

Found: https://us06web.zoom.us/rec/download/5oQnOpvFx5br-i02TFPgD6bW1civrOXpAJTXb9m0vIsKydQVolFlIlPN4w7Gp0gR_OJulhvh02M2RsTH.AU3lKksMLw4Vj5Ru Downloading: online-stretch-for-dance-stuart-thomas-all-7-w-stuart-thomas2023-07-28t181730zshared_screen_with_speaker_view.mp4 0%| | 0.00B/645MB [00:30<?, ?B/s]

Error: Failed to download file at https://us06web.zoom.us/rec/download/5oQnOpvFx5br-i02TFPgD6bW1civrOXpAJTXb9m0vIsK[etc]

Press Enter to exit...`

AnessZurba commented 1 year ago

Also, please use the latest version.


mingomongo commented 1 year ago

`C:\Users\Xxxxxx Studio 10\Documents\zoom-batch-downloader>python zoom_batch_downloader.py Users filter is active ['Xxx.Xxx@Xxxxxx.com', 'Xxx@Xxxxxx.com']

Downloading videos from user Xxx.Xxx@Xxxxxx.com - Starting at 2023-07-28 and up to (inclusive) 2023-07-28.

Found: https://us06web.zoom.us/rec/download/SUIlGuYMhgi61IQw2xwnrezdj836jxerwhiuc5-t1HWTTAOh6gVzcLZfyZo4DnvxM0k4XgfMJpVmlt28.lVr7UhiAHaqXi5YQ Downloading: online-stretch-for-dance-stuart-thomas-all-7-w-stuart-thomas2023-07-28t181730zshared_screen_with_speaker_view.mp4 0%| | 0.00B/645MB [00:30<?, ?B/s]

Error: Failed to download file at https://us06web.zoom.us/rec/download/SUIlGuYMhgi61IQw2xwnrezdj836jxerwhiuc5-t1HWTTAOh6gVzcLZfyZo4DnvxM0k4XgfMJpVmlt28.lVr7UhiAHaqXi5YQ?access_token=eyJzdiI6IjAwMDAw[etc], expected size: 675987686, actual size: 675815424

Press Enter to exit... `

AnessZurba commented 1 year ago

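Just to clarify:

  • Did you encounter a case where this error would disappear on a retry, or is it consistent on the files it occurs on?
  • Did you notice the file being downloaded twice before this error appears, i.e. does the loading bar get filled twice?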

AnessZurba commented 1 year ago

OK, so in principle this is a bug on Zoom's side.

I did add a workaround in the script. There is now a new parameter, FILE_SIZE_MISMATCH_TOLERANCE, that you can configure in config.py to control how large a mismatch you will tolerate between the file size Zoom's servers declare and the size you actually receive. The default value is zero, which matches the previous behavior; you can increase it to fit your use case.

Please note that this update breaks backwards compatibility, in the sense that the script will not skip files downloaded by a previous version of itself.
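
As a rough illustration of the workaround described above, the check below accepts a download when the received size is within a configured number of bytes of the size Zoom declared. This is only a sketch: the function name `size_within_tolerance` is hypothetical, and treating `FILE_SIZE_MISMATCH_TOLERANCE` as a byte count is an assumption rather than something confirmed in this thread.

```python
# Hypothetical stand-in for the value set in config.py (assumed to be in bytes).
FILE_SIZE_MISMATCH_TOLERANCE = 0


def size_within_tolerance(expected_size: int, actual_size: int,
                          tolerance: int = FILE_SIZE_MISMATCH_TOLERANCE) -> bool:
    """Return True when the size difference does not exceed the tolerance."""
    return abs(expected_size - actual_size) <= tolerance


# Sizes taken from the error message reported earlier in this thread.
expected, actual = 675987686, 675815424
difference = abs(expected - actual)  # 172262 bytes

# With the default of zero, any mismatch is rejected.
strict = size_within_tolerance(expected, actual)
# With a tolerance of 1 MiB, this particular file would be accepted.
relaxed = size_within_tolerance(expected, actual, 1024 * 1024)
```

A larger tolerance trades safety for convenience: it lets the run continue, but it also means a truncated file of similar size would be accepted.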

mingomongo commented 1 year ago

> Just to clarify
>
>   • Did you encounter a case where this error would disappear on a retry? or is it consistent on the files it occurs on
>   • Did you notice the file being downloaded twice before this error appears? i.e the loading bar gets filled twice?

Yes, the error I pasted here was a consistent one, but I also encountered some errors I could get past on a retry. The loading bar fills normally, then empties, then the error appears. There could(?) be a frame where it flashes 100% again, but it's too quick to tell. This is all using the older version I posted the error with.

AnessZurba commented 1 year ago

What are the errors you could get past on a retry? "failed to download..." or something else?

mingomongo commented 1 year ago

I would get the same error I believe. I just updated the software and it downloaded fine, I think it might be something to do with the path being too long, or the fact I'm downloading into a Google Drive virtual drive. I won't be able to test things until next week.

I also noticed that the OUTPUT_PATH doesn't like addresses on my C drive when I do them from C: (e.g. C:\Users\Xxxx Studio 10\Documents\Test), but when I do "Documents\Test" it works fine?

AnessZurba commented 1 year ago

Weird. I made all the strings into raw Python strings to avoid problems related to special characters. Regarding the long paths, please refer to the README to enable long paths on Windows.

In any case, it's very weird that the size mismatch isn't always consistent. There is a risk of corruption here. It might also be that Google Drive is not reporting real file sizes.

I'm really interested in understanding what is happening here, because it should not happen.
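
To illustrate why raw strings matter for Windows paths in a Python config file, the sketch below compiles two string literals as Python source. The helper `is_valid_literal` and the example path are hypothetical; this shows one plausible cause of the `C:\Users\...` problem reported above, not a confirmed diagnosis.

```python
def is_valid_literal(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<config>", "eval")
        return True
    except SyntaxError:
        return False


# In a normal string literal, "\U" starts an 8-digit Unicode escape, so a path
# beginning with C:\Users is rejected by the parser.
plain_literal = '"C:\\Users\\Example\\Documents\\Test"'
# The r prefix turns off escape processing, so backslashes are kept verbatim.
raw_literal = 'r"C:\\Users\\Example\\Documents\\Test"'

plain_ok = is_valid_literal(plain_literal)  # False: truncated \U escape
raw_ok = is_valid_literal(raw_literal)      # True
raw_value = eval(raw_literal)               # the path with single backslashes
```

Doubling every backslash or using forward slashes in the path are the usual alternatives when a raw string is not an option.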

mingomongo commented 1 year ago

More strange things with OUTPUT_PATH. See below for the first download attempt onto a long path (G:\Shared drives\Zen Box\Zoom Recordings\Downloads July 27 2023 Onwards), which worked until it ran into a large size discrepancy. I then tried again on a shorter path (G:\Shared drives\Zen Box\test) just to test whether I'd get the error again, and the app didn't recognise the path at all? These are Google Drive drives, btw:

`C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader>python zoom_batch_downloader.py Users filter is active ['xxxx@xxxx.com', 'xxxx@xxxx.com']

Downloading recordings from user xxxx@xxxx.com - Starting at 2023-07-28 and up to 2023-07-28 (inclusive).

Found: https://us06web.zoom.us/rec/download/uEwb4-zIgoakptSsXIf4eN5lhSdr28ZbBf1h[etc] Downloading: online-stretch-for-dance-stuart-thomas-all-7-w-stuart-thomas2023-07-28t181730zshared_screen_with_speaker_view__eccd7d19.mp4 100%|█████████████████████████████████████████████████████████████████████████████| 645MB/645MB [00:34<00:00, 19.8MB/s]

Found: https://us06web.zoom.us/rec/download/YvQwmxLY-IaOISQhi6KZuQAh_54uWb9w9S[etc] Downloading: online-classical-ballet-inga-george-int-7-w-dmitri-gruzdev2023-07-28t180423zshared_screen_with_speaker_view__107ec979.mp4 100%|██████████████████████████████████████████████████████████████████████████▋| 68.1MB/68.4MB [00:04<00:00, 22.2MB/s]Size mismatch: Expected 71700173 bytes but got 71311360. Size difference: 379.7KB. You might want to increase FILE_SIZE_MISMATCH_TOLERANCE in config.py 0%| | 0.00B/68.4MB [00:04<?, ?B/s]

Error: Failed to download file at https://us06web.zoom.us/rec/download/YvQwmxLY-IaOISQhi6KZuQAh_54uWb9w9SmmkX4JPKK_1aPKzhtAKGKuhgVaAHbgaePS9M4qVF5BGet-.sk6JtSHQXZa-Q5U5?access_token=eyJzdiI6IjA[etc].

Press Enter to exit...

C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader>python zoom_batch_downloader.py Users filter is active ['xxxx@xxxx.com', 'xxxx@xxxx.com']

Downloading recordings from user xxxx@xxxx.com - Starting at 2023-07-28 and up to 2023-07-28 (inclusive).

Found: https://us06web.zoom.us/rec/download/t5_SaXnxeNREvHKAUT0RSABFnsrpkteR[etc]

Error: [WinError 2] The system cannot find the file specified: 'G:\Shared drives\Zen Box\test'

Press Enter to exit...`

AnessZurba commented 1 year ago

Please use the latest version. The size discrepancy is not too high. It might work if you set the size tolerance to 1 MB. (I do think it's Google Drive's fault, but I have to create a shared folder to test that.)

Regarding the path, are you able to access said path using the terminal? Try opening Command Prompt and accessing the file from it. Does it work?

mingomongo commented 1 year ago

The size tolerance is working well and there doesn't seem to be output_path issues with the update. Coincidentally, the videos that have size mismatches are also the videos I can't play in the download folder and have to drag to the desktop, so maybe there's something weird with the path or file name length, either way I'm happy it's downloading. Also I checked and I was able to navigate to and open files in Google Drive using cmd.

I did notice the date selection seems to have changed. I used to be able to put the same date as the start and end and it would download just that day, but now the dates have to be different, so I have to download 2 days minimum.

The output below shows doing the same start and end day, then different days without enabling the size mismatch just in case it's useful to you:

C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader>python zoom_batch_downloader.py
Users filter is active ['xxxx@xxxx.com', 'xxxx@xxxx.com']

Downloading recordings from user xxxx@xxxx.com - Starting at 2023-07-28 and up to 2023-07-28 (inclusive).
######################################################################

Downloading recordings from user xxxx@xxxx.com - Starting at 2023-07-28 and up to 2023-07-28 (inclusive).
######################################################################

Downloaded 0 files. Total size: 0B. Skipped: 0 files.

C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader>python zoom_batch_downloader.py
Users filter is active ['xxxx@xxxx.com', 'xxxx@xxxx.com']

Downloading recordings from user xxxx@xxxx.com - Starting at 2023-07-28 and up to 2023-07-29 (inclusive).

Found: https://us06web.zoom.us/rec/download/g0uFGu27i_POkzKpE_BPLi[etc]
Downloading: online-russian-ballet-dmitri-gruzdev-gen-7-w-dmitri-gruzdev__2023-07-28t085511z__shared_screen_with_speaker_view__610fb91f.mp4
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.37GB/1.37GB [01:09<00:00, 21.3MB/s]

Found: https://us06web.zoom.us/rec/download/FBCJehJO8flBr9MMmvDCW24[etc]
Downloading: online-classical-ballet-mariana-gomes-intadv-7-w-mariana-gomes__2023-07-28t103314z__shared_screen_with_speaker_view__2204babb.mp4
 93%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊          | 2.52MB/2.72MB [00:01<00:00, 4.11MB/s]Size mismatch: Expected 2848388 bytes but got 2629632. Size difference: 213.63KB.
You might want to increase FILE_SIZE_MISMATCH_TOLERANCE in config.py
  0%|                                                                                                                                                  | 0.00B/2.72MB [00:01<?, ?B/s]

Traceback (most recent call last):
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 268, in <module>
    main()
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 22, in main
    file_count, total_size, skipped_count = download_recordings(get_users(), from_date, to_date)
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 133, in download_recordings
    user_file_count, user_total_size, user_skipped_count = download_recordings_from_meetings(meetings, user_host_folder)
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 196, in download_recordings_from_meetings
    if download_recording_file(url, host_folder, file_name, file_size, topic, recording_name):
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 222, in download_recording_file
    do_with_token(
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 259, in do_with_token
    get_with_token(lambda t: do_as_get(t))
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 83, in get_with_token
    response = get(cached_token)
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 259, in <lambda>
    get_with_token(lambda t: do_as_get(t))
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 251, in do_as_get
    do(token)
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\zoom_batch_downloader.py", line 223, in <lambda>
    lambda t: utils.download_with_progress(
  File "C:\Users\xxxx Studio 10\Documents\zoom-batch-downloader\utils.py", line 139, in download_with_progress
    raise Exception(f'Failed to download file at {url}.{"" if verbose_output else " Enable verbose output for more details."}')
Exception: Failed to download file at https://us06web.zoom.us/rec/download/FBCJehJO8flBr9[etc].

AnessZurba commented 1 year ago

Try to enable long paths in Windows. See the README for more details. I'll check the date range issue.
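
The date range problem reported above (identical start and end dates returning nothing) is the kind of off-by-one that an inclusive range has to handle. The following is a minimal sketch of an inclusive day range, using the hypothetical helper `inclusive_days`; it is not taken from the script itself.

```python
from datetime import date, timedelta


def inclusive_days(start: date, end: date) -> list[date]:
    """Return every day from start to end, including both endpoints."""
    if end < start:
        raise ValueError("end date must not be before start date")
    # The +1 makes the range inclusive, so start == end yields exactly one day.
    day_count = (end - start).days + 1
    return [start + timedelta(days=offset) for offset in range(day_count)]


single_day = inclusive_days(date(2023, 7, 28), date(2023, 7, 28))
two_days = inclusive_days(date(2023, 7, 28), date(2023, 7, 29))
```

Dropping the `+ 1` would make a same-day range empty, which matches the symptom of zero files being downloaded.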

AnessZurba commented 1 year ago

Please see the latest version. I fixed the issue with the date range, and I added some code that should solve the long path problem on Windows with no user intervention.
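
One common way to get past the classic 260-character path limit on Windows without changing system settings is to pass an extended-length path, which starts with `\\?\`. The sketch below shows that technique; whether the script uses it is not stated in this thread, and `to_extended_length_path` is a hypothetical name. It uses `ntpath` so the string handling can be tried on any operating system.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"  # the four characters \\?\


def to_extended_length_path(path: str) -> str:
    """Return the extended-length form of an absolute Windows path."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already in extended-length form
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    # Extended-length paths are not normalized by Windows, so do it here.
    normalized = ntpath.normpath(path)
    if normalized.startswith("\\\\"):
        # UNC paths (\\server\share\...) use the \\?\UNC\ form.
        return EXTENDED_PREFIX + "UNC\\" + normalized[2:]
    return EXTENDED_PREFIX + normalized


long_path = to_extended_length_path(r"G:\Shared drives\Zen Box\test")
```

Applications that do not understand the prefix, or that impose their own limits, can still fail on such paths even when the download itself succeeds.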

mingomongo commented 1 year ago

Date range is back to normal, thanks. Still having the thing where I can't open certain videos in the folder I'm downloading them in. I've found that in fact I can open the ones not opening by putting them one folder back in the directory. Also every video that can't play/gets the size mismatch is consistent with topic name. This actually looks like a limitation of VLC Media Player since I can open the videos fine in other media players. So yeah it's some long path weirdness.

AnessZurba commented 1 year ago

Very interesting. Two questions:

  • Did you disable the long path limit through the registry?
  • Could you please provide me with the path that's giving you problems?

mingomongo commented 1 year ago

I checked and it looked like long paths were already enabled (LongPathsEnabled is set to 1).

A path that doesn't work for instance is (censored but still same character length): G:\Shared drives\Zen Box\Zoom Recordings\Downloads July 27 2023 Onwards\online-classical-ballet-zzzistzzz-zzztelmazzz-intadv-7-w-zzzzri-zzzzzev\online-classical-ballet-zzriszzzz-zzztelmazzz-intadv-7-w-zzzzzz-zzzzzzz2023-08-27t103135zshared_screen_with_speaker_view__04233cd6.mp4

^ I also found that changing the file name to cut the last "shared_screen_with_speaker_view04233cd6" section out let VLC play it too.

When I move the file back a folder it looks like this: G:\Shared drives\Zen Box\Zoom Recordings\Downloads July 27 2023 Onwards\online-classical-ballet-zzriszzzz-zzztelmazzz-intadv-7-w-zzzzzz-zzzzzzz2023-08-27t103135zshared_screen_with_speaker_view__04233cd6.mp4

AnessZurba commented 11 months ago

This looks like a bug in Google Drive, so I'm not sure anything can be done on my side.

mingomongo commented 11 months ago

Well thanks for all your help in looking into it. I've been using it just fine for many weeks now thanks to the size mismatch allowance, so I'm satisfied 😁

AnessZurba commented 10 months ago

Glad to hear