wkentaro / gdown

Google Drive Public File Downloader when Curl/Wget Fails
MIT License

Add extra pattern to extract url from download-form returned by Google Drive for a large file #308

Closed pmeier closed 8 months ago

pmeier commented 8 months ago

Fixes #43. There were a few attempts to solve this, but even with the latest gdown==5.0.1 the download still fails. The fixes so far have concentrated on cookies, but missed that the original report is about large files, for which GDrive asks for user confirmation since it cannot perform a virus check.

This PR addresses this part by properly parsing the confirmation form.

wkentaro commented 8 months ago

Reply to: https://github.com/wkentaro/gdown/issues/43#issuecomment-1914210818

@pmeier I'm accessing from Japan with the default user-agent, and I get this if I put a breakpoint:

gdown https://drive.google.com/uc?id=1r6o0pSROcV1_VwT4oSjA2FBUSCWGuxLK
<form action="https://drive.google.com/uc?id=1r6o0pSROcV1_VwT4oSjA2FBUSCWGuxLK&amp;confirm=t&amp;uuid=0c79e3bf-b223-4ad3-a3f3-4ec5f7bd38cd" id="download-form" method="post">
 <input class="goog-inline-block jfk-button jfk-button-action" id="uc-download-link" type="submit" value="Download anyway"/>
</form>

FYI, my breakpoint is here:

image

pmeier commented 8 months ago

Huh, I guess there is a regional difference. I'm accessing from Germany. Maybe that is why so many people are still facing the issue while you can't reproduce it? Here is what I get:

screenshot from browser dev tools

image

print from breakpoint with manual formatting

<form action="https://drive.usercontent.google.com/download" id="download-form" method="get">
  <input class="goog-inline-block jfk-button jfk-button-action" id="uc-download-link" type="submit" value="Download anyway"/>
  <input name="id" type="hidden" value="1r6o0pSROcV1_VwT4oSjA2FBUSCWGuxLK"/>
  <input name="confirm" type="hidden" value="t"/>
  <input name="uuid" type="hidden" value="293715d8-4dfe-4f6a-be20-ce0d3e6308d7"/>
</form>

pmeier commented 8 months ago

Anyway, the patch should work with both variants. It still respects the query parameters that are part of the original URL.
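
For readers following along, here is a minimal sketch of how both form variants shown above could be reduced to a single download URL. It is not the patch from this PR: it uses only the Python standard library, and the names `DownloadFormParser` and `build_confirm_url` are hypothetical. It assumes the form keeps `id="download-form"` and that any extra parameters arrive as hidden inputs.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class DownloadFormParser(HTMLParser):
    """Collect the action URL and hidden inputs of the form with id="download-form"."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.action = None
        self.hidden = []  # (name, value) pairs from <input type="hidden">

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input .../> also end up here.
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == "download-form":
            self.in_form = True
            # HTMLParser unescapes attribute values, so "&amp;" is already "&".
            self.action = attrs.get("action")
        elif tag == "input" and self.in_form and attrs.get("type") == "hidden":
            if attrs.get("name"):
                self.hidden.append((attrs["name"], attrs.get("value") or ""))

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def build_confirm_url(html_text):
    """Return the confirmed download URL, or None if no download form is found."""
    parser = DownloadFormParser()
    parser.feed(html_text)
    if parser.action is None:
        return None
    parts = urlsplit(parser.action)
    # Keep query parameters already present in the action URL,
    # then append whatever the hidden inputs carry.
    query = parse_qsl(parts.query, keep_blank_values=True) + parser.hidden
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Passing either of the two `<form>` snippets from this thread to `build_confirm_url` should yield a URL that carries `id`, `confirm`, and `uuid` in its query string, regardless of whether they came from the `action` attribute or from hidden inputs.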

wkentaro commented 8 months ago

Thanks @pmeier!