Closed rockymadden closed 10 years ago
Sorry for the delayed response, I've been extremely busy lately. This looks like good work.
Why did you close the PR? Did you find something that needed to be fixed?
No worries. Nope, no bugs or anything. I instead created: https://github.com/rockymadden/interfacelift-downloader-plusplus
I wanted to do quite a few things to the codebase to improve it (including making it functional based), so much so that it will likely end up being a separate project entirely. I also wanted to setup a little project page for it.
Cool! A real fork. I look forward to seeing what you do with it.
If you find any bugs or make fixes that would be applicable upstream please let me know (or submit more PRs).
This is some nice code that I would like to see added to the base project so I'll go ahead and pull this in a little later today.
Actually my testing did reveal one problem. The async file saving, while indeed very fast, was causing a large number of simultaneous downloads to take place, especially if you're doing an initial download with no files already in the target folder. This is probably the source of the random details page redirects you were seeing.
Aside from causing the random skipping of files it is also rude to the operator of the web site. Since we are scripting the download process we want to be as polite as possible and not risk making the site less responsive to regular users.
That said, the ability to run concurrent downloads is probably beneficial, but we'll need to put a very tight limit on it. I'm going to hold off landing this until I have a solution in place. Give me just a bit longer.
Indeed, that is both con and pro of 100% async IO writes. The random details page happens without it. It is the reason I looked into the code at all.
The fork has been deleted (FYI), I'll simply start a new project when the time comes.
Cheers!
This has been merged in (d4c51276fe6c7eb875d7b68fc087d0078e8463e8). Thank you very much!
I traced the large number of concurrent connections being cause by this patch to a bug where the next
event and runNextUri()
were both being invoked on every redirect (which happens for every image). See: commit 2391737c12ec309f0296598c7b408e25c65b1c49 for details.
I've also did some other general cleanup and brought everything back in line with the current code style.
Sorry it took so long for me to get back to this.
Async file saving (MUCH faster), handled for bug where detail page is returned rather than the image upon redirect, handled for 400 and 500 status codes, added skipped and existing output messages, fixed indexing output, and other small items.
It does change quite a bit of the output, which you may not like. Just thought I'd offer up my changes if they were of interest.