Open john-peterson opened 9 years ago
I suggest
  while (1)
    url_dequeue
    if (descend)
      for (; child; child = child->next)
        if child is HTML
          url_enqueue_append
        else
          url_enqueue_prepend
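A minimal sketch of how the suggested loop orders downloads, written in Python to model the behaviour rather than in wget's C. The site layout, the `crawl_browser_order` name and the `is_html` flag are all hypothetical, and the sketch assumes `descend` is always true.

```python
from collections import deque

# Hypothetical site: each page maps to its (url, is_html) children in document order.
SITE = {
    "index.html": [("a.html", True), ("logo.png", False), ("b.html", True)],
    "a.html": [("a1.jpg", False), ("a2.jpg", False)],
    "b.html": [("b1.jpg", False)],
}

def crawl_browser_order(start, site):
    """Return the download order under the append/prepend policy sketched above."""
    queue = deque([start])
    seen = {start}
    order = []
    while queue:
        url = queue.popleft()                # url_dequeue
        order.append(url)
        for child, is_html in site.get(url, []):
            if child in seen:
                continue                     # skip links that are already queued
            seen.add(child)
            if is_html:
                queue.append(child)          # url_enqueue_append
            else:
                queue.appendleft(child)      # url_enqueue_prepend
    return order

print(crawl_browser_order("index.html", SITE))
```

Prepending one child at a time, as in this sketch, reverses the order of non-HTML siblings (`a2.jpg` comes out before `a1.jpg`). Whether the actual patch behaves this way is not stated in the thread.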
k i get it. that's a better solution. i pushed it to https://github.com/mirror/wget/pull/2
BTW, since you are talking about timing, people with different bandwidths will/might experience different results when viewing such a page. I use mobile internet with 2G/3G <= 64kbit/s. I wonder how that works ;-)
low bandwidth makes this patch more important because low bandwidth increases the time between the parent and child downloads
and most pages should support low bandwidth browsing. e.g. imagevenue probably has a long enough timeout to allow img.php to load its image over low bandwidth too
In order to add your patch to wget, could you please send the complete patch in git's format-patch format?
What about the option to switch the behaviour + docs.
i can add that. what do u wanna call the option?
you already have that in your 'recurse' branch. Take that, undo the changes to src/recur.c and take your changes from 'recurse2' branch.
k it's done https://github.com/mirror/wget/pull/2/files
You would also have to add the relevant debug statements so that when we look at a debug log, we know exactly what is happening. The current DEBUGP message will have to be changed to reflect the new behaviour.
i changed the DEBUGP that says Enqueuing to Appending/Prepending to show what's happening
I would like to see a test case that shows how the two differ.
there's a test in https://github.com/mirror/wget/pull/2#issue-53548650 /test that compares them
You will be required to sign the copyright assignment agreement with the FSF before we can merge this patch into our codebase.
sounds like a lot of work. i release the copyright. i dont care
Please share the patch directly with us, in git format-patch style.
it's in https://github.com/mirror/wget/pull/2.patch
No, I meant a live test for the Wget Test Suite. You'll find the test suite in the testenv/ directory. We need to add a test that ensures that Wget downloads files in the expected order.
i added a test that fails unless the browser queue order is used
Please attach the patch to the email you send to this list. NOT a link to the patch. Not all of us use a browser for looking at emails.
u can download it with wget
wget https://github.com/mirror/wget/pull/2.patch
Still not going to review the patch. Turns out, I don't know how to use the command you gave me. All I do know is, if the patch were attached to the email, I'd be able to read it.
k i attached it
the browser queue type is more likely to download temporary links before they expire because it's more similar to the browsing experience
differences from the browsing experience cause problems
the problem is described in #1
discarded LIFO solution
a convoluted LIFO solution with a similar result (the difference is that LIFO results in a bottom-to-top download order) is in #1
non-inline links aren't prepended
ATTR_HTML links are never prepended and might point to non-HTML files that aren't downloaded directly after their parent page even with the browser queue type
if these cause a problem, an option to read the Content-Type header of the link before enqueuing could be added
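A rough sketch of what such a Content-Type check could decide, assuming the header value has already been fetched (e.g. by a HEAD request, which is not shown). The function names and the set of media types counted as HTML are assumptions for illustration, not part of the patch.

```python
# Assumption: these media types count as HTML for the purpose of queueing.
HTML_TYPES = {"text/html", "application/xhtml+xml"}

def is_html_content_type(header_value):
    """Return True if a Content-Type header value names an HTML media type."""
    if not header_value:
        return False  # assumption: a missing header is treated as non-HTML
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in HTML_TYPES

def enqueue_action(header_value):
    """Pick the queue operation: HTML is appended, everything else is prepended."""
    return "append" if is_html_content_type(header_value) else "prepend"

print(enqueue_action("text/html; charset=UTF-8"))
print(enqueue_action("image/jpeg"))
```

The cost of this design is one extra request per ATTR_HTML link, which is probably why it is only proposed as an option.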
test
testenv
this checks the link order
changing browser to fifo in Test--spider-r-browser.py makes the test fail
download order
this shows the download order for the test page in https://github.com/mirror/wget/pull/1 /test
the current code sometimes downloads links long after their parent page (see https://github.com/mirror/wget/pull/1 /test)
this patch downloads links directly after their parent page
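To make the difference concrete, here is a small Python model that compares the two queue types on a made-up page tree (not the test page from the pull request). The `download_order` and `gap` helpers are hypothetical; `gap` counts how many downloads separate a parent page from one of its links.

```python
from collections import deque

# Hypothetical page tree: each page maps to its (url, is_html) links.
SITE = {
    "index.html": [("p1.html", True), ("p2.html", True)],
    "p1.html": [("img1.jpg", False)],
    "p2.html": [("img2.jpg", False)],
}

def download_order(start, site, queue_type):
    """Simulate the download order for queue_type 'fifo' or 'browser'."""
    queue, seen, order = deque([start]), {start}, []
    while queue:
        url = queue.popleft()
        order.append(url)
        for child, is_html in site.get(url, []):
            if child in seen:
                continue
            seen.add(child)
            if queue_type == "browser" and not is_html:
                queue.appendleft(child)  # non-HTML link jumps to the front
            else:
                queue.append(child)      # fifo, or an HTML link
    return order

def gap(order, parent, child):
    """Number of positions between a parent and its link in the order."""
    return order.index(child) - order.index(parent)

for queue_type in ("fifo", "browser"):
    order = download_order("index.html", SITE, queue_type)
    print(queue_type, order, gap(order, "p1.html", "img1.jpg"))
```

In this model fifo fetches `p2.html` between `p1.html` and `img1.jpg`, while browser fetches `img1.jpg` immediately after `p1.html`. On a wider tree the fifo gap grows with the number of sibling pages.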