Closed: Pikrass closed this issue 4 years ago
That's excellent, @Pikrass. I will update the README to mention your script; I think that's fine for now.
My intention was to query the Google servers slowly so the script wouldn't get blocked; that's why there is no parallel support out of the box. Once you have the generated script, you have a few options to continue: your way, for example, or you can feed the downloads to the GNU parallel tool.
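To illustrate the "feed it to GNU parallel" idea, here is a hypothetical sketch, not code from the repo: `urls.txt`, the job count, and the `run_parallel` helper are my names, and the portable version uses `xargs -P` with the GNU parallel equivalent shown in a comment.

```shell
#!/bin/sh
# Hypothetical sketch: run the crawler's per-URL downloads with bounded
# concurrency instead of strictly one process at a time.

run_parallel() {
    # $1: file with one URL per line; $2: max concurrent jobs;
    # remaining args: the command to run on each URL (e.g. wget -q).
    urls=$1; jobs=$2; shift 2
    # GNU parallel equivalent, with a delay to stay gentle on Google:
    #   parallel --jobs "$jobs" --delay 2 wget -q {} :::: "$urls"
    xargs -n 1 -P "$jobs" "$@" < "$urls"
}

# Example with a harmless stand-in command instead of wget:
printf '%s\n' http://example.invalid/a http://example.invalid/b > urls.txt
run_parallel urls.txt 2 echo
```

Keeping the job count low (and adding a delay with GNU parallel) preserves the original goal of not hammering Google's servers.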
I've mentioned your script at https://github.com/icy/google-group-crawler#contributions. Thanks a lot.
Hi! First of all, thank you for your script, it really helped me!
I wanted to download a 350,000-message group. As you can imagine, it was taking a while. I realized the script was creating a wget process, and therefore a single connection to Google, for each and every forum page, thread page, and message, which is really inefficient.
So I improved the generated script so that each process downloads several files at once, using the `-o $output1 $url1 -o $output2 $url2` syntax. I even made a nice text UI to monitor the jobs.
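The batching idea can be sketched roughly like this; this is a hypothetical illustration, not Pikrass's actual script. It assumes curl, which accepts repeated `-o <output> <url>` pairs matching the syntax quoted above, and the `BATCH_SIZE` value and function name are mine.

```shell
#!/bin/sh
# Hypothetical sketch: group (output, url) pairs into batches so one
# process fetches several files over a single connection.

BATCH_SIZE=4

# build_batches reads lines of the form "<output> <url>" on stdin and
# prints one curl command per batch of BATCH_SIZE pairs (a dry run;
# pipe the result to sh to actually download).
build_batches() {
    awk -v n="$BATCH_SIZE" '
        {
            args = args " -o " $1 " " $2
            if (NR % n == 0) { print "curl -s" args; args = "" }
        }
        END { if (args != "") print "curl -s" args }
    '
}

# Example (dry run): print the batched commands instead of executing them.
printf '%s\n' \
    "msg/m1 https://example.invalid/msg/1" \
    "msg/m2 https://example.invalid/msg/2" \
    "msg/m3 https://example.invalid/msg/3" | build_batches
```

Batching like this cuts process startup and connection setup costs, which is where the one-wget-per-message approach loses most of its time.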
Here's my script. I license it under the same terms as your code.
I'm not doing a PR because:
Hope this proves useful. :)