spacewaffle closed this issue 4 years ago
It seems the service requires login and your cookies don't work. Let me figure it out with a local test.
I've tested with a private group and I don't see a similar problem. Maybe your cookies are expired. Could you please confirm that your cookies are still valid? Thx
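A quick way to check for expired cookies is to look at the expiry column of the cookie file. Below is a minimal sketch, assuming the cookies are stored in the tab-separated Netscape format (seven fields per line, with the expiry timestamp in the fifth field and the cookie name in the sixth). The function name `expired_cookies` is made up for this example and is not part of the crawler. Note that a cookie can also be invalidated on the server side before its expiry time, so an empty result does not prove the cookies still work.

```python
import time


def expired_cookies(cookie_text, now=None):
    """Return the names of cookies whose expiry timestamp is in the past.

    cookie_text is the content of a Netscape-format cookie file
    (an assumption about how the cookies are stored).
    """
    now = time.time() if now is None else now
    expired = []
    for line in cookie_text.splitlines():
        # Some exporters (and curl) mark HttpOnly cookies with this prefix;
        # such lines are real cookies, not comments.
        if line.startswith("#HttpOnly_"):
            line = line[len("#HttpOnly_"):]
        elif line.startswith("#") or not line.strip():
            continue  # comment or blank line
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # not a well-formed cookie line
        expiry, name = fields[4], fields[5]
        # An expiry of 0 denotes a session cookie with no timestamp to check.
        if expiry.isdigit() and 0 < int(expiry) < now:
            expired.append(name)
    return expired
```

To use it, read the cookie file into a string, pass it to `expired_cookies`, and re-export the cookies from the browser if the list is not empty.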
BTW, I have the same issue, perhaps because both of these are groups on Google Apps (Business) accounts and not simply private groups.
@rhukster @spacewaffle It seems there is a problem with the cookie file generated by the browser extension. See also https://github.com/icy/google-group-crawler/issues/24#issuecomment-375856663. I have updated README.md accordingly.
Thanks a lot.
The problem was probably that the script couldn't detect the loop when an invalid cookie is provided. That should be fixed now. Moreover, the new version uses curl with better cookie string settings instead of a Netscape cookie file with wget. Please try it out.
Thanks a lot.
So I've got the cookie set up and I've actually used the scraper successfully before. I've been trying to replicate what I did before but I'm running into some issues. The scraper will run forever with logs like below.
I eventually killed the process because it was taking multiple hours even though our Google Groups forum doesn't have that many posts. I found that there were thousands of thread files generated but nothing in msgs, nothing in mbox, and no db file generated after scraping. Every thread file had the same single line of text:
https://groups.google.com/forum/?_escaped_fragment_=forum/workbarbiz
Any idea what's going on here? Also not sure if this changes things but the cookie I'm using is pretty old.
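For anyone who wants to check for the same symptom without waiting hours, here is a rough sketch that measures how many files in a directory share identical content. It is an illustration only and not part of the crawler. The function name `dominant_content_share` is hypothetical, and the directory to pass in is wherever your run stored its thread files.

```python
import hashlib
from collections import Counter
from pathlib import Path


def dominant_content_share(directory):
    """Return (share, text) for the most common file content in a directory.

    share is the fraction of files that carry exactly that content;
    text is that content decoded as UTF-8 and stripped of whitespace.
    """
    counts = Counter()
    samples = {}
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        data = path.read_bytes()
        # Hash the content so that identical files land in the same bucket.
        key = hashlib.sha256(data).hexdigest()
        counts[key] += 1
        samples.setdefault(key, data.decode("utf-8", errors="replace").strip())
    if not counts:
        return 0.0, ""
    key, count = counts.most_common(1)[0]
    return count / sum(counts.values()), samples[key]
```

A share close to 1.0 together with a one-line text that is just the forum URL would match what I saw above, and it is probably a sign to stop the run and refresh the cookies first.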