icy / google-group-crawler

[Deprecated] Get (almost) original messages from google group archives. Your data is yours.

Only retrieved the 50 most recent posts #39

Open tjluoma opened 3 years ago

tjluoma commented 3 years ago

I did this:

export _GROUP="bbedit"
./crawler.sh -sh > curl.sh
/opt/homebrew/bin/bash curl.sh

This saved more than 4,000 files, but the 'mbox' folder has only 50 items in it, while https://groups.google.com/g/bbedit says there are 4,444 messages in the group.

Is there something else I need to do? This is a public Google group so I thought that what I did was sufficient.

Thanks!

icy commented 3 years ago

@tjluoma Right, that's fine. Let me try on my laptop to see if I can reproduce your issue. Thanks

icy commented 3 years ago

@tjluoma I gave it a try, and I have 759 messages in the mbox folder so far; the count is still growing.

$ pwd
/home/foo/projects/icy/google-group-crawler/bbedit/mbox

$ ls | wc -l
826

I'd suggest turning on verbose output as shown below to see how the script is working and whether it runs into any issues. You can also open the script curl.sh and modify some of the curl options if necessary. Please let me know if your next try goes better. Thanks

$ /opt/homebrew/bin/bash -x curl.sh
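
To compare results between runs, it may also help to count the files in each subdirectory of the group folder rather than only in mbox. The following is a minimal sketch, assuming the group directory is named bbedit as in the path above; count_files is a hypothetical helper written for illustration and is not part of crawler.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of google-group-crawler):
# print "<subdirectory> <file count>" for each subdirectory of a group folder.
count_files() {
  local group_dir="$1"
  local sub
  for sub in "$group_dir"/*/; do
    # Skip the unexpanded glob when the folder has no subdirectories.
    [ -d "$sub" ] || continue
    # Count regular files directly inside the subdirectory;
    # tr strips the padding that some wc implementations add.
    printf '%s %s\n' "$(basename "$sub")" \
      "$(find "$sub" -maxdepth 1 -type f | wc -l | tr -d ' ')"
  done
}

# Example (assumes the crawl was run from the current directory):
#   count_files bbedit
```

If the count for mbox stops growing while the other counts are much higher, that would suggest the download step is being cut short, and the trace from bash -x should show which curl command it stopped at.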