datactive / bigbang

Scientific analysis of collaborative communities
http://datactive.github.io/bigbang/
MIT License

collect data from google group #280

Open Kerstiru opened 7 years ago

Kerstiru commented 7 years ago

Most of the 'mailing lists' I am looking into for my analysis unfortunately use Google Groups. See specifically alaveteli-users@googlegroups.com https://groups.google.com/forum/?utm_source=digest&utm_medium=email#!forum/alaveteli-users

and https://groups.google.com/forum/#!forum/alaveteli-dev

Is it possible to make this work?

davidberra commented 7 years ago

I'm trying to get this to work: https://gist.github.com/punchagan/7947337

Would it make sense to integrate it with collect_mail.py?

(It is a browser-based scraper, though.)

sbenthall commented 7 years ago

So far, I've been unable to find a workable scraper for Google Groups.

When I've studied Google Groups, I've found somebody who is a long-time member of the list and is willing to export the archive from their inbox, for example with Google Takeout https://takeout.google.com/settings/takeout

I think if there were a good scraper, like that punchagan script, it would certainly make sense to integrate it with BigBang.
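For the Takeout route, the export is an mbox file of the whole inbox, so the list traffic has to be separated from everything else. Here is a minimal sketch using only Python's standard `mailbox` module. The function name `messages_for_list` is made up for illustration, and it assumes the group's messages carry either a `List-ID` header of the form `<name.googlegroups.com>` or the group address in `To`/`Cc`; that should be checked against a real export.

```python
import mailbox


def messages_for_list(mbox_path, list_address):
    """Return messages in an mbox file that appear to belong to one mailing list."""
    local, _, domain = list_address.partition("@")
    # Assumption: the List-ID looks like "name.googlegroups.com".
    list_id = f"{local}.{domain}".lower()
    address = list_address.lower()

    matches = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # Look at the headers most likely to name the list.
            headers = " ".join(
                str(message.get(name, "")) for name in ("List-ID", "To", "Cc")
            ).lower()
            if list_id in headers or address in headers:
                matches.append(message)
    finally:
        box.close()
    return matches
```

Calling `messages_for_list("takeout.mbox", "alaveteli-users@googlegroups.com")` (the file name is a placeholder) would return the matching messages, which could then be written to a separate mbox with `mailbox.mbox.add`.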

hargup commented 7 years ago

I found this script: https://github.com/icy/google-group-crawler. It appears to download the messages from a Google Group, but I'm not able to import them into Thunderbird as mbox.

Kerstiru commented 7 years ago

Hey guys, thanks so much for looking into this, and apologies upfront for my non-techie contributions and questions ;) I could export the mentioned group traffic with Takeout. I'm wondering, though, whether that export covers the entire history of the group's traffic or only what I have in my own inbox history?
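One way to check this empirically is to look at the date range of the messages in the exported mbox and compare the earliest date with when the group started. As far as I understand, Takeout exports an account's own mail, so the earliest message is likely to be from around when the exporter joined the list. A small sketch, again with only the standard library; `archive_date_range` is an illustrative name, not part of BigBang:

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def archive_date_range(mbox_path):
    """Return (earliest, latest) message dates in an mbox file, or None if none parse."""
    dates = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            raw = message.get("Date")
            if raw is None:
                continue  # skip messages without a Date header
            try:
                parsed = parsedate_to_datetime(str(raw))
            except (TypeError, ValueError):
                continue  # skip malformed dates
            if parsed.tzinfo is None:
                # Treat dates without a usable offset as UTC so they compare cleanly.
                parsed = parsed.replace(tzinfo=timezone.utc)
            dates.append(parsed)
    finally:
        box.close()
    if not dates:
        return None
    return min(dates), max(dates)
```

If the earliest date is much later than the oldest thread visible in the group's web archive, the export only covers part of the history.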

Kerstiru commented 7 years ago

I found someone who built a scraper http://saturnboy.com/2010/03/scraping-google-groups/

Is this of any use for you guys or is there anything I can concretely do with that?

Also, I just used Google Takeout BUT