blanchardjeremy / google-groups-php-api

Provides access to Google Groups using a faux-API (emulating a browser to take actions as if it were a user).
http://activismlabs.org

Feature request: support export of archive content #1

Open nxg opened 13 years ago

nxg commented 13 years ago

It's the obvious feature request:

If you were able to provide a way of exporting the content of a google group -- just the dump of all the messages, nothing fancy -- I'm sure you would be a very popular person, worldwide!

(why is it so hard...!?)

Norman

blanchardjeremy commented 13 years ago

Thanks for the request!

I think this would be a bit harder to pull off. Or would have to be used with caution. It would basically pound their servers to death if you were exporting tons of messages because every message would involve a separate page-load (or 2 or 3 if there were multiple pages for a given thread).

What formats would you like to see it in? Which details do you need about each post? Do you want the posts threaded the way Google Groups threads them?

nxg commented 13 years ago

I would imagine this being used for occasional archiving dumps of groups, as a slightly paranoid backup, perhaps; or because a group has served its purpose and is being shut down; or because one wants to move a mailing list to a different service, and transfer the history from Google Groups.

For this sort of occasional use, it would be OK to throttle the process, retrieving only a message per second, or every few seconds.
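A minimal sketch of the throttling idea, assuming a caller-supplied `fetch` function that retrieves one message by ID. The names `fetch_throttled`, `fetch`, and `min_interval` are hypothetical and not part of this library:

```python
import time


def fetch_throttled(ids, fetch, min_interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Yield fetch(message_id) for each ID, starting at most one
    fetch per `min_interval` seconds.

    `fetch` is a hypothetical callable supplied by the caller; `sleep`
    and `clock` are injectable so the pacing can be tested without
    actually waiting.
    """
    last_start = None
    for message_id in ids:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last fetch began.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield fetch(message_id)
```

Because it is a generator, the export can be stopped partway through without having queued further requests.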

This is the sort of case that I'd imagine being handled by dataliberation.org, but there's no mention of Groups on the list of Google products there. If they have this sort of feature on their roadmap, that would be ideal, but they don't publish a roadmap (intelligibly).

blanchardjeremy commented 13 years ago

dataliberation.org looks awesome. Thanks for that reference.

What format should the data be exported in? RSS? Atom? mbox (I'm not familiar with it)?

Does anyone know what format is popular for this kind of export?

nxg commented 13 years ago

Any format would work. Atom or RSS would be nifty, but plain old mbox is the no-frills format into which I'd probably convert a feed for archiving.

mbox (http://en.wikipedia.org/wiki/Mbox) is a semi-standard. It's what mailers usually write out if they're asked to 'save raw email message' or something like that.
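To illustrate what an mbox export might look like, here is a rough sketch using Python's standard `mailbox` module. The post dictionary keys (`author`, `subject`, `date`, `message_id`, `body`) and the function name `write_mbox` are assumptions made for the example, not something this library provides:

```python
import mailbox
from email.message import EmailMessage


def write_mbox(path, posts):
    """Append each post to an mbox file at `path`, creating it if needed.

    `posts` is an iterable of dicts with the hypothetical keys
    author, subject, date (an RFC 2822 date string), message_id, body.
    """
    box = mailbox.mbox(path)
    try:
        for post in posts:
            msg = EmailMessage()
            msg["From"] = post["author"]
            msg["Subject"] = post["subject"]
            msg["Date"] = post["date"]
            msg["Message-ID"] = post["message_id"]
            msg.set_content(post["body"])
            box.add(msg)
    finally:
        # close() flushes pending messages to disk.
        box.close()
```

The resulting file should be readable by most mail clients and by `mailbox.mbox(path)` itself.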

blanchardjeremy commented 13 years ago

Hmm. Okay. I'd also like to investigate storing the threading of messages rather than just the flat messages. :)
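One way threading could be kept in a flat mail format is through the standard `In-Reply-To` and `References` headers. The sketch below assumes a hypothetical mapping `parent_of` from each message ID to its parent's ID (or `None` for a thread's first post). How that mapping would be scraped from Google Groups is not covered here:

```python
def threading_headers(message_id, parent_of):
    """Return the In-Reply-To and References headers for one message.

    `parent_of` is a hypothetical dict: message ID -> parent ID or None.
    Returns an empty dict for a message that starts a thread.
    """
    chain = []
    seen = {message_id}  # guards against cycles in bad data
    current = parent_of.get(message_id)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    if not chain:
        return {}
    chain.reverse()  # oldest ancestor first, direct parent last
    return {"In-Reply-To": chain[-1], "References": " ".join(chain)}
```

Mail clients that understand these headers can then rebuild the thread tree from the flat archive.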

blanchardjeremy commented 13 years ago

Remove util.php requirement in basic_tests. Closed by 2f2d02962829dfe1b356b8813f82a0a93188cdea.

blanchardjeremy commented 13 years ago

Oops. Didn't mean to close this. Sorry!