IgnoredAmbience / yahoo-group-archiver

Scrapes and archives a Yahoo Group's email archives, photo galleries, and file contents using the non-public API
MIT License

Need canonical upstream repo #20

Status: Closed (d235j closed 5 years ago)

d235j commented 5 years ago

Right now there are several divergent forks. This isn't the best for coordinating development.

A number of people in the #yahoosucks channel on EFnet IRC are working on this. Additionally there is a repo under @ArchiveTeam that will be used.

What should we use as the best / canonical upstream?

IgnoredAmbience commented 5 years ago

I'm happy for @archiveteam to be the canonical repository for this project; I haven't really got the time to support it.

IgnoredAmbience commented 5 years ago

Slight change of mind: I plan to attempt a merge of the @archiveteam repo tomorrow (#21), then check the relevance of the other PRs open against this repo and the content of other forks.

Likely to keep an option in for eml output mode, even if this does complicate the code.

Will stay on IRC to monitor ArchiveTeam's plans. Still happy for the @archiveteam repo to become canonical (unsure of the best way to achieve this).

IgnoredAmbience commented 5 years ago

The ArchiveTeam repo has now been merged in here; I'm now working through the bug reports on this repo.

dossy commented 5 years ago

Kinda disappointed that you merged in the removal of mbox-file style files (which can easily be turned into a maildir layout) in favor of ArchiveTeam's JSON change.

There's plenty of off-the-shelf tools for handling mbox/maildir (threaded viewers, email clients, etc.) ...

This is why open source is great: we can all have our own forks :)

d235j commented 5 years ago

@dossy we were running into encoding issues and at the moment, due to the shutdown timeline, accurate archival is higher priority than convenient viewing.

The plan is to add a separate JSON-to-mbox script. We'll definitely accept a PR giving us this!
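For anyone picking this up, here is a rough sketch of what such a conversion pass could look like. It is not code from this repo or from ArchiveTeam: the function name `json_dir_to_mbox`, the one-JSON-file-per-message layout, and the `rawEmail` field name are all assumptions for illustration and would need checking against the real archive output. The raw text is encoded to bytes before being handed to the standard-library `mailbox` module, so non-ASCII content is written through unchanged rather than tripping an ASCII-only check.

```python
import json
import mailbox
from pathlib import Path


def json_dir_to_mbox(json_dir, mbox_path, raw_field="rawEmail"):
    """Append the raw email found in each json_dir/*.json file to an mbox.

    Assumptions (hypothetical, not confirmed by the archiver):
      - each .json file holds one message as a JSON object
      - the full RFC 822 text lives in a string field named `raw_field`
    Returns the number of messages written.
    """
    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    written = 0
    try:
        for path in sorted(Path(json_dir).glob("*.json")):
            with open(path, encoding="utf-8") as fh:
                record = json.load(fh)
            raw = record.get(raw_field) if isinstance(record, dict) else None
            if not raw:
                continue  # skip files without a raw message
            # Pass bytes so mailbox does not reject non-ASCII text.
            box.add(raw.encode("utf-8"))
            written += 1
    finally:
        box.close()  # flushes pending writes to disk
    return written
```

Something like `json_dir_to_mbox("mygroup/email", "mygroup.mbox")` (both paths hypothetical) would then give a file that ordinary mail clients and threaded viewers can open. Any decoding or unescaping the raw field needs is deliberately left out here, since that depends on how the archive stores it.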

IgnoredAmbience commented 5 years ago

As noted, I intend to re-add it in a subsequent pass. Tbh, I should have reviewed some of the code pulled in from ArchiveTeam more carefully, as it's making life harder in some places rather than easier.

IgnoredAmbience commented 5 years ago

@dossy Is it ok if I integrate your changes into this repo?

dossy commented 5 years ago

@IgnoredAmbience I don't know how much work it'll be to integrate now that you merged in the ArchiveTeam changes, but if you have the time and inclination to do that work, go for it!