webplatform / mediawiki-conversion

Convert MediaWiki XML backup into structured raw text file tree
https://github.com/webplatform/docs

Make sure the import doesn’t expose any user email address #14

Closed renoirb closed 9 years ago

renoirb commented 9 years ago

The initial design of this converter was to use the email address each contributor gave us when they created their account.

The problem is that we can’t ask everybody whether they are OK with leaving their email address visible on every contribution they’ve made.

Also, each commit would list me (@renoirb) as the committer, dated at the time I ran the import:

git show --pretty --format=fuller foo
commit foo
Author:     Renoir Boulanger <renoir@w3.org>
AuthorDate: Fri Jul 17 21:49:43 2015 +0000
Commit:     Renoir Boulanger <personal@example.org>
CommitDate: Thu Jul 23 18:21:58 2015 -0400

To solve the issue, we’ll set the Author and Committer details to the same value, and use the same Date in both fields.
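
One way to get matching Author and Committer fields from the command line is through Git's identity environment variables. The following is a minimal sketch, not the converter's actual code: the throwaway repository, the file name, the commit message and the name, email and date values are all illustrative assumptions.

```shell
#!/bin/sh
set -eu

# Illustrative values only; a real import would take them from each wiki revision.
name="Renoirb"
email="Renoirb@docs.webplatform.org"
date="2015-07-17 21:49:43 +0000"

# Throwaway repository so the sketch does not touch real data.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo "Hello" > index.md
git add index.md

# Use the same identity and the same date for both author and committer.
GIT_AUTHOR_NAME="$name" GIT_AUTHOR_EMAIL="$email" GIT_AUTHOR_DATE="$date" \
GIT_COMMITTER_NAME="$name" GIT_COMMITTER_EMAIL="$email" GIT_COMMITTER_DATE="$date" \
  git commit -q -m "Import revision"

# Author/AuthorDate and Commit/CommitDate should now show identical values.
git show --no-patch --format=fuller HEAD
```

With commits written this way, contributions can still be looked up by address, for example with `git log --author='Renoirb@docs.webplatform.org'`.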

As for the email address, we’ll create one that concatenates the contributor username and docs.webplatform.org (e.g. Renoirb@docs.webplatform.org).
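
The address construction described above could be sketched as follows. `make_email` is a hypothetical helper name, not something from the converter, and the sketch assumes the username is usable as-is; the issue does not say how usernames containing spaces or other special characters are handled.

```shell
#!/bin/sh
set -eu

# make_email (hypothetical helper): append the docs domain to a wiki username.
make_email() {
  printf '%s@docs.webplatform.org\n' "$1"
}

make_email "Renoirb"
```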

Note that there will be no SMTP (MX) server to receive email for docs.webplatform.org, but that shouldn’t be a problem. What matters is that we can search for who made each contribution and import the data with appropriate attribution.

If somebody is OK with broadcasting their email address, they can adjust their entry in the .mailmap file.

# file in webplatform/docs repository, called .mailmap
Renoir Boulanger <renoir@w3.org> Renoir Boulanger <Renoirb@docs.webplatform.org>
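
To check how Git resolves such an entry, `git check-mailmap` can be run against the imported identity. This is a small sketch in a throwaway repository (an assumption made for illustration); the mapping line is the one shown above.

```shell
#!/bin/sh
set -eu

# Throwaway repository holding only the .mailmap file.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Preferred identity first, identity recorded by the import second.
cat > .mailmap <<'EOF'
Renoir Boulanger <renoir@w3.org> Renoir Boulanger <Renoirb@docs.webplatform.org>
EOF

# Ask Git which identity it would display for the imported one.
mapped=$(git check-mailmap "Renoir Boulanger <Renoirb@docs.webplatform.org>")
echo "$mapped"
```

In a repository with history, `git log --format='%aN <%aE>'` prints the mapped name and address for each commit, while `%an` and `%ae` keep showing what was recorded at import time.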

renoirb commented 9 years ago

Commit bfb5e95ad92767ce11507147f6614443c10501f6 will fix this issue.

renoirb commented 9 years ago

Solved!