openrightsgroup / cmp-issues

Centralised issue-tracking for the Blocked backend

Allow sign up for email reports #6

Closed JimKillock closed 10 years ago

JimKillock commented 10 years ago

Users should be able to sign up for a weekly or monthly email report on their website's status. The reports should preferably be double opt-in and, of course, cancellable.

graphiclunarkid commented 10 years ago

AIUI from @JimKillock this feature is as follows:

"Can we capture requests for regular updates about a site? I think a lot of visitors will sign up for reports, so if we can do this robustly (considering security of personal data) it will gain us a lot of benefit. Also, I guess reports should be double opt-in if possible."

Not sure what we mean by "double opt-in" - how would that work?

JimKillock commented 10 years ago

On 23 May 2014, at 16:07, Richard King wrote:

> Not sure what we mean by "double opt-in" - how would that work?

I mean “email verification” of requests for reports.

ei8fdb commented 10 years ago

A good Mailchimp explanation of double opt-in:

http://kb.mailchimp.com/article/how-does-confirmed-optin-or-double-optin-work

On 23 May 2014, at 16:25, JimKillock wrote:

> I mean “email verification” of requests for reports

graphiclunarkid commented 10 years ago

Notes from call with @jimkillock and Pam:

graphiclunarkid commented 10 years ago

Needs to have an unsubscribe option that works (preferably with a single click)!

graphiclunarkid commented 10 years ago

I've read the database and API documentation.

The "users" referred to in both documents seem to be actors with credentials that let them call API endpoints, e.g. probes or the blocked.org.uk website. As far as I can tell, the blocked-middleware database does not directly record that a "contact" (i.e. someone who has submitted URLs for us to test) has asked for reports about those URLs to be sent to an email address. We do use a formsave hook to store this information as part of the "data" JSON object in the FormIt database, though. It also seems to end up in the submission_info column of the requests table, via the additional_data array element in the following line of the SubmitURL snippet:

$data = array('email' => $cmpUSER, 'url' => $domainToCheck, 'additional_data' => http_build_query($_POST));

(I believe this says "get the entire contents of what was POSTed and dump it into the additional_data field", which the API then stores in the submission_info column.)
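A minimal sketch of the effect, written in Python for illustration rather than in the snippet's PHP: the whole POST body is flattened into one URL-encoded string, so a single preference can only be read back by parsing that string again. The form field names here are hypothetical, not taken from the real form.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; the real FormIt form may use different names.
post = {
    "url": "http://example.com",
    "email": "contact@example.org",
    "subscribereports": "1",
}

# Rough Python counterpart of PHP's http_build_query($_POST).
additional_data = urlencode(post)

# The stored string is opaque to SQL: reading one field means parsing it again.
fields = parse_qs(additional_data)
wants_reports = fields.get("subscribereports", ["0"])[0] == "1"
```

This is why a query such as "who asked for reports about this URL?" is awkward against submission_info alone.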

I think we need a new table to store details of "contacts" as opposed to "users". It might look like this:

Contacts

Contains information about how to get in touch with an actor (who may have submitted one or more URLs for testing or be running one or more probes).

| Column Name | Column Type | Purpose | Unique | Auto Inc | Required | Default value |
| --- | --- | --- | --- | --- | --- | --- |
| id | int | Unique identifier for the contact | true | true | true | N/A |
| email | string | Contact's email address | true | false | true | Empty string |
| created | datetime | Time this record was created | false | false | true | now() |
| verified | boolean | Set when the contact's email address has been verified, either by verifying a request, or by the double opt-in mechanism for the main ORG mailing list | false | false | true | false |
| joinlist | boolean | Set when the contact has subscribed to ORG's mailing list | false | false | true | false |
| name | string | Contact's given name (so we can address messages personally) | false | false | false | Empty string |

We can then create a link table for associating contacts with requests:

Requests-Contacts

Links one or more requests to a contact.

| Column Name | Column Type | Purpose | Unique | Auto Inc | Required | Default value |
| --- | --- | --- | --- | --- | --- | --- |
| id | int | Unique identifier for the link record | true | true | true | N/A |
| request | fk | An ID from the requests table | true | false | true | N/A |
| contact | fk | An ID from the contacts table | false | false | true | N/A |
| salt | string | Hash this together with URL and email address to generate a request validation token | true | false | false | Empty string |
| verified | boolean | Indicates that we have verified the contact made this request (double opt-in) | false | false | true | false |
| subscribereports | boolean | Contact wishes to receive regular email updates about this URL | false | false | true | false |
| allowcontact | boolean | Contact will accept communication from ORG about this request | false | false | false | false |
| information | string | Extra info about this URL provided by the contact | false | false | false | Empty string |
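As a rough illustration, the two proposed tables could be expressed as DDL along these lines. This is a sketch only: it uses SQLite through Python so that it runs anywhere, the SQL types and the stub requests table are assumptions, and the real middleware database may need different syntax.

```python
import sqlite3

# Assumed DDL: column names follow the proposal, everything else is illustrative.
SCHEMA = """
CREATE TABLE requests (              -- stub standing in for the existing table
    id  INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL
);
CREATE TABLE contacts (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    email    TEXT NOT NULL UNIQUE,
    created  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    verified INTEGER NOT NULL DEFAULT 0,
    joinlist INTEGER NOT NULL DEFAULT 0,
    name     TEXT DEFAULT ''
);
CREATE TABLE requests_contacts (
    id               INTEGER PRIMARY KEY AUTOINCREMENT,
    request          INTEGER NOT NULL UNIQUE REFERENCES requests(id),
    contact          INTEGER NOT NULL REFERENCES contacts(id),
    -- Defaults to NULL rather than '' here, so that the unique constraint
    -- still allows several rows that have no salt yet.
    salt             TEXT UNIQUE,
    verified         INTEGER NOT NULL DEFAULT 0,
    subscribereports INTEGER NOT NULL DEFAULT 0,
    allowcontact     INTEGER DEFAULT 0,
    information      TEXT DEFAULT ''
);
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute("INSERT INTO requests (url) VALUES ('http://example.com')")
db.execute("INSERT INTO contacts (email) VALUES ('contact@example.org')")
db.execute("INSERT INTO requests_contacts (request, contact) VALUES (1, 1)")

# A new link starts out unverified and unsubscribed.
link = db.execute(
    "SELECT verified, subscribereports FROM requests_contacts WHERE id = 1"
).fetchone()
```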

This approach lets us verify that a contact wishes to receive each individual report. If we don't do this we risk exposing contacts to others signing them up for reports about any old URLs once they've verified their addresses (i.e. we just made a perfect spam tool).
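The per-request validation token could be derived roughly as follows. This is a sketch under assumptions: the choice of SHA-256, the separator, and the function names are all hypothetical, and only the idea of hashing the salt with the URL and email address comes from the table above.

```python
import hashlib
import hmac
import secrets


def new_salt() -> str:
    """Random per-request salt, to be stored in the salt column."""
    return secrets.token_hex(16)


def validation_token(salt: str, url: str, email: str) -> str:
    """Hash the salt, URL and email address together into a hex token."""
    material = "\n".join([salt, url, email.lower()]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def token_matches(token: str, salt: str, url: str, email: str) -> bool:
    """Check a token from a confirmation link using a constant-time comparison."""
    expected = validation_token(salt, url, email)
    return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))


salt = new_salt()
token = validation_token(salt, "http://example.com", "contact@example.org")

# The token confirms this URL only; it is useless for any other URL.
same_url_ok = token_matches(token, salt, "http://example.com", "contact@example.org")
other_url_ok = token_matches(token, salt, "http://example.net", "contact@example.org")
```

Because the URL is part of the hashed material, confirming one request says nothing about any other request made in the same contact's name.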

Unsubscribe is also easy: just clear the subscribereports flag.

There might be some complex SQL required for the feature that actually figures out that the blocking state of a URL has changed and then finds and emails all the contacts who have subscribed to reports about that URL. I'm open to suggestions for structural changes that would simplify this.
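The "find everyone to email" half may turn out to be a fairly ordinary join. A sketch, assuming a cut-down version of the tables proposed above (SQLite via Python, illustrative types) and assuming that something else has already worked out which URL changed state:

```python
import sqlite3

# Cut-down, assumed schema: only the columns this query needs.
SCHEMA = """
CREATE TABLE requests (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
CREATE TABLE requests_contacts (
    request          INTEGER NOT NULL REFERENCES requests(id),
    contact          INTEGER NOT NULL REFERENCES contacts(id),
    verified         INTEGER NOT NULL DEFAULT 0,
    subscribereports INTEGER NOT NULL DEFAULT 0
);
"""

# Only verified, still-subscribed links produce a recipient.
SUBSCRIBERS_SQL = """
SELECT DISTINCT c.email
FROM requests r
JOIN requests_contacts rc ON rc.request = r.id
JOIN contacts c ON c.id = rc.contact
WHERE r.url = ?
  AND rc.verified = 1
  AND rc.subscribereports = 1
ORDER BY c.email
"""


def subscribers_for(db, url):
    """Return the email addresses that should hear about a change to url."""
    return [row[0] for row in db.execute(SUBSCRIBERS_SQL, (url,))]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany(
    "INSERT INTO requests (id, url) VALUES (?, ?)",
    [(1, "http://example.com"), (2, "http://example.com"), (3, "http://example.net")],
)
db.executemany(
    "INSERT INTO contacts (id, email) VALUES (?, ?)",
    [(1, "a@example.org"), (2, "b@example.org")],
)
db.executemany(
    "INSERT INTO requests_contacts VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1),   # a@ verified and subscribed for example.com
     (2, 2, 0, 1),   # b@ asked for example.com but never verified
     (3, 2, 1, 1)],  # b@ verified and subscribed for example.net
)

recipients = subscribers_for(db, "http://example.com")
```

Detecting the state change itself is the harder half and is not covered by this sketch.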

This structure also facilitates queries like "show me all the URLs reported by this contact" and "show me all the URLs for which this contact has subscribed to reports" for if / when we come to build the user-facing account pages.

Comments please!

FAO @dantheta, @NetworksAreMadeOfString, @mkillock

Note: we could create other link tables too, e.g.:

Users-Contacts

Links one or more users to a contact.

| Column Name | Column Type | Purpose | Unique | Auto Inc | Required | Default value |
| --- | --- | --- | --- | --- | --- | --- |
| id | int | Unique identifier for the link record | true | true | true | N/A |
| contact | fk | An ID from the contacts table | true | false | true | N/A |
| user | fk | An ID from the users table | false | false | true | N/A |

dantheta commented 10 years ago

The tables can be simplified by having a contact_id field on the users and requests tables, I think. A user can only belong to one contact (at most), and a request would be attached to a single user. The other fields on requests-contacts can also be considered properties of the request. Everything else looks cool!

Sorry for brevity, tablet typing.

dantheta commented 10 years ago

I was thinking of having a summary table which shows the latest status for a URL on each ISP. That makes it very cheap to identify changes, and will also make /status/URL cheaper too. The summary table can be maintained using triggers even, so no code changes either!

graphiclunarkid commented 10 years ago

@dantheta You're right about the tables. That's what comes from trying to think things like this through at 2am ;-)

The summary table is a good idea - though what would its structure look like? If you're thinking of having one column per ISP would that just get extended as new ISPs start submitting results?

dantheta commented 10 years ago

It would have one row per ISP/URL combination, so the list of ISPs is open-ended. When a new result comes in, the row in the summary table is updated/replaced. The layout would be very similar to the results table itself. We might add in the last block date, just like the API's return format for /status/url. I'll post a preview this weekend.
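One way such a trigger-maintained summary table might look, as a sketch only: it uses SQLite through Python so that it runs standalone, and the table names, column names and status values are assumptions rather than the project's real schema. It also assumes results are inserted in chronological order.

```python
import sqlite3

# Assumed names throughout; the real results table has more columns.
SCHEMA = """
CREATE TABLE results (
    id      INTEGER PRIMARY KEY AUTOINCREMENT,
    url     TEXT NOT NULL,
    isp     TEXT NOT NULL,
    status  TEXT NOT NULL,
    created TEXT NOT NULL
);
CREATE TABLE url_latest_status (
    url          TEXT NOT NULL,
    isp          TEXT NOT NULL,
    status       TEXT NOT NULL,
    created      TEXT NOT NULL,
    last_blocked TEXT,
    PRIMARY KEY (url, isp)           -- one row per ISP/URL combination
);
CREATE TRIGGER results_after_insert AFTER INSERT ON results
BEGIN
    -- Refresh the existing summary row, if there is one.
    UPDATE url_latest_status
    SET status = NEW.status,
        created = NEW.created,
        last_blocked = CASE WHEN NEW.status = 'blocked'
                            THEN NEW.created ELSE last_blocked END
    WHERE url = NEW.url AND isp = NEW.isp;
    -- Otherwise create it (first result for this ISP/URL pair).
    INSERT INTO url_latest_status (url, isp, status, created, last_blocked)
    SELECT NEW.url, NEW.isp, NEW.status, NEW.created,
           CASE WHEN NEW.status = 'blocked' THEN NEW.created END
    WHERE NOT EXISTS (
        SELECT 1 FROM url_latest_status
        WHERE url = NEW.url AND isp = NEW.isp
    );
END;
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany(
    "INSERT INTO results (url, isp, status, created) VALUES (?, ?, ?, ?)",
    [
        ("http://example.com", "ISP-A", "ok", "2014-06-01 10:00:00"),
        ("http://example.com", "ISP-A", "blocked", "2014-06-02 10:00:00"),
        ("http://example.com", "ISP-B", "ok", "2014-06-02 11:00:00"),
        ("http://example.com", "ISP-A", "ok", "2014-06-03 10:00:00"),
    ],
)

summary = db.execute(
    "SELECT isp, status, created, last_blocked FROM url_latest_status ORDER BY isp"
).fetchall()
```

With this layout a status lookup for a URL reads at most one row per ISP, and the code that inserts results does not need to change.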

graphiclunarkid commented 10 years ago

The above pull request implements an approximate version of the database extensions I suggested. Please let me know if you think it'll work... :8ball:

dantheta commented 10 years ago

By the way - do we have a preferred way of sending email to subscribers? Something like the MailChimp API or another service like that, or is it cool to use a PHP mailer class and the local postfix install on the webserver?

JimKillock commented 10 years ago

That's one for @gwire to answer

graphiclunarkid commented 10 years ago

@dantheta MailChimp is hosted in the US, and although that doesn't technically contravene ORG's privacy policy, we're not exactly brimming with confidence in the sanctity of the Safe Harbour provisions either ;-)

An alternative might be the German company Clever Reach - I hear they have a good API and I happen to have test keys in hand (for another ORG-related project). Otherwise, ORG uses Engaging Networks' e-Activist for its supporter communications, but I understand integration options are limited (perhaps non-existent).

dantheta commented 10 years ago

That's cool - I was just aiming to find out what you were already using. I'm equally happy with plain ol' SMTP too. I'm not sure what e-Activist's facilities are like for sending personalised notifications to individual users.

I can write the database part of the results sender and change notifier leaving the mail transport part of it 'til last, if that would be useful.

graphiclunarkid commented 10 years ago

@dantheta I've made a tentative start on code to populate the new database columns and table. Will hopefully have something for you to review before the evening is out :smile:

mkillock commented 10 years ago

I have some experience with setting up postfix to be a well-behaved mass mailer, if that helps. We'd need to do things like only send one email per 20 seconds to AOL and such, otherwise they block the server IP for 24 hours!
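In a real deployment the mail server itself would normally enforce this kind of limit. Purely to illustrate the idea at application level, here is a sketch of a per-domain throttle. The class and method names are hypothetical, and the 20-second figure is simply the example quoted above.

```python
import time
from collections import defaultdict


class DomainThrottle:
    """Space out messages so each recipient domain gets at most one per interval."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval        # e.g. {"aol.com": 20.0}, in seconds
        self.clock = clock                      # injectable so tests need not sleep
        self.next_allowed = defaultdict(float)  # domain -> earliest next send time

    def delay_for(self, address):
        """Reserve the next send slot for this address and return the wait in seconds."""
        domain = address.rsplit("@", 1)[-1].lower()
        interval = self.min_interval.get(domain, 0.0)
        now = self.clock()
        send_at = max(now, self.next_allowed[domain])
        self.next_allowed[domain] = send_at + interval
        return send_at - now


# A fixed clock keeps the example deterministic.
throttle = DomainThrottle({"aol.com": 20.0}, clock=lambda: 100.0)
delays = [
    throttle.delay_for(address)
    for address in ["x@aol.com", "y@AOL.com", "z@example.org"]
]
```

A sender loop would sleep for the returned delay before handing each message to the mail transport.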

mkillock commented 10 years ago

Yahoo likes DKIM signatures, some like SPF records, and so on. If we do this, it might be handy to have a separate subdomain for sending; otherwise we will need to track down all the IP addresses that use the main ORG domain, including e-Activist's IPs.

graphiclunarkid commented 10 years ago

The API and database can now store contact details and preferences directly. Next step: change the SubmitURL snippet to take advantage.

Example here: https://github.com/openrightsgroup/Blocking-Middleware/blob/master/example-client/example-submit.php#L28-L77

mkillock commented 10 years ago

I can do that later today

graphiclunarkid commented 10 years ago

I've tested the snippet in the above commit on staging and it seems to work OK. @mkillock: could you take a look, and if you're happy, either copy it to the live server or let me know so I can do it?

mkillock commented 10 years ago

@graphiclunarkid Looks good to me! :) I can't access the live server at the moment as I don't seem to be able to log in. Thanks for doing this even though I said I would!

mkillock commented 10 years ago

I've got in, and have copied your work over to new.blocked

graphiclunarkid commented 10 years ago

Argh! I was just doing the same...

/me goes to check he hasn't broken anything

Edit: OK, it looks fine. Thanks @mkillock!

JimKillock commented 10 years ago

You can view manager actions chronologically: Reports » Manager actions

Looks like Matt did the last set of additions, I can't see any for Richard.

graphiclunarkid commented 10 years ago

Interesting! @mkillock must have copied the code from the staging server then, and not from the git repository, as there's a sneaky extra bug-fix I was just testing on staging which has made it over to live but isn't committed to the repo!

mkillock commented 10 years ago

yup, copied from staging!

mkillock commented 10 years ago

sorry, I'll step aside

graphiclunarkid commented 10 years ago

It's OK Matt, I'm done. Thanks for your help :smile:

mkillock commented 10 years ago

OK good! I aim to be on techvols later - 8:30 BST / 7:30 GMT, right?

graphiclunarkid commented 10 years ago

@mkillock I've been away for the last couple of meetings so I don't think they've happened. I haven't heard about one tonight but I'll probably be online anyway if you wanna go over some stuff. Maybe not until about 20:30 BST though.

graphiclunarkid commented 10 years ago

This thread is getting unwieldy. Since we've technically achieved allowing people to sign up for reports properly (i.e. storing it in our database in a form we can query easily) I'm going to close this issue now. In order to get the full feature we still need to implement:

Please can we move discussions of those aspects to their respective issues? Thanks.