Allow sign up for email reports

JimKillock commented 10 years ago

Users should be able to sign up for a weekly or monthly email report on their website's status. The reports should preferably be double opt-in and, of course, cancellable.

graphiclunarkid commented 10 years ago

AIUI from @JimKillock this feature is as follows:

"Can we capture requests for regular updates about a site? I think a lot of visitors will sign up for reports, so if we can do this robustly (considering security of personal data) it will gain us a lot of benefit. Also, I guess reports should be double opt-in if possible."

Not sure what we mean by "double opt-in" - how would that work?

JimKillock commented 10 years ago

On 23 May 2014, at 16:07, Richard King notifications@github.com wrote:

Not sure what we mean by "double opt-in" - how would that work?

I mean “email verification” of requests for reports

ei8fdb commented 10 years ago

A good Mailchimp explanation of double opt-in:

http://kb.mailchimp.com/article/how-does-confirmed-optin-or-double-optin-work


graphiclunarkid commented 10 years ago

Notes from call with @jimkillock and Pam:

If there is no change, send an email every 3 months reminding subscribers of our existence
Send emails a maximum of once per day.
Send reports about ORG probes only - not crowdsourced reports. We should say as much in the emails too.
Send emails for blocked -> unblocked state changes and vice versa
Send emails as soon as we detect a state change for any ISP.
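
The rules above could be expressed as a single decision function. This is only a sketch in Python: the function name, the 90-day reading of "every 3 months", and the choice to let the once-per-day cap take priority over "as soon as we detect a state change" are all assumptions, not something agreed on the call.

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_INTERVAL = timedelta(days=1)        # "a maximum of once per day"
REMINDER_INTERVAL = timedelta(days=90)  # "every 3 months", approximated as 90 days


def email_due(now: datetime, last_contact: datetime, state_changed: bool) -> Optional[str]:
    """Return which email to send now ("change" or "reminder"), or None.

    last_contact is the time of the last email sent to this subscriber,
    or the time they subscribed if nothing has been sent yet (assumption).
    """
    elapsed = now - last_contact
    if elapsed < MIN_INTERVAL:
        return None  # daily cap: hold any change until the next run
    if state_changed:
        return "change"  # blocked -> unblocked or vice versa, on any ISP
    if elapsed >= REMINDER_INTERVAL:
        return "reminder"  # nothing changed for three months
    return None
```

A scheduled job would call this once per subscription and send nothing when it returns None.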

graphiclunarkid commented 10 years ago

Needs to have an unsubscribe option that works (preferably with a single click)!

graphiclunarkid commented 10 years ago

I've read the database and API documentation.

The "users" referred to in both documents seem to be actors with credentials that let them call API endpoints, e.g. probes or the blocked.org.uk website. As far as I can tell we're not recording directly in the blocked-middleware database the fact that a "contact" (i.e. someone who has submitted URLs for us to test) has requested reports about said URLs to be sent to an email address. We do use a formsave hook to store this information as part of the "data" JSON object in the FormIt database though. It also seems to end up in the submission_info column of the requests table via the additional_data array element in the following line of the SubmitURL snippet:

$data = array('email' => $cmpUSER, 'url' => $domainToCheck, 'additional_data' => http_build_query($_POST));

(I believe this says "get the entire contents of what was POSTed and dump it into the additional_data field", which the API then stores in the submission_info column.)

I think we need a new table to store details of "contacts" as opposed to "users". It might look like this:

Contacts: contains information about how to get in touch with an actor (who may have submitted one or more URLs for testing or be running one or more probes).

Column Name	Column Type	Purpose	Unique	Auto Inc	Required	Default value
id	int	Unique Identifier for the contact	true	true	true	N/A
email	string	Contact's email address	true	false	true	Empty string
created	datetime	Time this record was created	false	false	true	now()
verified	boolean	Set when the contact's email address has been verified, either by verifying a request, or by the double opt-in mechanism for the main ORG mailing list	false	false	true	false
joinlist	boolean	Set when the contact has subscribed to ORG's mailing list	false	false	true	false
name	string	Contact's given name (so we can address messages personally)	false	false	false	Empty string

We can then create a link table for associating contacts with requests:

Requests-Contacts: links one or more requests to a contact.

Column Name	Column Type	Purpose	Unique	Auto Inc	Required	Default value
id	int	Unique Identifier for the link record	true	true	true	N/A
request	fk	An ID from the requests table	true	false	true	N/A
contact	fk	An ID from the contacts table	false	false	true	N/A
salt	string	Hash this together with URL and email address to generate a request validation token	true	false	false	Empty string
verified	boolean	Indicates that we have verified the contact made this request (double opt-in)	false	false	true	false
subscribereports	boolean	Contact wishes to receive regular email updates about this URL	false	false	true	false
allowcontact	boolean	Contact will accept communication from ORG about this request	false	false	false	false
information	string	Extra info about this URL provided by the contact	false	false	false	Empty string

This approach lets us verify that a contact wishes to receive each individual report. If we don't do this we risk exposing contacts to others signing them up for reports about any old URLs once they've verified their addresses (i.e. we just made a perfect spam tool).
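
A minimal sketch of how the salt column could produce a per-request validation token, written in Python for illustration. The function names are hypothetical, and using HMAC-SHA256 keyed by the salt is my assumption for what "hash this together with URL and email address" could look like in practice.

```python
import hashlib
import hmac
import secrets


def new_salt() -> str:
    """Random per-request salt, stored in the link table's salt column."""
    return secrets.token_hex(16)


def validation_token(salt: str, url: str, email: str) -> str:
    """Token to embed in the verification link sent to the contact."""
    # The newline separator stops ("ab", "c") and ("a", "bc") colliding.
    message = "\n".join([url, email]).encode("utf-8")
    return hmac.new(salt.encode("utf-8"), message, hashlib.sha256).hexdigest()


def token_matches(salt: str, url: str, email: str, presented: str) -> bool:
    """Check a token from a clicked link without leaking timing information."""
    expected = validation_token(salt, url, email)
    return hmac.compare_digest(expected, presented)
```

Because the token depends on the URL as well as the address, verifying one request does not verify any other request made in the same contact's name.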

Unsubscribe is also easy: just clear the subscribereports flag.
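
As a rough illustration, here is what that could look like against a cut-down copy of the link table. It uses Python with an in-memory SQLite database purely so the sketch is self-contained; the table name requests_contacts and the function name are assumptions.

```python
import sqlite3

# Cut-down version of the proposed Requests-Contacts table (assumed name).
LINK_TABLE = """
CREATE TABLE requests_contacts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    request INTEGER NOT NULL UNIQUE,
    contact INTEGER NOT NULL,
    subscribereports INTEGER NOT NULL DEFAULT 0
)
"""


def unsubscribe(conn: sqlite3.Connection, link_id: int) -> bool:
    """Clear the subscribereports flag; return True if the link record exists."""
    cur = conn.execute(
        "UPDATE requests_contacts SET subscribereports = 0 WHERE id = ?",
        (link_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```

The row itself is kept, so the request and its verification status survive the unsubscribe.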

There might be some complex SQL required for the feature that actually figures out that the blocking state of a URL has changed and then finds and emails all the contacts who have subscribed to reports about that URL. I'm open to suggestions for structural changes that would simplify this.
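
For the "find the contacts to email" half, the query may turn out to be a fairly plain join. The sketch below uses Python with SQLite so it can run on its own; the table names, the url column on requests, and the function name are assumptions rather than the real middleware schema.

```python
import sqlite3

# Reduced versions of the tables proposed above (names are assumptions).
SCHEMA = """
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE requests (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL
);
CREATE TABLE requests_contacts (
    id INTEGER PRIMARY KEY,
    request INTEGER NOT NULL UNIQUE REFERENCES requests(id),
    contact INTEGER NOT NULL REFERENCES contacts(id),
    verified INTEGER NOT NULL DEFAULT 0,
    subscribereports INTEGER NOT NULL DEFAULT 0
);
"""


def subscribers_for(conn: sqlite3.Connection, url: str) -> list:
    """Emails of contacts with a verified, active subscription for this URL."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.email
        FROM contacts c
        JOIN requests_contacts rc ON rc.contact = c.id
        JOIN requests r ON r.id = rc.request
        WHERE r.url = ?
          AND rc.verified = 1
          AND rc.subscribereports = 1
        ORDER BY c.email
        """,
        (url,),
    )
    return [email for (email,) in rows]
```

Detecting that the blocking state has changed is a separate problem and is not covered by this query.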

This structure also facilitates queries like "show me all the URLs reported by this contact" and "show me all the URLs for which this contact has subscribed to reports" for if / when we come to build the user-facing account pages.

Comments please!

FAO @dantheta, @NetworksAreMadeOfString, @mkillock

Note: we could create other link tables too, e.g.:

Users-Contacts: links one or more users to a contact.

Column Name	Column Type	Purpose	Unique	Auto Inc	Required	Default value
id	int	Unique Identifier for the link record	true	true	true	N/A
contact	fk	An ID from the contacts table	true	false	true	N/A
user	fk	An ID from the users table	false	false	true	N/A

dantheta commented 10 years ago

The tables can be simplified by having a contact_id field on the users and requests tables, I think. A user can only belong to one contact (at most), and a request would be attached to a single user. The other fields on requests-contacts can also be considered properties of the request. Everything else looks cool!

Sorry for brevity, tablet typing.

dantheta commented 10 years ago

I was thinking of having a summary table which shows the latest status for a URL on each ISP. That makes it very cheap to identify changes, and will also make /status/URL cheaper too. The summary table can be maintained using triggers even, so no code changes either!

graphiclunarkid commented 10 years ago

@dantheta You're right about the tables. That's what comes from trying to think things like this through at 2am ;-)

The summary table is a good idea - though what would its structure look like? If you're thinking of having one column per ISP would that just get extended as new ISPs start submitting results?

dantheta commented 10 years ago

It would have one row per ISP/URL combination, so the list of ISPs is open-ended. When a new result comes in, the row in the summary table is updated/replaced. The layout would be very similar to the results table itself. We might add in the last block date, just like the API's return format for /status/url. I'll post a preview this weekend.
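
Until the preview is posted, here is a rough sketch of the idea: one summary row per URL/ISP pair, kept up to date by a trigger on the results table. It uses Python with SQLite so it runs standalone; the table and column names, the 'blocked' status value, and the status_changed column are all assumptions, and it assumes results arrive in chronological order.

```python
import sqlite3

SUMMARY_SCHEMA = """
CREATE TABLE results (
    id INTEGER PRIMARY KEY,
    url TEXT NOT NULL,
    isp TEXT NOT NULL,
    status TEXT NOT NULL,
    created TEXT NOT NULL
);
CREATE TABLE url_latest_status (
    url TEXT NOT NULL,
    isp TEXT NOT NULL,
    status TEXT NOT NULL,
    created TEXT NOT NULL,
    last_blocked TEXT,      -- when this ISP last reported the URL as blocked
    status_changed TEXT,    -- when the status last differed from the previous one
    PRIMARY KEY (url, isp)
);
CREATE TRIGGER results_to_summary AFTER INSERT ON results
BEGIN
    -- Create the summary row the first time a URL/ISP pair is seen.
    INSERT OR IGNORE INTO url_latest_status (url, isp, status, created)
    VALUES (NEW.url, NEW.isp, NEW.status, NEW.created);
    -- SET expressions read the old row, so status <> NEW.status spots a change.
    UPDATE url_latest_status
    SET status_changed = CASE WHEN status <> NEW.status
                              THEN NEW.created ELSE status_changed END,
        last_blocked = CASE WHEN NEW.status = 'blocked'
                            THEN NEW.created ELSE last_blocked END,
        status = NEW.status,
        created = NEW.created
    WHERE url = NEW.url AND isp = NEW.isp;
END;
"""


def record(conn: sqlite3.Connection, url: str, isp: str, status: str, created: str) -> None:
    """Insert a probe result; the trigger maintains url_latest_status."""
    conn.execute(
        "INSERT INTO results (url, isp, status, created) VALUES (?, ?, ?, ?)",
        (url, isp, status, created),
    )
```

A change notifier would then only need to look for summary rows whose status_changed is newer than its last run.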

graphiclunarkid commented 10 years ago

The above pull request implements an approximate version of the database extensions I suggested. Please let me know if you think it'll work... :8ball:

dantheta commented 10 years ago

By the way - do we have a preferred way of sending email to subscribers? Something like mailchimp API or another service like that, or is it cool to use a PHP mailer class and the local postfix install on the webserver?

JimKillock commented 10 years ago

That's one for @gwire to answer

graphiclunarkid commented 10 years ago

@dantheta MailChimp is hosted in the US, and although that doesn't technically contravene ORG's privacy policy, we're not exactly brimming with confidence in the sanctity of the Safe Harbour provisions either ;-)

An alternative might be the German company Clever Reach - I hear they have a good API and I happen to have test keys in hand (for another ORG-related project). Otherwise, ORG uses Engaging Networks' e-Activist for its supporter communications, but I understand integration options are limited (perhaps non-existent).

dantheta commented 10 years ago

That's cool - I was just aiming to find out what you were already using. I'm equally happy with plain ol' SMTP too. I'm not sure what e-Activist's facilities are like for sending personalised notifications to individual users.

I can write the database part of the results sender and change notifier leaving the mail transport part of it 'til last, if that would be useful.

graphiclunarkid commented 10 years ago

@dantheta I've made a tentative start on code to populate the new database columns and table. Will hopefully have something for you to review before the evening is out :smile:

mkillock commented 10 years ago

I have some experience with setting up postfix to be a well-behaved mass mailer, if that helps. We'd need to do things like only sending one email per 20 seconds to AOL and such, otherwise they block the server IP for 24 hours!

mkillock commented 10 years ago

Yahoo like DKIM signatures, some like SPF records and so on. If we do this, it might be handy to have a separate subdomain for sending, since an SPF record has to list every server permitted to send for the domain; otherwise we will need to track down all the IP addresses that use the main ORG domain, including e-Activist's IPs.
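
To make the per-provider pacing concrete, here is a small sketch of a throttle that spaces out messages to the same recipient domain. It is written in Python with hypothetical names; in a real deployment the mail server itself would normally do this, and the 20-second figure for AOL is just the one quoted above.

```python
from typing import Dict


class DomainThrottle:
    """Work out the earliest time each message may be sent, per recipient domain."""

    def __init__(self, min_gap_seconds: Dict[str, float]) -> None:
        self.min_gap = min_gap_seconds          # e.g. {"aol.com": 20.0}
        self.last_sent: Dict[str, float] = {}   # domain -> last scheduled send time

    def next_send_time(self, recipient: str, now: float) -> float:
        """Reserve and return the send time (in seconds) for this recipient."""
        domain = recipient.rsplit("@", 1)[-1].lower()
        gap = self.min_gap.get(domain, 0.0)     # unlisted domains are not throttled
        previous = self.last_sent.get(domain)
        earliest = now if previous is None else max(now, previous + gap)
        self.last_sent[domain] = earliest
        return earliest
```

With a 20-second gap for aol.com, three messages queued at roughly the same moment would be scheduled 20 seconds apart, while mail to other domains goes out immediately.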

graphiclunarkid commented 10 years ago

The API and database can now store contact details and preferences directly. Next step: change the SubmitURL snippet to take advantage.

Example here: https://github.com/openrightsgroup/Blocking-Middleware/blob/master/example-client/example-submit.php#L28-L77

mkillock commented 10 years ago

I can do that later today

graphiclunarkid commented 10 years ago

I've tested the snippet in the above commit on staging and it seems to work OK. @mkillock: could you take a look, and if you're happy, either copy it to the live server or let me know so I can do it?

mkillock commented 10 years ago

@graphiclunarkid Looks good to me! :) I can't access the live server at the moment as I don't seem to be able to log in. Thanks for doing this even though I said I would!

mkillock commented 10 years ago

I've got in, and have copied your work over to new.blocked

graphiclunarkid commented 10 years ago

Argh! I was just doing the same... /me goes to check he hasn't broken anything

Edit: OK, it looks fine. Thanks @mkillock!

JimKillock commented 10 years ago

You can view manager actions chronologically: Reports » Manager actions

Looks like Matt did the last set of additions, I can't see any for Richard.

graphiclunarkid commented 10 years ago

Interesting! @mkillock must have copied the code from the staging server then, and not from the git repository, as there's a sneaky extra bug-fix I was just testing on staging which has made it over to live but isn't committed to the repo!

mkillock commented 10 years ago

yup, copied from staging!

mkillock commented 10 years ago

sorry, I'll step aside

graphiclunarkid commented 10 years ago

It's OK Matt, I'm done. Thanks for your help :smile:

mkillock commented 10 years ago

OK good! I aim to be on techvols later - 8:30 BST / 7:30 GMT, right?

graphiclunarkid commented 10 years ago

@mkillock I've been away for the last couple of meetings so I don't think they've happened. I haven't heard about one tonight but I'll probably be online anyway if you wanna go over some stuff. Maybe not until about 20:30 BST though.

graphiclunarkid commented 10 years ago

This thread is getting unwieldy. Since we've technically achieved allowing people to sign up for reports properly (i.e. storing it in our database in a form we can query easily) I'm going to close this issue now. In order to get the full feature we still need to implement:

Results summary table (I think @dantheta may have done this now though)
Double opt-in for emails / verification of addresses: #18
A script to schedule and send out the reports: #50
Our own email server or a service that will do the sending for us: #51
Unsubscribe from email reports: #52

Please can we move discussions of those aspects to their respective issues? Thanks.

openrightsgroup / cmp-issues

Allow sign up for email reports #6