Closed JimKillock closed 10 years ago
AIUI from @JimKillock this feature is as follows:
"Can we capture requests for regular updates about a site? I think a lot of visitors will sign up for reports, so if we can do this robustly (considering security of personal data) it will gain us a lot of benefit. Also, I guess reports should be double opt-in if possible."
Not sure what we mean by "double opt-in" - how would that work?
On 23 May 2014, at 16:07, Richard King notifications@github.com wrote:
> Not sure what we mean by "double opt-in" - how would that work?
I mean “email verification” of requests for reports
A good Mailchimp explanation of double opt-in:
http://kb.mailchimp.com/article/how-does-confirmed-optin-or-double-optin-work
On 23 May 2014, at 16:25, JimKillock notifications@github.com wrote:
> I mean "email verification" of requests for reports
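For anyone else unsure what double opt-in means in practice, here is a minimal sketch of the two steps. It is not taken from any ORG code: the in-memory `pending` store and both function names are hypothetical, and a real version would email the token as a confirmation link rather than return it.

```python
import secrets

# Hypothetical in-memory store: token -> {"email": ..., "confirmed": bool}
pending = {}

def request_subscription(email):
    """Step 1: record the request as unconfirmed and issue a one-time token.
    A real system would email this token to the address as a link."""
    token = secrets.token_urlsafe(16)
    pending[token] = {"email": email, "confirmed": False}
    return token

def confirm_subscription(token):
    """Step 2: the address owner follows the link; only now is the request active."""
    record = pending.get(token)
    if record is None:
        return False  # unknown or mistyped token
    record["confirmed"] = True
    return True

token = request_subscription("someone@example.org")
print(confirm_subscription(token))          # True
print(confirm_subscription("not-a-token"))  # False
```

The point is that nothing is sent to an address until whoever controls that address has acted on the confirmation message.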
Notes from call with @jimkillock and Pam:
Needs to have an unsubscribe option that works (preferably with a single click)!
I've read the database and API documentation.
The "users" referred to in both documents seem to be actors with credentials that let them call API endpoints, e.g. probes or the blocked.org.uk website. As far as I can tell we're not recording directly in the blocked-middleware database the fact that a "contact" (i.e. someone who has submitted URLs for us to test) has requested reports about said URLs to be sent to an email address. We do use a formsave hook to store this information as part of the "data" JSON object in the FormIt database, though. It also seems to end up in the `submission_info` column of the `requests` table via the `additional_data` array element in the following line of the SubmitURL snippet:

`$data = array('email' => $cmpUSER, 'url' => $domainToCheck, 'additional_data' => http_build_query($_POST));`

(I believe this says "get the entire contents of what was POSTed and dump it into the `additional_data` field", which the API then stores in the `submission_info` column.)
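To illustrate why that makes the preference awkward to query, here is a rough Python analogue of what `http_build_query($_POST)` produces and what it takes to get a single field back out. The form field names are hypothetical; the real SubmitURL form may post different keys.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; the real form may use other names.
posted = {
    "url": "http://example.com",
    "email": "someone@example.org",
    "subscribereports": "1",
}

# Roughly what ends up in additional_data / submission_info: one opaque string.
additional_data = urlencode(posted)
print(additional_data)

# To learn whether reports were requested, every row has to be parsed again.
recovered = parse_qs(additional_data)
wants_reports = recovered.get("subscribereports") == ["1"]
print(wants_reports)
```

Because the flag only exists inside an encoded string, the database cannot filter or index on it, which is the motivation for dedicated columns.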
I think we need a new table to store details of "contacts" as opposed to "users". It might look like this:
Contacts

Contains information about how to get in touch with an actor (who may have submitted one or more URLs for testing or be running one or more probes).
Column Name | Column Type | Purpose | Unique | Auto Inc | Required | Default value |
---|---|---|---|---|---|---|
id | int | Unique Identifier for the contact | true | true | true | N/A |
email | string | Contact's email address | true | false | true | Empty string |
created | datetime | Time this record was created | false | false | true | now() |
verified | boolean | Set when the contact's email address has been verified, either by verifying a request, or by the double opt-in mechanism for the main ORG mailing list | false | false | true | false |
joinlist | boolean | Set when the contact has subscribed to ORG's mailing list | false | false | true | false |
name | string | Contact's given name (so we can address messages personally) | false | false | false | Empty string |
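As a sanity check on the proposed columns, here is a sketch of the table as DDL, run against an in-memory SQLite database purely so it is easy to try. Assumptions: the table is called `contacts`, the email-address column is called `email`, booleans are stored as 0/1 integers, and SQLite stands in for whatever database the middleware actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE contacts (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        email    TEXT    NOT NULL UNIQUE,
        created  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
        verified INTEGER NOT NULL DEFAULT 0,   -- boolean stored as 0/1
        joinlist INTEGER NOT NULL DEFAULT 0,   -- boolean stored as 0/1
        name     TEXT    DEFAULT ''
    )
""")

conn.execute(
    "INSERT INTO contacts (email, name) VALUES (?, ?)",
    ("someone@example.org", "Sam"),
)

# The unique constraint should reject a second record for the same address.
try:
    conn.execute("INSERT INTO contacts (email) VALUES (?)", ("someone@example.org",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row = conn.execute("SELECT verified, joinlist FROM contacts").fetchone()
print(duplicate_rejected, row)
```

New contacts start unverified and off the mailing list, matching the default values in the table above.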
We can then create a link table for associating contacts with requests:
Requests-Contacts

Links one or more requests to a contact.
Column Name | Column Type | Purpose | Unique | Auto Inc | Required | Default value |
---|---|---|---|---|---|---|
id | int | Unique Identifier for the link record | true | true | true | N/A |
request | fk | An ID from the requests table | true | false | true | N/A |
contact | fk | An ID from the contacts table | false | false | true | N/A |
salt | string | Hash this together with URL and email address to generate a request validation token | true | false | false | Empty string |
verified | boolean | Indicates that we have verified the contact made this request (double opt-in) | false | false | true | false |
subscribereports | boolean | Contact wishes to receive regular email updates about this URL | false | false | true | false |
allowcontact | boolean | Contact will accept communication from ORG about this request | false | false | false | false |
information | string | Extra info about this URL provided by the contact | false | false | false | Empty string |
This approach lets us verify that a contact wishes to receive each individual report. If we don't do this we risk exposing contacts to others signing them up for reports about any old URLs once they've verified their addresses (i.e. we just made a perfect spam tool).
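A possible shape for the per-request validation token built from the salt, URL and email address is sketched below. SHA-256, the newline separator and the function names are all my assumptions; nothing here is settled.

```python
import hashlib
import hmac
import secrets

def make_salt():
    """Random per-link salt, stored in the salt column."""
    return secrets.token_hex(16)

def validation_token(salt, url, email):
    """Hash the salt together with the URL and email address.
    SHA-256 is an assumption; no hash function has been chosen yet."""
    material = "\n".join([salt, url, email]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()

def token_matches(candidate, salt, url, email):
    """Constant-time comparison of a submitted token with the expected one."""
    expected = validation_token(salt, url, email)
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))

salt = make_salt()
token = validation_token(salt, "http://example.com", "someone@example.org")
print(token_matches(token, salt, "http://example.com", "someone@example.org"))
print(token_matches(token, salt, "http://example.net", "someone@example.org"))
```

Because the URL is part of the hashed material, a token issued for one URL cannot be replayed to confirm a subscription to a different one, which is the spam-tool risk described above.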
Unsubscribe is also easy: just clear the `subscribereports` flag.
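In code, the unsubscribe step might look roughly like this. The table name `requests_contacts` and the `unsubscribe` helper are hypothetical, and the sketch looks the row up by its numeric id only for brevity; a real single-click link should carry an unguessable token instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE requests_contacts (
        id               INTEGER PRIMARY KEY,
        request          INTEGER NOT NULL,
        contact          INTEGER NOT NULL,
        subscribereports INTEGER NOT NULL DEFAULT 0
    )
""")
conn.executemany(
    "INSERT INTO requests_contacts (id, request, contact, subscribereports) "
    "VALUES (?, ?, ?, ?)",
    [(1, 10, 1, 1), (2, 11, 1, 1)],
)

def unsubscribe(conn, link_id):
    """Clear the flag for one request/contact link; True if a row was updated."""
    cur = conn.execute(
        "UPDATE requests_contacts SET subscribereports = 0 WHERE id = ?",
        (link_id,),
    )
    conn.commit()
    return cur.rowcount == 1

unsubscribe(conn, 1)
remaining = conn.execute(
    "SELECT id FROM requests_contacts WHERE subscribereports = 1"
).fetchall()
print(remaining)
```

Only the one subscription is cleared; the contact's other report subscriptions are left alone.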
There might be some complex SQL required for the feature that actually figures out that the blocking state of a URL has changed and then finds and emails all the contacts who have subscribed to reports about that URL. I'm open to suggestions for structural changes that would simplify this.
This structure also facilitates queries like "show me all the URLs reported by this contact" and "show me all the URLs for which this contact has subscribed to reports" for if / when we come to build the user-facing account pages.
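Here is a sketch of both kinds of query against a cut-down version of the proposed tables. The `url` column on `requests`, the underscore table name and the sample rows are assumptions for illustration, and the "has the blocking state changed" detection is left out entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requests (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE requests_contacts (
        id               INTEGER PRIMARY KEY,
        request          INTEGER NOT NULL REFERENCES requests(id),
        contact          INTEGER NOT NULL REFERENCES contacts(id),
        verified         INTEGER NOT NULL DEFAULT 0,
        subscribereports INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO requests VALUES (1, 'http://example.com'), (2, 'http://example.net');
    INSERT INTO contacts VALUES
        (1, 'a@example.org'), (2, 'b@example.org'), (3, 'c@example.org');
    INSERT INTO requests_contacts (request, contact, verified, subscribereports) VALUES
        (1, 1, 1, 1),   -- verified and subscribed
        (1, 2, 0, 1),   -- subscribed but never verified
        (1, 3, 1, 0),   -- verified but unsubscribed
        (2, 2, 1, 1);
""")

def subscribers_for(conn, url):
    """Emails of contacts with a verified, active subscription to this URL."""
    rows = conn.execute("""
        SELECT DISTINCT c.email
        FROM requests r
        JOIN requests_contacts rc ON rc.request = r.id
        JOIN contacts c ON c.id = rc.contact
        WHERE r.url = ? AND rc.verified = 1 AND rc.subscribereports = 1
        ORDER BY c.email
    """, (url,)).fetchall()
    return [email for (email,) in rows]

def urls_subscribed_by(conn, email):
    """URLs for which this contact has a verified, active subscription."""
    rows = conn.execute("""
        SELECT DISTINCT r.url
        FROM contacts c
        JOIN requests_contacts rc ON rc.contact = c.id
        JOIN requests r ON r.id = rc.request
        WHERE c.email = ? AND rc.verified = 1 AND rc.subscribereports = 1
        ORDER BY r.url
    """, (email,)).fetchall()
    return [url for (url,) in rows]

print(subscribers_for(conn, "http://example.com"))
print(urls_subscribed_by(conn, "b@example.org"))
```

Unverified and unsubscribed links drop out in the `WHERE` clause, so the mailer never needs to re-check those flags itself.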
Comments please!
FAO @dantheta, @NetworksAreMadeOfString, @mkillock
Note: we could create other link tables too, e.g.:
Users-Contacts

Links one or more users to a contact.
Column Name | Column Type | Purpose | Unique | Auto Inc | Required | Default value |
---|---|---|---|---|---|---|
id | int | Unique Identifier for the link record | true | true | true | N/A |
contact | fk | An ID from the contacts table | true | false | true | N/A |
user | fk | An ID from the users table | false | false | true | N/A |
The tables can be simplified by having a contact_id field on the users and requests tables, I think. A user can only belong to one contact (at most), and a request would be attached to a single user. The other fields on requests-contacts can also be considered properties of the request. Everything else looks cool!
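A sketch of what that simplification could look like for requests, with the link-table fields folded into the request row. The exact column set and the `url` column are assumptions, and the `users` side is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE requests (
        id               INTEGER PRIMARY KEY,
        url              TEXT NOT NULL,
        contact_id       INTEGER REFERENCES contacts(id),  -- NULL if no contact given
        subscribereports INTEGER NOT NULL DEFAULT 0,
        allowcontact     INTEGER NOT NULL DEFAULT 0,
        information      TEXT DEFAULT ''
    );
    INSERT INTO contacts VALUES (1, 'a@example.org');
    INSERT INTO requests (url, contact_id, subscribereports) VALUES
        ('http://example.com', 1, 1),
        ('http://example.net', 1, 0),
        ('http://example.org', NULL, 0);
""")

# One join instead of two to list a contact's subscribed URLs.
rows = conn.execute("""
    SELECT r.url
    FROM requests r
    JOIN contacts c ON c.id = r.contact_id
    WHERE c.email = ? AND r.subscribereports = 1
""", ("a@example.org",)).fetchall()
subscribed_urls = [url for (url,) in rows]
print(subscribed_urls)
```

The trade-off is that each request row can point at only one contact, which matches the "a request would be attached to a single user" assumption above.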
Sorry for brevity, tablet typing.
I was thinking of having a summary table which shows the latest status for a URL on each ISP. That makes it very cheap to identify changes, and will also make /status/URL cheaper too. The summary table can be maintained using triggers even, so no code changes either!
@dantheta You're right about the tables. That's what comes from trying to think things like this through at 2am ;-)
The summary table is a good idea - though what would its structure look like? If you're thinking of having one column per ISP would that just get extended as new ISPs start submitting results?
It would have one row per ISP/URL combination, so the list of ISPs is open-ended. When a new result comes in, the row in the summary table is updated/replaced. The layout would be very similar to the results table itself. We might add in the last block date, just like the API's return format for /status/url. I'll post a preview this weekend.
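Until the preview is posted, here is a rough guess at how a trigger-maintained summary table could work, using SQLite so it can be run anywhere. The table names, column names and status values are all assumptions; the real results table will differ, and the last block date is not modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (
        id      INTEGER PRIMARY KEY,
        url     TEXT NOT NULL,
        isp     TEXT NOT NULL,
        status  TEXT NOT NULL,
        created TEXT NOT NULL
    );

    -- One row per URL/ISP combination, always holding the newest result.
    CREATE TABLE url_latest_status (
        url     TEXT NOT NULL,
        isp     TEXT NOT NULL,
        status  TEXT NOT NULL,
        created TEXT NOT NULL,
        PRIMARY KEY (url, isp)
    );

    CREATE TRIGGER results_after_insert AFTER INSERT ON results
    BEGIN
        INSERT OR REPLACE INTO url_latest_status (url, isp, status, created)
        VALUES (NEW.url, NEW.isp, NEW.status, NEW.created);
    END;
""")

conn.executemany(
    "INSERT INTO results (url, isp, status, created) VALUES (?, ?, ?, ?)",
    [
        ("http://example.com", "ISP-A", "ok", "2014-06-01 10:00:00"),
        ("http://example.com", "ISP-B", "blocked", "2014-06-01 10:05:00"),
        ("http://example.com", "ISP-A", "blocked", "2014-06-02 09:00:00"),
    ],
)

summary = conn.execute(
    "SELECT isp, status FROM url_latest_status ORDER BY isp"
).fetchall()
print(summary)
```

Three results collapse to two summary rows, and adding a new ISP needs no schema change because it simply becomes another row.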
The above pull request implements an approximate version of the database extensions I suggested. Please let me know if you think it'll work... :8ball:
By the way - do we have a preferred way of sending email to subscribers? Something like mailchimp API or another service like that, or is it cool to use a PHP mailer class and the local postfix install on the webserver?
That's one for @gwire to answer
@Dantheta MailChimp is hosted in the US, and although that doesn't technically contravene ORG's privacy policy, we're not exactly brimming with confidence in the sanctity of the Safe Harbour provisions either ;-)
An alternative might be the German company Clever Reach - I hear they have a good API and I happen to have test keys in hand (for another ORG-related project). Otherwise, ORG uses Engaging Networks' e-Activist for its supporter communications, but I understand integration options are limited (perhaps non-existent).
That's cool - I was just aiming to find out what you were already using. I'm equally happy with plain ol' SMTP too. I'm not sure what e-Activist's facilities are like for sending personalised notifications to individual users.
I can write the database part of the results sender and change notifier leaving the mail transport part of it 'til last, if that would be useful.
@dantheta I've made a tentative start on code to populate the new database columns and table. Will hopefully have something for you to review before the evening is out :smile:
I have some experience with setting up Postfix to be a well-behaved mass mailer, if that helps. We'd need to do things like only send one email per 20 seconds to AOL and the like, otherwise they block the server IP for 24 hours!
Yahoo like DKIM signatures, some like SPF records and so on. If we do this, it might be handy to have a separate subdomain for sending; otherwise we'll need to track down all the IP addresses that send mail for the main ORG domain, including e-Activist's IPs.
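To make the per-provider throttling concrete, here is a small sketch of the bookkeeping involved. The class and its method names are hypothetical, and in practice the mail server can probably be configured to enforce the delay itself; this only illustrates the logic. Times are passed in explicitly so the behaviour is easy to check.

```python
class DomainThrottle:
    """Tracks the last send time per recipient domain (hypothetical helper)."""

    def __init__(self, min_interval_seconds):
        # e.g. {"aol.com": 20.0}; domains not listed are not throttled
        self.min_interval_seconds = min_interval_seconds
        self.last_sent = {}

    @staticmethod
    def domain_of(address):
        return address.rsplit("@", 1)[-1].lower()

    def seconds_to_wait(self, address, now):
        """How long to hold this message before it is safe to send."""
        domain = self.domain_of(address)
        last = self.last_sent.get(domain)
        if last is None:
            return 0.0
        interval = self.min_interval_seconds.get(domain, 0.0)
        return max(0.0, last + interval - now)

    def record_send(self, address, now):
        self.last_sent[self.domain_of(address)] = now

throttle = DomainThrottle({"aol.com": 20.0})
throttle.record_send("first@aol.com", now=0.0)
print(throttle.seconds_to_wait("second@aol.com", now=5.0))     # 15.0
print(throttle.seconds_to_wait("other@example.org", now=5.0))  # 0.0
```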
The API and database can now store contact details and preferences directly. Next step: change the SubmitURL snippet to take advantage.
Example here: https://github.com/openrightsgroup/Blocking-Middleware/blob/master/example-client/example-submit.php#L28-L77
I can do that later today
I've tested the snippet in the above commit on staging and it seems to work OK. @mkillock: could you take a look, and if you're happy, either copy it to the live server or let me know so I can do it?
@graphiclunarkid Looks good to me! :) I can't access the live server at the moment as I don't seem to be able to log in. Thanks for doing this even though I said I would!
I've got in, and have copied your work over to new.blocked
Argh! I was just doing the same... /me goes to check he hasn't broken anything Edit: OK, it looks fine. Thanks @mkillock!
You can view manager actions chronologically: Reports » Manager actions
Looks like Matt did the last set of additions, I can't see any for Richard.
Interesting! @mkillock must have copied the code from the staging server then, and not from the git repository, as there's a sneaky extra bug-fix I was just testing on staging which has made it over to live but isn't committed to the repo!
yup, copied from staging!
sorry, I'll step aside
It's OK Matt, I'm done. Thanks for your help :smile:
OK good! I aim to be on techvols later - 8:30 BST / 7:30 GMT, right?
@mkillock I've been away for the last couple of meetings so I don't think they've happened. I haven't heard about one tonight but I'll probably be online anyway if you wanna go over some stuff. Maybe not until about 20:30 BST though.
This thread is getting unwieldy. Since we've technically achieved allowing people to sign up for reports properly (i.e. storing it in our database in a form we can query easily) I'm going to close this issue now. In order to get the full feature we still need to implement:
Please can we move discussions of those aspects to their respective issues? Thanks.
Users should be able to sign up for an email report every week / month on their website's status. The reports should preferably be double opt-in and, of course, cancellable.