backdrop / backdrop-issues

Issue tracker for Backdrop core.

Separate checking for updates from collecting anonymous data #3168

Open klonos opened 6 years ago

klonos commented 6 years ago

...this is related to #285

Describe your issue or idea

During today's dev meeting, it was proposed that we should always check for updates, but we should not have this functionality be bound with also collecting anonymous data of project versions usage.

Steps to reproduce/Actual behavior

During the installer phase, we have a checkbox that allows people to choose whether their site should check for updates AND send anonymous usage data while doing that.

Expected behavior (if reporting a bug)

jenlampton commented 6 years ago

Checking for updates should be on by default, and there need not be a checkbox for it during installation (we would still offer a post-installation option to disable that).

Do we need this option?

We should make sure that we have up-to-date, detailed info on what sorts of data are being collected and for what reason(s).

I added this into our privacy policy for GDPR: https://backdropcms.org/privacy But we'll need to update it if/when any changes are made to the data we collect.

klonos commented 6 years ago

Do we need this option?

I was referring to these options (the "manually only" one in specific):

screen shot 2018-06-15 at 7 11 01 pm

If we remove them, then it would be a feature regression. Right?

jenlampton commented 6 years ago

Is there a use-case for not checking for updates? It seems like this is a feature nobody needs.

klonos commented 6 years ago

I can only think of unwanted network traffic/requests ATM.

klonos commented 6 years ago

...although this guy here started the thread because he was having issues with slow page loading, this is what sticks out of his comment for me:

Every other program in the world lets you turn them off automatic updates, so I expect that there should be a way to do the same for WordPress, but if there is, it is a little too-well-hidden.

I guess that people just expect this to be an option, and an easily-findable one too. Perhaps at some point in the future this won't be an issue, with the new generation of people that are "trained" to the whole "we know better than the end user" philosophy (read MS Win 10 autoupdates, Chrome/Firefox autoupdates, Apple/Steve Jobs etc).

Here's another story: https://bugzilla.redhat.com/show_bug.cgi?id=1517758

I guess with Backdrop, it comes down to these points:

a) is exposing this through the admin UI an 80/20 case?
b) will removing this feature "upset" current/future product users?

If we are to implement automatic (security) updates (#2018), we will need a corner somewhere in the admin UI to shove the respective settings for it (including enabling, disabling, frequency, and level of automation). I feel that if we were to remove these settings now, we would need to re-introduce them (or something similar) in the near future. So why bother now and waste energy/time on something that will need cleanup/overhaul soon(ish)?

jenlampton commented 6 years ago

...although this guy here started the thread because he was having issues with slow page loading

Since Backdrop will be doing its update checking as part of cron, and cron uses background-fetch, it shouldn't affect page load speed at all.

I guess that people just expect this to be an option, and an easily-findable one too.

Checking update data and actually performing automatic updates are completely different things. One just gets you information and is almost completely harmless, and the other has the potential of completely FOOBARing your site :)

Question: what should we call the new core module that handles reporting your site's data back to backdropcms.org? Ideas from today's meeting:

klonos commented 6 years ago

...although not active, the "metrics" namespace is already taken: https://www.drupal.org/project/metrics

olafgrabienski commented 6 years ago

"Metrics" may be difficult to translate in some languages, but actually I don't understand the meaning of "Metrics" very well. Is it related to https://en.wikipedia.org/wiki/Software_metric?

klonos commented 6 years ago

@olafgrabienski I would say that Wikipedia is right, but I prefer Google's definition: https://www.google.com.au/search?q=define+metrics

technical a system or standard of measurement.

My personal understanding is this: "a set of data/numbers that help you measure something" ..."metrao" in modern Greek literally means "to count", and I think that "metrics" is derived from that 😄

olafgrabienski commented 6 years ago

@klonos I see! The German translation of "metrics" according to Google Translate is "Metrik", but in German "Metrik" is better known and mostly used to describe the rhythm of poems (metrical feet like dactyl or iamb). This may be a particular issue in German, and I guess we'll find a good translation anyway.

klonos commented 6 years ago

...that was the original usage of the word in ancient Greek too 😄...from the same Google definition page:

screen shot 2018-06-25 at 5 36 48 am

...which leads to:

screen shot 2018-06-25 at 5 39 40 am
jenlampton commented 5 years ago

Update: we have decided to call the other module "Telemetry" and there is a separate issue for that, over here: https://github.com/backdrop/backdrop-issues/issues/285

For this issue, I'm having second thoughts about doing it at all.

During today's dev meeting, it was proposed that we should always check for updates, but we should not have this functionality be bound with also collecting anonymous data of project versions usage.

Why? Have we even had any requests to stop sending anonymized data?

I think the only real issue here is that we are sending an IP address so that we can tell all the sites apart. Why don't we just hash the IP address so that no recognizable data is ever stored? ~See https://github.com/backdrop/backdrop-issues/issues/3688~ See backdrop-contrib/project#39
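As a rough illustration of the hashing idea above (a sketch only, not the code that backdrop-contrib/project actually uses), a server could keep a keyed digest of the IP address instead of the address itself. The function name `anonymize_ip` and the secret key are hypothetical. A keyed hash (HMAC) is used here as an assumption, because a plain unsalted hash of an IPv4 address can be reversed by simply trying every possible address.

```python
import hashlib
import hmac


def anonymize_ip(ip_address: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of the IP address.

    Only the digest would be stored, so sites can still be told apart
    without keeping the raw address. (Hypothetical helper.)
    """
    return hmac.new(secret_key, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical server-side secret; in practice it would live outside the code.
key = b"hypothetical-server-side-secret"

# Addresses from the documentation range (RFC 5737), used purely as examples.
site_a = anonymize_ip("203.0.113.7", key)
site_b = anonymize_ip("203.0.113.8", key)
```

The same address always maps to the same digest, so counting distinct sites still works, while two different addresses produce different digests.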

klonos commented 5 years ago

Sorry @jenlampton, I missed your last comment. Yes, I agree that we should do #3688, but we should also perhaps make the data-sending opt-out at the very least.

The ethical thing to do is to at least provide a way for people to disable it if they have any concerns (does not have to be during the installation process).

jenlampton commented 3 years ago

There isn't any way for us to stop sending data if we don't also stop the update checking (like we support currently) but we could remove any data that might be identifiable.

If, however, we never sent anything that was identifiable in the first place (which we should be doing anyway, see https://github.com/backdrop-contrib/project/issues/39) then there would be nothing left for us to do here.

Rather than providing people with an option for us to do the right thing, we should just always do the right thing :)

quicksketch commented 3 years ago

I'm going to take a swipe at this. @klonos and I have discussed it separately and again this week in the weekly meeting: https://www.youtube.com/watch?v=4iI44ooJGzA

The scope of the changes will be:

Note that we also stopped storing IP addresses on BackdropCMS.org as part of the latest changes to Project module (the https://github.com/backdrop-contrib/project/issues/39 issue @jenlampton linked to above).

As for why we would want to provide this option at all: I think some sites might want to specifically exclude being tracked just to prevent throwing off usage statistics, such as large multisite installations or development environments. We may actually want to use this setting ourselves on the BackdropCMS.org demo sandboxes.

jenlampton commented 3 years ago

This will hurt adoption.

We already have a really hard time convincing people that Backdrop has large enough usage to warrant giving it a try. They are used to seeing Drupal's numbers that do count automated testing and dev sites and everything we've already removed from Backdrop's usage tracking. Drupal doesn't provide this option, so by adding it we are making our apples appear even smaller.

I think some sites might

I really don't like adding a feature for a theoretical use-case. We have an 80% rule. This feature would not qualify.

We've already solved the actual problem in backdrop-contrib/project#39, so my vote would be to close this issue and move on to other things that we know won't hurt us, and will provide a benefit to actual people.

stpaultim commented 3 years ago

If I understand this issue correctly, the goal is to make it possible for sites to check for available module updates without providing any anonymous data about what modules they are using.

On the surface, this seems like a good thing to do. But, I share the concerns raised by @jenlampton

1) Is this really a problem that needs to be solved?
2) Are we making it too easy for users to not share any usage data (I'm not really hearing from anyone that the collection of this anonymous data is a problem).
3) Will this affect our usage statistics in a detrimental way - without any real gain?

I'm not ready to vote against this feature. But, I do have reservations about the cost/benefit/damage equation and how it might impact the usefulness of our statistics/metrics.

I wish I had a better sense of how accurate our data is now and how this might impact it.

quicksketch commented 3 years ago

PR https://github.com/backdrop/backdrop/pull/3730 is up with the following UI changes:

Checkbox added at admin/reports/updates/settings: image

Installer screenshot before changes: image

Installer section after changes:

Previously this setting disabled checking for updates on cron, but Update module was already enabled anyway. Now this option enables Telemetry module and determines if update checking sends the site_key (and thus tracking), but it always enables daily update checking.
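To make the behavior described above concrete, here is a minimal sketch of the decision, assuming the update request is a plain URL and the `site_key` travels as a query parameter. The server URL, the function name `build_update_url`, and the `send_usage_data` flag are all hypothetical and are not taken from the PR; only the idea that the setting controls whether `site_key` is sent comes from the description.

```python
from urllib.parse import urlencode

# Hypothetical update server; the real endpoint is not given in this thread.
UPDATE_SERVER = "https://updates.example.org/release-history"


def build_update_url(project: str, site_key: str, send_usage_data: bool) -> str:
    """Build the URL for an update check.

    The check itself always happens; the site_key (which is what makes
    the request countable as usage data) is attached only when the
    site has opted in.
    """
    url = f"{UPDATE_SERVER}/{project}/1.x"
    if send_usage_data:
        url = f"{url}?{urlencode({'site_key': site_key})}"
    return url


tracked = build_update_url("views", "abc123", send_usage_data=True)
untracked = build_update_url("views", "abc123", send_usage_data=False)
```

In this sketch both calls still reach the update server, so the site learns about new releases either way; only the opted-in request carries an identifier.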

Regarding some of Tim's points:

Is this really a problem that needs to be solved?

I think checking for updates shouldn't require being tracked, whether the data is anonymous or not. Like I said above, there are other reasons besides privacy why you might not want a site to be tracked.

Are we making it too easy for users to not share any usage data (I'm not really hearing from anyone that the collection of this anonymous data is a problem).

I don't think so. There is already a checkbox on the installer to disable update checking and users already turn off Update module when they don't want any checking at all.

Will this affect our usage statistics in a detrimental way - without any real gain?

I think the language of the installer is probably the thing that is going to have the biggest impact whether users enable this option or not. "Check for updates automatically" sounds less ominous than "Send anonymous usage information".

jenlampton commented 3 years ago

Like I said above, there are other reasons besides privacy why you might not want a site to be tracked.

Other theoretical reasons. Nothing anyone has ever asked for, or mentioned needing.

I think the language of the installer is probably the thing that is going to have the biggest impact whether users enable this option or not. "Check for updates automatically" sounds less ominous than "Send anonymous usage information".

I agree. So we're intentionally making it worse with this PR?

quicksketch commented 3 years ago

So we're intentionally making it worse with this PR?

We're tracking more information with that same option, so yes I think it is intentional.

I guess my concern hinges around having one option for all data collection. It's difficult to describe having two checkboxes for different types of data collection. If we ask users if it's okay to send usage information, I think that should be comprehensive of all usage data. It may be complicated to describe how Telemetry data differs from Update usage data, especially if we start tracking installed core modules with Telemetry.

klonos commented 3 years ago

I understand the potential implications, but I'm still in favor of this change. Providing people with the option to completely opt out of metrics and usage data is the ethical thing to be doing.

We could add a big piece of text in https://backdropcms.org/project/usage and any https://backdropcms.org/project/usage/% page, that says something along the lines of:

Backdrop allows sites to completely disable sending anonymous usage data, so the actual usage numbers are most likely higher than what they appear on this table. For more information, see [link to the blog post where we announce that we have allowed that kind of freedom]()

bugfolder commented 3 years ago

Providing people with the option to completely opt out of metrics and usage data is the ethical thing to be doing.

Agreed. And it should be easy and obvious.

jenlampton commented 3 years ago

Okay, if that's the case then we definitely shouldn't be adding this to the installer, and this issue should be tabled until we can come up with a viable alternative.

The installer currently offers people the option to do something they want: "Do you want to check for updates?" which is nice and friendly and inviting -- and is harmless.

This PR is proposing replacing that offer of a feature with something that instead says "WE ARE DOING SOMETHING SCARY AND DANGEROUS AND YOU SHOULD TURN IT OFF". That's not what we want as a first impression for Backdrop. (This also means telemetry should NOT be enabled by default yet)

There are contrib modules for Drupal that handle disabling of update checking / usage tracking (which is NOT anonymous). Since this is a < 80% use-case, perhaps the most sensible alternative would be to kick this whole problem to contrib?

edit: see https://www.drupal.org/project/update_notifications_disable edit: see also https://www.drupal.org/project/update_advanced

klonos commented 3 years ago

The installer currently offers people the option to do something they want: "Do you want to check for updates?" which is nice and friendly and inviting -- and is harmless.

The idea is to have checking for updates always enabled - without asking people about it, which would basically remove the checkbox from the installer. Till now, we were not able to do this, since the current "Check for updates automatically" checkbox is actually doing 2 things:

Also, that single checkbox, as well as the functionality behind it, seems like a "binding" thing, which shouldn't be the case for something as important as checking for (security) updates. We (Drupal and Backdrop) are basically indirectly telling people that "if you want to know whether there are security updates for your site, you HAVE to give us (some of) your data - there is no way out of it".

WE ARE DOING SOMETHING SCARY AND DANGEROUS AND YOU SHOULD TURN IT OFF

Then it sounds like we should fix that. Let's add text that is friendly/inviting, explanatory, and thankful:

jenlampton commented 3 years ago

Let's add text that is friendly/inviting, explanatory, and thankful:

This new option is much, much better. I also think we can add this checkbox to the existing form without changing anything else if we wanted to enable telemetry for new installs. (But that should go into a follow-up issue for telemetry)

The idea is to have checking for updates always enabled - without asking people about it, which would basically remove the checkbox from the installer.

We probably should have been doing this since Backdrop 1.0.0 anyway. It was recommended way back in https://github.com/backdrop/backdrop-issues/issues/467 (But that is also probably a separate issue?)

Till now, we were not able to do this

We've been talking about removing this checkbox from the installer (without objection) since 2014. If we want to remove it, we certainly can. Removing the checkbox from the installer is unrelated to this update/tracking issue.

I suspect you've drawn the correlation between the two issues because you already know how things work, but there's no way anyone new to Backdrop would know one thing had anything to do with the other -- certainly not from the install wizard (which admittedly is a problem). By the time the 1% of people who care find out, they'll use the modules page to opt out, or (in the case of multi-site) they already install in different ways where they wouldn't encounter the checkbox anyway.

sending tracking/usage data, which we are not telling people about 👎🏼 (not ethical)

We've resolved the main ethics issue by making the data anonymous. The only remaining problem here is one of documentation. People need to know how things work, yes, but a message on the status report is probably enough to accomplish that.

If we are going to add support for a "tinfoil hat"* edge-case to core -- especially one that has negative consequences for the whole Backdrop community if people were to use it -- we should certainly put the option someplace where only those who actually need it will find it. If you show this option to someone who doesn't understand how it works (like on the installer) their instinct might lead them in the wrong direction, and that would be bad for everyone.

klonos commented 3 years ago

OK, so here's what I think should happen then: