TheSpaghettiDetective / OctoPrint-Obico

GNU Affero General Public License v3.0

Privacy: Loads trackers and ads from google #51

Closed disconn3ct closed 4 years ago

disconn3ct commented 4 years ago

Related to #30. The issue is not the quality of the content (which is quite good) but the fact that embedding it means all of those trackers are loaded and run every time you open OctoPrint. This is very effective usage tracking which is explicitly NOT anonymous and cannot be opted out of.

As a simple fix, using links to those pages instead of embedding the content will not only simplify the UI (making it faster and less memory-intensive) but also remove the tracking.

kennethjiang commented 4 years ago

First, they are not "trackers" or "ads". What #30 referred to were the links to TSD's GitHub, Twitter and Facebook accounts. Please show some evidence that they really "track" your browsing history or show you ads before calling them "trackers" or "ads".

Second, those links were removed from the plugin a long time ago, since they were far down at the bottom of the page and not being seen by users anyway. The only link we have on the plugin page is the link to the TSD video, which is really just there to help users understand what TSD is.

Third, we are actually considering showing ads to Free users as one of the ways to finance the servers we have to run to support these free users. For anyone who cares more about not seeing any ads than about having TSD maintained and improved on a long term basis, please uninstall this plugin.

I understand privacy - nobody wants to be tracked or shown ads. Myself included. So we have made sure we don't do anything that is beyond any legal or moral boundary. But I also understand that a lot of wonderful products (such as Google itself) can't exist without some level of "tracking" or "ads".

Please provide facts and use common sense when you are making allegations like this.

I will leave this issue open for 2 more days so that you have a chance to respond.

disconn3ct commented 4 years ago

When you embed that video in the UI, it is loaded when the UI is loaded. This tells google that your account, on this specific browser, loaded the octoprint page. That sounds exactly like a usage tracker. On top of that, the video iframe makes a ton of requests to doubleclick.net and google ads. (FWIW this was observed on a private host, so there is no server you have to run.)

I provided facts above, and can continue to do so. The simple fact is I caught this because it is the ONLY plugin I have that triggers all those external requests and attempts to load ads. (I didn't have to play in safe mode; the requests stopped as soon as it was removed. Similarly, running the version from my fork with the youtube div cut out stopped it too.)

I don't feel that your hostility is warranted or appropriate in this discussion, but it is your community and you can shape it how you like.

kennethjiang commented 4 years ago

I tried to stick to what I think of as facts. If they came across as hostility, my apologies for that!

I embedded the YouTube video the way they recommend. Since it is in an iframe, I believe whatever is being sent to doubleclick.net and google ads should be just the fact you viewed a youtube video, not anything outside the iframe.

I spent tens of hours creating this video to help users understand TSD better. Embedding a YouTube video is the best way I can think of, and something millions of people do without problems (at least none that I'm aware of). If you have a better way to deliver this video to users without the parts that make you uncomfortable, please send a PR and I'll merge it. But unfortunately I won't remove this video just because a few users don't like how embedded YouTube videos work.

disconn3ct commented 4 years ago

If you read it as

the fact you viewed a [page containing a specific] youtube video [only embedded by a specific app in very specific ways]

it becomes a bit more clear. Don't forget that google's defaults are intentionally in service of collecting and monetizing user data. The default video embedding is no different.

As an alternative, consider nocookie: https://www.ghacks.net/2018/05/23/why-you-should-always-use-youtubes-privacy-enhanced-mode/

"When enabled, YouTube won't store information about visitors to pages on your site that have YouTube videos embedded on them unless visitors interact with those videos."

kennethjiang commented 4 years ago

I know I can't convince you. Unfortunately you haven't convinced me that embedding a YouTube video is a privacy problem, or that we should spend our very limited engineering resources solving this "problem".

If you intend to make a PR as I suggested earlier, we will thank you for that and happily merge it. Otherwise, I'll see this as an impasse and close this issue.

Please let me know.

jneilliii commented 4 years ago

As an alternative, consider nocookie:

@disconn3ct so something like this is what you propose?

https://github.com/jneilliii/OctoPrint-TheSpaghettiDetective/tree/patch-1

If so @kennethjiang could easily pull that change in for Privacy-Enhanced Mode and prevent the YouTube Video from automatically binding and loading unless you click the link to show the video.

foosel commented 4 years ago

@kennethjiang please consider merging this or even just switching to youtube-nocookie. I'd prefer not to have to police plugins due to stuff like this, but anything that could track your users without their consent is bad news. See also this snippet from the registration guidelines:

[Image: snippet from the plugin registration guidelines]

That also holds true for anything like sentry and similar stuff btw. I've so far been very lenient in that regard but if I keep getting reports about stuff like this I'll have to put my foot down at some point, especially if it also impacts loading times as claimed in #438.

kennethjiang commented 4 years ago

@foosel This is just an iframe linked to a video hosted on YouTube. I understand there are people who think of anything related to Google as "tracking" but I don't see how this is "tracking" from a technical point of view (but I could be wrong).

Every time a user has opened a ticket like this, I have tried to engage them in a technical discussion and explain that the only thing I want to achieve here is so that new users can view the video and see how TSD works (because it's a hard concept for many users). And I'm always open to a PR that achieves this (hopefully very benign) goal.

@jneilliii Do you want to send a PR for the change you made? The link you gave doesn't seem like a patch. Thanks!

jneilliii commented 4 years ago

PR submitted from that patch branch. You'll still need to bump the version in setup.py and make an official release with the new version.

kennethjiang commented 4 years ago

Cool thanks @jneilliii !

foosel commented 4 years ago

the only thing I want to achieve here is so that new users can view the video and see how TSD works (because it's a hard concept for many users)

That is perfectly valid, and in no way limited by switching from youtube.com to youtube-nocookie.com. Not wanting to be tracked through youtube cookies (what videos are being watched when in what browser and so on) is a totally valid concern, and a reasonable expectation of users running a piece of software on their own LAN.

Respect your users' privacy is all I'm saying, especially if it's as trivially done as slightly changing the youtube embed.
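To illustrate how small the embed change is, here is a minimal Python sketch that rewrites a standard YouTube embed URL to the privacy-enhanced domain. The function name `to_privacy_enhanced` and the `VIDEO_ID` placeholder are hypothetical and not taken from the plugin; the actual plugin embeds the video in its template, so this only shows the host swap itself.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts that serve the standard YouTube embed player.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com"}
# Host used by YouTube's privacy-enhanced mode.
NOCOOKIE_HOST = "www.youtube-nocookie.com"


def to_privacy_enhanced(embed_url: str) -> str:
    """Return embed_url with its host swapped to youtube-nocookie.com.

    Hypothetical helper: anything that is not a YouTube /embed/ URL
    is returned unchanged.
    """
    parts = urlsplit(embed_url)
    if parts.hostname in YOUTUBE_HOSTS and parts.path.startswith("/embed/"):
        # Only the host changes; path and query string are preserved.
        parts = parts._replace(netloc=NOCOOKIE_HOST)
    return urlunsplit(parts)


# VIDEO_ID is a placeholder, not the real TSD video ID.
print(to_privacy_enhanced("https://www.youtube.com/embed/VIDEO_ID?rel=0"))
```

The path and query string are left alone, so any player parameters already on the embed keep working.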

On that note also please add an opt-in for the (error) tracking you perform via sentry in your plugin, as required by the plugin repository guidelines. Yes, I know that is annoying and less useful this way to you as the developer, but not every user would like to have their usage data sent somewhere, even anonymously and even if only in case of an error. Privacy first. Thank you.
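As a rough sketch of what an opt-in could look like, the Python snippet below initializes error reporting only when the user has explicitly enabled it. The settings key `error_reporting_enabled` and the function `init_error_reporting` are hypothetical names chosen for illustration, not the plugin's actual code; the reporting backend is passed in as a callable so that nothing is sent unless it is invoked.

```python
from typing import Callable, Mapping

# Hypothetical settings key; a real plugin may name it differently.
OPT_IN_KEY = "error_reporting_enabled"


def init_error_reporting(
    settings: Mapping[str, object],
    dsn: str,
    init: Callable[..., None],
) -> bool:
    """Call init(dsn=dsn) only if the user has explicitly opted in.

    A missing or non-True value counts as "no", so the default for a
    fresh install is that no error data leaves the machine.
    Returns True if reporting was initialized.
    """
    if settings.get(OPT_IN_KEY) is not True:
        return False
    init(dsn=dsn)
    return True
```

In a real plugin the `init` argument would presumably be `sentry_sdk.init`, which accepts a `dsn` keyword argument, and `settings` would come from the plugin's stored user settings. The key design choice is that the absence of a setting means "off".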

kennethjiang commented 4 years ago

Didn't know it was just a domain switch!

I knew you'd bring up Sentry. :)

The TSD plugin by definition sends data, including the webcam feed, to our backend server. This is why we put our TOS and privacy links front and center in the wizard and settings page (screenshots attached).

My understanding is that this makes a sentry opt-in redundant, since sentry data is anonymized and hence less sensitive than the other data the plugin sends.

My biggest worry with making sentry opt-in is, as you are probably also aware, only a tiny fraction of users will opt in. I'm relying on sentry to give me a pulse on the health of the end-to-end system (bugs introduced in a new version, a backend API that is broken, etc). Since TSD doesn't have as many users as OctoPrint does, the users who opt in to sentry are unlikely to reach the critical mass that gives me that pulse any more... :(

[Two screenshots attached: Screen Shot 2020-04-20 at 12 36 30 PM, Screen Shot 2020-04-20 at 12 39 20 PM]

foosel commented 4 years ago

My biggest worry with making sentry opt-in is, as you are probably also aware, only a tiny fraction of users will opt in. I'm relying on sentry to give me a pulse on the health of the end-to-end system (bugs introduced in a new version, a backend API that is broken, etc). Since TSD doesn't have as many users as OctoPrint does, the users who opt in to sentry are unlikely to reach the critical mass that gives me that pulse any more... :(

I understand this issue, I really do - I face it myself as you know. Most of OctoPrint's lifetime I was flying completely blind, and it was infuriating. What I can tell you from my own experience is that people are surprisingly cooperative with regards to getting their usage and such tracked if you are straightforward about why this data helps development, how you are collecting it, what you are collecting, and that they can opt out again at any point in time (see tracking.octoprint.org). Even more so if you also share your insights from the data with them.

And yes, you will never get the agreement from all users. But that should actually give you the hint that if given the choice most users don't want to be tracked. It's not okay to collect people's data without their knowledge to make your life as a developer easier, and anything but an opt-in boils down to "without their knowledge".

It would have been absolutely trivial for me to build usage tracking in OctoPrint in a covert way without anyone noticing. It would have saved me a lot of trouble to do an opt-out instead of an opt-in. But being very open about it and doing it with opt-in is the bloody right thing to do, especially in this day and age where every other app takes the easy path and just sells their users' souls to the tracking industry, even if it means more work to me and a smaller data base to boot.

OctoPrint as a platform is supposed to be different here. So, I'm sorry, but I must insist: Sentry needs to be made opt-in. Currently it isn't even opt-out apart from uninstalling your plugin, and a note somewhere in a long privacy policy isn't adequate notice that the plugin tracks users as soon as they install it, even before they set it up. And to be honest, I should have addressed this much sooner, but I was buried in other stuff (as usual). This youtube situation just now reminded me about it. And yes, when other plugins that do tracking without opt-in come to my attention, they'll get the exact same answer.

That being said, if you want to monitor that the whole end-to-end system works, you really should look into proper service monitoring that doesn't require your users to opt in to sentry. There's a myriad of good tools for achieving that out there. Personally I run a mix of netdata, uptimerobot and all that hooked into a telegram bot to keep an eye on OctoPrint's infrastructure.

kennethjiang commented 4 years ago

Thank you so much @foosel for explaining to me why you chose to do it the hard way. I think I'm convinced that in an era of many companies trading users' privacy for money, it's probably a good idea to go the extra mile in transparency.

I'll make Sentry an opt-in in the next release, which is probably in a couple of weeks. Just opened an issue here: https://github.com/TheSpaghettiDetective/OctoPrint-TheSpaghettiDetective/issues/58. Let me know if you want me to make a separate patch sooner than that.

I do use backend monitoring tools but they are good at monitoring up/down status and server performance, not at catching bugs. An alternative to sentry is to build a very comprehensive regression testing suite, which I could probably pull off if I had 72 hours a day. But since you were flying blind and still managed to build OctoPrint to today's quality, I think I am not completely hopeless here. ;)

BTW I'm using statuscake but it looks like uptimerobot is better. Will switch to that!