Closed onitake closed 6 years ago
Opt-in patch, still needs to be translated:
diff --git a/plugins/SliceInfoPlugin/SliceInfo.py b/plugins/SliceInfoPlugin/SliceInfo.py
index 0514c4da..2c3d6ba3 100755
--- a/plugins/SliceInfoPlugin/SliceInfo.py
+++ b/plugins/SliceInfoPlugin/SliceInfo.py
@@ -36,11 +36,11 @@ class SliceInfo(Extension):
     def __init__(self):
         super().__init__()
         Application.getInstance().getOutputDeviceManager().writeStarted.connect(self._onWriteStarted)
-        Preferences.getInstance().addPreference("info/send_slice_info", True)
+        Preferences.getInstance().addPreference("info/send_slice_info", False)
         Preferences.getInstance().addPreference("info/asked_send_slice_info", False)
         if not Preferences.getInstance().getValue("info/asked_send_slice_info"):
-            self.send_slice_info_message = Message(catalog.i18nc("@info", "Cura collects anonymised slicing statistics. You can disable this in the preferences."),
+            self.send_slice_info_message = Message(catalog.i18nc("@info", "Cura may optionally collect anonymised slicing statistics. You can enable this feature in the preferences."),
                                                    lifetime = 0,
                                                    dismissable = False,
                                                    title = catalog.i18nc("@info:title", "Collecting Data"))
Can I barge in and ask what "SliceInfo" includes?
See here: https://github.com/Ultimaker/Cura/blob/master/plugins/SliceInfoPlugin/SliceInfo.py#L65
Current time, Cura version (and OS), active mode (recommended/custom), all setting values for the printer, extruders and per-model settings, a hash of the meshdata (not the meshdata itself) and its transformations, and the output device type (USB, OctoPrint, local file, removable drive, UM3 network).
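To make the "hash of the meshdata (not the meshdata itself)" item concrete, here is a minimal sketch of how such a record could be assembled. The function names `hash_mesh` and `build_slice_info`, the field names, and the choice of SHA-256 are assumptions for illustration only, not Cura's actual implementation; the linked SliceInfo.py is the authoritative source.

```python
import hashlib
import json
import struct
import time


def hash_mesh(vertices):
    """Return a SHA-256 hex digest of a list of (x, y, z) float triples.

    Only the digest would be transmitted. Identical meshes give identical
    digests, but the geometry is not contained in the digest.
    """
    digest = hashlib.sha256()
    for x, y, z in vertices:
        # Pack as little-endian 32-bit floats so the result does not
        # depend on the platform's native byte order.
        digest.update(struct.pack("<3f", x, y, z))
    return digest.hexdigest()


def build_slice_info(vertices, settings, output_device, version):
    """Assemble a hypothetical statistics record (field names are illustrative)."""
    return {
        "time": int(time.time()),
        "version": version,
        "output_device": output_device,
        "settings": dict(settings),
        "mesh_hash": hash_mesh(vertices),  # digest only, no vertex data
    }


if __name__ == "__main__":
    triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    record = build_slice_info(triangle, {"layer_height": 0.1}, "USB", "3.2.0")
    print(json.dumps(record, indent=2))
```

The record carries the digest and the settings but never the vertex list, which is the distinction the list above draws.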
We always asked if the data may be collected (and a user can opt-out). We never collected this data without asking.
None of this data is traceable to a user in any way. We can't do this, as that would mean we would have to use the opt-in system (as auto enabled systems are not allowed to gather personal information of any kind). As far as I know, you don't even need to ask for this kind of collection, but still provide a way to opt-out if you really really want to.
The new European privacy laws are pretty darn strict. The fines that are set on violating them are quite heavy (up to 20 million or 4% of revenue, whichever is higher). So even if you don't trust us to not collect data that we aren't supposed to, you can still trust the government to do something about it.
I protect my privacy in a fanatic way. My question was not meant to tell you that I don't trust you guys, but to help you. How does this help you? Well, being direct has its benefits. If I had known that you are not collecting any private information AND that you are not collecting the mesh itself, I would have checked this checkbox long ago and allowed you to collect my settings. The fact that I, and I guess many others like me, did not know exactly what you're collecting made me turn this option OFF, so you are not getting useful information to work with.
I have kinda hijacked this thread, so I will move aside now.
Would it be possible to add a button or direct link to the first-run popup that allows disabling this without having to search for the option in the settings?
We've had this discussion multiple times already, both on GitHub and internally, and as we are currently fully compliant with legislation I doubt the behaviour will change anytime soon. We really need the data to improve Cura, and if you're really sensitive about it, it's still quite easy to disable it.
I think being a bit more explicit what is being collected (and directly providing a button to disable it) is a good compromise. At the very least, we should have a "show more info" button so that you can see what we send.
Yes, that would be great, @nallath !
As I said, I'm not against the feature personally, but others feel more strongly about it. I'd rather keep it in a way that satisfies both sides than having negative repercussions.
(and directly providing a button to disable it)
Adding that button to disable it has been an ambition of mine since I saw that message pop up for the first time. It's an ease of use that many other applications have but Cura lacked. So I made a PR and we'll see what comes of it once the team has to consider this seriously.
This has little to do with being compliant with legislation, but rather with being compliant with morality. Cura's sample size is big enough to make meaningful decisions anyway.
See also #337.
Thanks, @Ghostkeeper .
I didn't see the button in 3.1 yet, will it be in 3.2? If yes, I think this implementation should address https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=884709 sufficiently.
3.2 will have a button that takes you to the preferences window where you can disable it.
Note, as far as my understanding of the Debian project goes, based on more than 15 years of experience in the project, a 'call home' feature like this needs to be disabled by default in Debian and only enabled if the user actively decides to change the default.
I would suggest the initial dialog should empower users to make this decision for themselves. Something like this might work for that (the wording assumes the submission is encrypted with TLS).
Would you like to send anonymous usage data to Ultimaker B.V.?
+---------------------------+ +------------------------+
| Send anonymous usage data | | Do not send usage data |
+---------------------------+ +------------------------+
If you decide to, this data will be available to Ultimaker B.V.
* IP address
* Current time
* Cura version
* Operating system
* Active mode (recommended/custom)
* All settings for the printer, extruders and each model
* A hash of the meshdata (not the meshdata itself) and its transformations
* The output device type (USB etc)
This data will be available to your Internet provider and other intermediaries such as cloud providers
* That you are sending usage data to Ultimaker
You can turn off/on sending usage data at any time in the preferences.
The initial dialog should also state what Ultimaker is doing with this data and what is the purpose of the collection.
We made changes to this flow in 3.2, I don't expect it to change again soon since we need to find a balance between collecting valuable (but anonymous) data and how easy it is to turn that off. I do agree we could specify why and what we collect in more detail.
We don't collect the IP address though, that's user identifiable information and forbidden by law. All other ways of collecting data are allowed, as long as we can't trace it back to unique users.
I was assuming your data collection end-point is fronted by a web server and that it has the default logging setup, which usually includes IP addresses.
[Paul Wise]
I was assuming your data collection end-point is fronted by a web server and that it has the default logging setup, which usually includes IP addresses.
And even if it is not stored on disk at the collection point, the IP address is sent from the user to the collection point due to how TCP/IP works, so it is provided to everyone listening on the traffic as well as the collection point itself. It would be misleading to not include the IP address when listing the set of information passed on to Ultimaker.
But if the upstream default is to send data, I believe the Debian package needs to carry a patch to change the default.
-- Happy hacking Petter Reinholdtsen
@pabs3 we delete that data immediately, we don't store it anywhere.
Anyways, none of this can be decided by me or another developer; those are management-level decisions. In my own opinion, turning it off is so easy that it's not really a problem.
If it's required for the proper functioning of a feature, I don't think it should be seen as collecting data. If that's the case, every single website would need to provide a warning. That seems a bit overkill to me.
This is quite obviously a "won't fix" because nobody within Cura considers it an issue. Recent versions have a button within the notification to allow the user to disable it, so there shouldn't be any issue.
Also, like many tech-oriented people, I care a lot about my digital privacy. But I also understand that all businesses in the world, for example a restaurant or a shop, need to keep track of what their customers order so they can buy ingredients, or which items sell well so they can do inventory management. Tracking your customers' behavior (anonymously) is important in virtually every business, and it holds true for improving FOSS as well. If you care at all about the good of Cura you will understand why it's necessary, and I think we're all smart enough to know that if it's not the default behavior, 99% of users will not enable it simply because they don't care one way or another.
Thank you for the clear statement regarding your view on opt-in vs. opt-out. I see things differently, and find that tracking customers without their informed and explicit consent is unacceptable. And if I understand GDPR correctly, it will soon be against the privacy rules in the EU.
As the user's IP address is transferred to the collection point (required to get the TCP connection working), the collection is transferring personal information according to the law, at least here in Norway, and thus the claim about it being anonymous statistics is obviously not quite accurate.
-- Happy hacking Petter Reinholdtsen
I see things differently, and find that tracking customers without their informed and explicit consent is unacceptable.
I'd love to go shopping with you and watch you tell the manager of the store that they need to destroy all internal records of your purchase and not factor you into their inventory management.
Anyway...
If the regulations change, it will be Ultimaker's legal department, not a GitHub comment thread, that enforces compliance. Obviously closing an issue doesn't mean we can't discuss it further, but don't expect any changes to Cura's source code on this topic for the time being.
[Ian Paschal]
I'd love to go shopping with you and watch you tell the manager of the store that they need to destroy all internal records of your purchase and not factor you into their inventory management.
You are welcome to join me. In the mean time, try paying with cash and not a card. Then there is no personal information to remove. I strongly recommend it, I do it all the time.
-- Happy hacking Petter Reinholdtsen
You are welcome to join me. In the mean time, try paying with cash and not a card. Then there is no personal information to remove. I strongly recommend it, I do it all the time.
Then you've just made the data the store collects anonymous. They're still tracking how many customers they had, and what they bought, so that they can better manage their staffing and inventory in the future. Or in the case of a restaurant, knowing what groceries to order and how much to prep the day before.
We do the same kind of thing, getting aggregate data on, say, which features our customers use the most, so that we can spend the most time on those. The data is quite boring and is not used for advertisements or sold to anyone. Its only function, like the shop or restaurant data, is to better serve our customers. Unlike those examples though, you have the freedom to opt out if you're keen on not helping us improve your user experience.
I'd say that this discussion is very relevant at this moment, since the GDPR becomes enforceable in 7 weeks. At that point IP addresses will be considered personal data (recital 30 of the GDPR).
However, the original topic of this thread is a serious issue: the GDPR specifically bans pre-ticked boxes as a form of consent, such as our pre-ticked preference for data collection (recitals 32 and 43).
I don't know what this entails for previous releases of Cura that we already released. Maybe we'll need to block connections from them server-side. If we want to keep collecting statistics from Cura 3.3 we'd need to comply within the next two weeks before the stable is released.
See above, closed that thread to continue the discussion all in one place.
I see this issue from three perspectives:
GDPR does appear to consider IP addresses personal information, which is a bit dumb, because as @nallath pointed out, the Internet doesn't really work without exchanging this information. Frankly GDPR puts virtually every website in existence right now out of compliance so in the back of my head this law already has "not really enforceable" written all over it. That being said, I think we should try for the sake of liability.
Since IP addresses do count as personal information and even if we do not store them, we do receive them, I don't think it's a very solid defense to say, "We super duper promise we destroyed the data immediately after receiving it," I do think this puts Cura (again, like literally everything ever) within the jurisdiction of GDPR.
That means that the buttons need to do what they say. It's a very easy fix and we'll likely still have thousands and thousands of data points from all the cool people who are happy to help us improve the software, but, it should mean that we don't have to worry about GDPR compliance because it's explicitly opting in, and very clear what the buttons do.
I don't think it's a very solid defense to say, "We super duper promise we destroyed the data immediately after receiving it," I do think this puts Cura (again, like literally everything ever) within the jurisdiction of GDPR.
Legally that is a valid defense though. Morally, not so much.
The "Disable" button should either be called what it does (e.g. "Modify in Preferences") or should do what it's called (e.g. actually disable telemetry).
"Modify in Preferences" still does not describe what it does; pressing the button does not modify anything. "Open Preferences" would be a description of what the button does.
On a positive note, I think disabling SliceInfo is sufficiently easy, so the "opt-out later if you change your mind" is covered.
Concerning assumption of "previous consent", I think this should be addressed. As I understand, the consent message on first start was shown again in 3.1/3.2, even if the setting was previously enabled. If it is decided to improve the message, perhaps even add a checkbox to explicitly enable SliceInfo, it should definitely be shown again.
This doesn't cover users of previous versions of Cura, however. Of course, it's fair to request them to update the software, but this may not be an option for everybody, for various reasons. Perhaps it might actually be necessary to "burn" the previous API endpoint and use a new one, with a clear flag that consent was given?
On the other hand, I think the GDPR allows for cases where data is implicitly provided (like IP addresses) but explicitly not collected. I'm not sure how to address this from a legal point of view, however.
Holy shit I'm un-following this thing. Drags on for waaay too long.
Also, we don't store the IP addresses, they are stripped to country-level data in the request handling, so never stored on disk or in a database.
It is a solid defence. Storing the data is not allowed, receiving it is. If we need information as a part of providing a service we are allowed to use and keep it without permission for as long as we need it to do the thing we need to do.
What does that mean in this case (the "default" handling of a HTTP request)? We get to store the IP for as long as it takes to send a request back (as this is an unavoidable requirement of HTTP). Once the message is sent back, we lose this "right" as we no longer need to send a message, which puts us in violation if we keep it. So you are ever so slightly (but importantly) misunderstanding what GDPR is about. It's about storing personal data. If we want to do it, we need permission. As this is not the case, no permission is needed. As you already state; how the hell would any website be able to function ever again?
I'm obviously not up to date with the specifics regarding privacy laws in Norway, but I very much doubt that we are violating those. If we were, any and all websites or automatic update checks would be in violation.
So as far as I'm concerned we are both morally and legally in the clear, which as far as I'm aware has been discussed with lawyers who know far more about this than me.
All that being said; The button does indeed not do what it needs to do, which should be changed. As far as legal goes; We don't need the permission in the first place. It's a courtesy to ask it in the first place. Morally speaking you want to explicitly mention that it's going on and provide instructions how to disable it if you don't want to. We do this already, but not as well as we can (and should therefore improve on it). Burning all old data is just silly and unnecessary.
So in order to ensure that the button does what it claims it does; 18551a4a72cb037e83f105e4838349d981463e70
To further argue the point. See the below definition what is defined as personal data. Note that personal data is the only thing that the GDPR protects.
"Defining “personal data” under GDPR GDPR defines personal data broadly as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.” Under this definition, nearly ALL information about a EU resident is personal data–including, for example, names, ages, Social Security Numbers, email addresses, online identifiers and location data, IP addresses and mobile device IDs, cookies, and also more sensitive personal data such as genetic data and biometric data, including fingerprints, facial recognition and retinal scans."
As the data we collect is not traceable to a natural person (and as long as we keep it that way), it's also not subject to GDPR. If we ever decide to become crooks and do make it traceable, we would be liable (and thus, if it was ever found out, could be forced to pay tons and tons of money).
I am glad GDPR is getting some more attention here. A key question to ask is if it is required to be able to do a 3D print (aka use Cura) to send information that can be attributed a natural person. As far as I can tell, it isn't.
I especially like the huge penalties associated with GDPR. If I got it right, it can be up to four percent of gross revenue upwards limited to 20 000 000 EUR.
-- Happy hacking Petter Reinholdtsen
It's required to send a response back, which is the only purpose for which we keep it. As for the penalties, you are right.
Don't get me wrong, I am a huge proponent of very strict privacy regulations. But we are adhering to them and then some.
We also "get" your IP from the fact that we do an update check, which technically isn't required to 3D print. If that is the "border", no company would be able to provide services whatsoever.
Indeed. The only reason we temporarily get the IP address is because we use the HTTP protocol (like the entire Internet), which contains origin headers containing data like this. But instead of storing that IP address, we strip the last set of digits so we only store data that is not personally identifiable, but rather on a per-country basis. For this you'll have to believe us on our pretty blue eyes, but as @petterreinholdtsen and @nallath indicate, the fines are astronomical, so companies will think twice before breaking these rules.
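As an illustration of "strip the last set of digits", here is a minimal sketch using Python's standard `ipaddress` module. The function name `truncate_ip` and the prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions; the thread does not say how Ultimaker's server actually does this, and the sketch does not settle whether a truncated address counts as anonymous in a legal sense.

```python
import ipaddress


def truncate_ip(address, v4_prefix=24, v6_prefix=48):
    """Zero the host part of an IP address and return the rest as a string.

    With the default /24, an IPv4 address loses its last octet.
    The prefix lengths are illustrative defaults, not a known policy.
    """
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets ip_network accept an address with host bits set
    # and masks them off for us.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    print(truncate_ip("203.0.113.77"))           # 203.0.113.0
    print(truncate_ip("2001:db8:1234:5678::1"))  # 2001:db8:1234::
```

The truncation would have to happen in the request handler, before anything is logged or written to a database, for the "never stored" claim to hold.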
I really don't see why this issue should remain re-opened....
As of @nallath's change I think we can consider it fixed and closed as regardless of the reading of the law, no one could claim that the button doesn't do what it says.
The reason it's come up again was because by coincidence @Ghostkeeper pointed out that we need to come up with a decision about whether or not to change things, and 10 minutes later, unaware of his reopening of this thread, I dug up the other one from the oldest page of issues because GDPR had been on my mind thanks to Wired and NYTimes articles about it lately.
Cura team discussed a bit today about what to do with old versions as this change would not be in older versions which we still distribute. Assuming @nallath is correct that as long as we don't save the logs, it's kosher, then there shouldn't be any need to "burn" old data.
And yeah, at this point I really will defer to @nallath on this one. I've only speed-read through GDPR and used the FAQ to double-check the IP question. They say:
What constitutes personal data? Any information related to a natural person or ‘Data Subject’, that can be used to directly or indirectly identify the person. It can be anything from a name, a photo, an email address, bank details, posts on social networking websites, medical information, or a computer IP address.
But yeah. Never mind.
Thank you for your rationale and interpretation of the coming GDPR rules.
I find it interesting to see how semantics (just receiving, not storing; collecting, but not keeping) is central to the justification of collecting information about its users from the Cura 3D slicer software. I wonder how the "solid defence" about only keeping the IP address long enough to send the HTTP response back will hold when it is demonstrated that the Cura 3D slicer software works perfectly fine without sending any information to its developers. As far as I can tell, this undermines the "need information as a part of providing a service" argument, as the service is running the Cura program, and the information submitted is not required for this to work.
@petterreinholdtsen On the other hand, a service provider is free to offer other services in addition to their main product. And users are free to make use of these services. Assuming that Ultimaker does something useful with the collected data (to the users, that is), I don't think it's fair to bar them from offering a telemetry feature. It's up to Ultimaker to justify what that use is. This is clearly part of the "conscious consent" provision.
This contrasts starkly with what Microsoft, Google and mobile app developers are doing, where telemetry is an inherent part of their products - while in many cases providing zero value to customers or failing to communicate that value clearly. I think the GDPR will become very, very relevant for these companies soon. On the other hand, user's data is part of the revenue stream, so one might argue that the product is "sold" in exchange for personal information. I don't think this will hold for products that cost money, however.
@petterreinholdtsen you have no idea how important data collection is for us in order to improve the software. Without it, we wouldn't be able to develop Cura in this rapid pace. So yeah, it is quite essential for us as a company to do this. That's why most software companies do this.
And again, temporarily having the IP address as part of the HTTP protocol (aka the f*cking Internet) is TOTALLY ALLOWED under GDPR. I'm doing it all caps here because no one seems to read...
And again, WE'RE NOT STORING ANY USER IDENTIFIABLE DATA (not even the full IP address), and nothing gets sold or sent to 3rd parties; it's 100% for improving the software. So we can debate here all we want, but we obviously checked this with legal counsel.
People are confusing "What data are we allowed to receive" with "What data are we allowed to store".
GDPR doesn't concern itself with the first. It concerns itself with storing data. So one could argue that we don't need the request with the data to function (and therefore don't need the IP). But that's a moot point. We don't keep the data. That's what this is about. That's also the spirit of the law, imho: we don't know what person A made. We don't know how they did it. We just know that someone did something. That's why it's not personal data but just generic data.
So yeah, the IP is personal data. That's why we don't store it. Storing it would require permission (and wouldn't make it anonymous anymore).
I wonder how effective this legislation is.
Firstly, companies can gather all sorts of data from users if they agree to it. And they all too easily do. On a recent trip lots of people told me I should use a certain app to find campsites. But for that I had to give Google permission to use images from my smartphone. Those I told about this didn't know they had agreed to that. And I probably also agreed to lots of things on other occasions I wouldn't have agreed to if I had bothered to read the disclaimer. As I understand it, the GDPR still allows this. And one can get a user to agree to lots of things if that is needed for added functionality.
Secondly, if a user doesn't opt in, there's also data mining. Ultimaker can gather SliceInfo, which if extensive enough could be enough to uniquely identify users. Maybe not yet, but how far can one go with this? Imagine printers (or whatever) will be extremely modular in the future, something that 3D printing can make possible (and of course modules can be personalised). That might make a lot of equipment unique, and gathering info about the modules (and which variations on them) would of course make a lot of sense. Does the legislation anticipate this?
I get the impression that this discussion is going nowhere. Ultimately, it is up to Ultimaker to decide if and how they will comply with GDPR. The community can give input on how they think the user experience should be, but any legal questions should be answered by a legal counsel, or at least by those actually responsible (i.e. Ultimaker's management board). A public bug tracker is hardly the right place.
My expectations as a user of Cura are as such:
Point 1. should already be covered by the first-run message box. Point 2. is a bit lacking - I don't see a link or other UI element that shows this information. Point 3. is there, but it's not presented in a very user-friendly manner.
My suggestion would be to improve this box; first by adding a link or button to the data collection policy (I assume it already exists somewhere). Secondly, there should be two buttons, one to enable SliceInfo and one to disable it. No data should be sent until this question is answered. To make the option more palatable to users, the message should be formulated in an encouraging way, while also explaining clearly that no personally identifiable information is stored anywhere.
Something like:
To improve Cura further, it is very important to us that we know how it is being used. For this reason, we are collecting fully anonymised slicing statistics on a server connected to the internet. The data we collect comprises a unique identifier calculated from the printed objects as well as the printing parameters. No personal information is stored at any point. Our full data collection policy can be found here. Cura can be fully used without enabling these statistics, but we would be very happy if you would provide us with them. If you change your mind, you can always toggle the setting here. [ Accept ] [ Decline ]
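The rule "no data should be sent until this question is answered" can be sketched as a three-state consent flag that gates every upload. The names `Consent` and `StatisticsSender`, and the idea of passing the upload function in as a callable, are hypothetical and chosen only for this example; this is not how Cura's plugin is structured.

```python
from enum import Enum


class Consent(Enum):
    UNANSWERED = "unanswered"  # first run: the dialog has not been answered yet
    ACCEPTED = "accepted"
    DECLINED = "declined"


class StatisticsSender:
    """Forward records to `transport` only after an explicit Accept."""

    def __init__(self, transport):
        self._transport = transport  # any callable that takes one record
        self.consent = Consent.UNANSWERED

    def answer(self, accepted):
        """Record the user's choice from the [ Accept ] / [ Decline ] buttons."""
        self.consent = Consent.ACCEPTED if accepted else Consent.DECLINED

    def send(self, record):
        """Send the record if consent was given; return whether it was sent."""
        if self.consent is not Consent.ACCEPTED:
            return False  # unanswered and declined are both treated as "no"
        self._transport(record)
        return True


if __name__ == "__main__":
    outbox = []
    sender = StatisticsSender(outbox.append)
    print(sender.send({"layer_height": 0.1}))  # False: nothing answered yet
    sender.answer(True)
    print(sender.send({"layer_height": 0.1}))  # True: user accepted
    print(outbox)
```

Treating "unanswered" the same as "declined" is what distinguishes this from a pre-ticked default: silence never results in data being sent.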
We are going to add the "This is the data that is being collected". It was one of those things that was on the radar, but never quite got the attention that it deserves. This discussion has given it a lot higher priority.
As for the point that the data could, potentially, be used someday to identify a user. If this is ever the case, there is only one thing that Ultimaker can legally do with that information; Delete it. If we would not, we would be in violation of the GDPR.
Also note that "giving permission to access something" is not the same as giving permission to "send it somewhere and use it for data mining purposes". The GDPR is rather specific about it: for every type of use, a specific agreement must be given by the user. So if you give the application permission to analyse a picture to, say, detect if a camping tent is present, that is the only thing the company is allowed to do with it. If they were to add facial recognition (or use it on the picture), they would need separate permission for this.
Our master branch now has the following dialogue with more information: The button in the notification at first launch now causes that dialogue to pop up, and the preferences dialogue still has the checkbox but has a button next to it that shows this dialogue again.
This is similar to what Onitake suggested and I'm okay with this solution. You think this is sufficient to end the discussion?
IMO this should already be closed as we discussed it and made a decision, and legally didn't need to change anything in the first place.
@Ghostkeeper Hmm... I think it's best to hide the example JSON blob behind a folding arrow.
Many users simply won't care about the exact data content or might even be put off by the wall of text. Those that care can click on the arrow to show the data.
Is that reasonable?
It's GDPR day!
Since this has been addressed I am going to consider this issue solved. Legally, everything is kosher, and within that, too much or not enough info is going to be subjective.
I think this one can be laid to rest.
According to #750, this issue has been raised before, but was only partially addressed.
In its current version, Cura tells the user on first launch that Ultimaker is collecting anonymous slicing statistics and that this can be turned off in the settings.
While this is apparently better than before (where data would be collected without the user's consent), it may still be problematic for some people. If possible, there should be a link in the dialogue that allows disabling data collection immediately. The text doesn't even mention where said option is.
Personally, I would find it preferable to make this option opt-in instead of opt-out by changing the default to false and adapting the text accordingly.
From the feedback I received from fellow Debian maintainers, it seems like leaving this option as-is is not acceptable and will have to be changed for the Debian releases. We are currently discussing if the plugin should be removed altogether (which, I think, is unfair towards Ultimaker) or if we should just change the data collection default to False.