dotnet / sdk

Core functionality needed to create .NET Core projects, that is shared between Visual Studio and CLI
https://dot.net/core
MIT License
2.65k stars 1.06k forks

.NET core should not SPY on users by default #6145

Closed (ghost closed 4 years ago)

ghost commented 8 years ago

@blackdwarf @piotrMSFT I am very disappointed to discover that .NET core comes with a hidden and enabled spy utility that reports on its users. (Lakshanf/issue2066/telemetry dotnet/cli#2145). Apparently, MS has learned nothing from the backlash against Windows 10 spying on users. I suspect many will not want to install .NET core for this reason, which is a shame because .NET core is otherwise cool.

4creators commented 6 years ago

The telemetry issue becomes really serious once your project touches any kind of protected data. Just receiving the IP addresses of machines on which the CLI is running violates privacy rules, as an IP address is regarded in most EU countries as personal data: it may allow identification of an individual. Obviously one may turn it off after install, but since it is on by default during install, it violates active opt-in rules.

During installation, information on CLI telemetry is presented to users, but that is not enough according to European Union regulations. If someone wants to collect any sort of data that is not related to the basic functionality of the provided software, it is required by law to present the user with an active opt-in choice, with the default setting during user interaction being opted out. So no action from the user always means "no telemetry allowed". This upholds the rule of conscious user consent, given after presenting an unequivocal choice whose default is no consent.

Microsoft entirely unnecessarily exposes itself to legal risks in at least all EU jurisdictions ... the fact that nothing has happened in this area so far is no guarantee that it will not happen in the future. Just learn from Facebook's problems ....

Please take it as a very friendly comment ... like many others in this thread I have invested years in MSFT technologies and would like to see them succeed.

4creators commented 6 years ago

There is an issue in the dotnet/source-build repo that is important from the perspective of this thread, "What is the expected usage of source-built assets?", which indicates that it will be possible to build every non-MSFT distro entirely from source (no binary blobs, even for tools) and to patch offline builds, which should otherwise be binary compatible with the MSFT build. This may solve the telemetry issue in general, as telemetry can be removed from the CLI code.

b3nt0 commented 6 years ago

Maybe what MS should do is build a telemetry manager that all of their tools plug into. Then users would have a central place to opt-in or out of telemetry collection for an entire system.

The people complaining about this are probably the same people complaining about telemetry collection in every product that performs that function. If you centralize it, provide a way for people to inspect what is being sent back, and give it a big "OFF" switch, you could turn this whole story around.

It would probably also be a good idea to have a telemetry manager to control and protect the server end points that are taking data. Right now every team at MS is rolling their own system for collecting the data. Someone will end up getting that wrong and there will be a data breach. Formalizing your approach to collecting data on the "client" and sending to back end systems for reporting will protect MS as well as other entities.

How do you do that on Linux? That's really touchy. You could work with the distro vendors, and start an OSS project for the telemetry manager component. In the end, you are probably only going to be able to collect telemetry if you are VERY explicit about it during install on Linux systems. The overlap of privacy maximalists and Linux users is much higher than with Windows.

It's just going to keep coming up, and the "we told you so" moment is coming. You only have to get it wrong once.

migueldeicaza commented 6 years ago

Mhm I have never heard of mono-sgen64 calling mixpanel. This must be some third party library, not the runtime.

Got some repro step that shows this?

realityexists commented 6 years ago

@b3nt0 Even if Microsoft implements a tool they can re-use to collect telemetry data across all products, I just don't see them building a client tool that makes it easy for the user to opt out for the simple reason that it's not in Microsoft's interest to do so. It's in their interest to make it possible to opt out, so the product can be used even in scenarios where telemetry is unacceptable, but it's not in their interest to make it easy.

markrendle commented 6 years ago

I'm working on a project related to this sort of thing (not .NET specific), and I wondered if the more vocal anti-telemetry posters in this thread could maybe explain to me exactly what concerns them about this type of telemetry collection.

Genuine question, and no personally identifiable data will be collected :wink:

nathan-alden-inl commented 6 years ago

It's not the telemetry itself, it's:

  1. the default being opt-out, not opt-in.
  2. their stubbornness against making it opt-in.
  3. their attempts to disguise their true motives.

I think some people are simply making a stand against data collection because it seems corporate culture is hellbent on it at all costs. There's a bigger battle being fought, and this is just one skirmish on the edge of that battle.

cznic commented 6 years ago

Genuine question, and no personally identifiable data will be collected :wink:

Because telemetry always collects personal data, unless it calls "home" through some IP anonymizer service, which I doubt is the case.

Disclaimer: I don't use, nor do I think I ever will, any MS SW. This thread shows just one of the reasons.

Disclaimer2: I do allow opt-in telemetry by some companies I do trust, when they ask nicely for my opt-in and I think it can be indirectly helpful also for myself.

markrendle commented 6 years ago

@nathan-alden-inl But there is obviously some underlying "telemetry considered harmful" attitude, otherwise there would be no problem with it being on by default. There are plenty of things in life that are on-by-default but can be turned off - like the Electronic Stability Control in my car. I'm interested in what people consider to be so bad about seemingly innocuous telemetry that means it should be opt-in.

markrendle commented 6 years ago

@cznic Why do you have a problem with your IP address being collected? Do you have a problem with it being attached to all your posts in this thread?

cznic commented 6 years ago

Why do you have a problem with your IP address being collected? Do you have a problem with it being attached to all your posts in this thread?

No, why should I? And if I had a problem with it, I would hide it.

It depends on what one is doing and who collects the data. The latter especially is significant. I share much more than just an IP on Github, they also have my credit card info, for example. I consider Github trustworthy, no bad record I know of. MS? Not so much. (OK, TBH, 0% trustworthiness in my personal opinion.)

dotaheor commented 6 years ago

@markrendle it is simple. dotnet cli telemetry collects data I'm not aware of, but I am aware that a client-server app will get my ip. I HAVE TO provide my ip to use a client-server app, otherwise I can't use the client-server app. However, it is TOTALLY NOT essential to provide my ip to dotnet cli.

And I don't consider telemetry harmful, I just think telemetry should respect users.

markrendle commented 6 years ago

@dotaheor What is wrong with a server getting your IP address?

dotaheor commented 6 years ago

@markrendle it is no problem for a server to get my ip. It is just that the server should make me aware that it gets my ip.

markrendle commented 6 years ago

@cznic So for you it is important to easily determine who may and may not collect data at a "corporate entity" level?

markrendle commented 6 years ago

@dotaheor So if the data could be sent in a way which did not reveal your IP address to the telemeter, you wouldn't mind?

cznic commented 6 years ago

So for you it is important to easily determine who may and may not collect data at a "corporate entity" level?

I see where you are trying to lead me ;-)

Recent illustration: https://betanews.com/2017/10/09/cortana-skype/

migueldeicaza commented 6 years ago

I looked into this, it is not mono-sgen64 that is calling into MixPanel, it is Visual Studio for Mac that does. To disable this, go into Preferences -> Other -> Feedback and click to disable the option.

dotaheor commented 6 years ago

@markrendle

So if the data could be sent in a way which did not reveal your IP address to the telemeter, you wouldn't mind?

You must be kidding. The "ip" we are talking about here is just an example of private data.

markrendle commented 6 years ago

@cznic Not trying to lead you anywhere, just doing some research. Thank you for your answers, very helpful.

dotaheor commented 6 years ago

@markrendle to state it again: I don't mind my private data being collected, but I must be aware of which of my private data is collected and how it is collected.

markrendle commented 6 years ago

@dotaheor OK, thanks, I think I get where you're coming from.

jkoberg commented 6 years ago

@markrendle did you miss the comments from the poster whose .NET-based work on a HIPAA (US health privacy law) regulated project is now cancelled, because they can't permit tools that call home?

"I can't think of why it might be bad" is not a sufficient reason; it's an argument from ignorance. Trust that it takes the tool out of consideration in high-security environments, and not for personal/moral reasons.

markrendle commented 6 years ago

@jkoberg I saw them; I'm currently working in a finance org where the proxy blocks calls to a lot of telemetry servers (and many others: e.g. I can't install VS Code extensions through the application itself) for similar reasons. I appreciate that there are environments where any leakage of any data is in violation of regulations, and that demands something other than an opt-in/opt-out preference offered to users at installation time. It's also why the SDK collects telemetry, but the runtime-only dotnet that is to be deployed into production environments does not; production systems obviously need to be much more locked down than developer workstations.

My interest is in the individuals who instinctively resent the collection of what many would consider to be harmless and potentially beneficial telemetry data from their own environments.

svick commented 6 years ago

@dotaheor Do you find the notice that's shown in the installer and before the first dotnet command is run insufficient? If so, what else can be done to make you better aware of it?

dotaheor commented 6 years ago

@svick for me, it is not a big problem, for I will set the DOTNET_CLI_TELEMETRY_OPTOUT anyway.

However, this project should state a requirement in its license that third-party tools which include the cli must also present the notice you mentioned to users.

Besides the above point, I think the data list below which will be collected by this cli is unnecessary and unpleasant:

iigorr commented 6 years ago

The topic is trending on hacker news right now. https://news.ycombinator.com/item?id=15439001 Stupid decisions which ruin great projects... great job.

benaadams commented 6 years ago

My understanding, from reading the source, of what is happening

The runtime itself does not send telemetry, so you shouldn't be sending any telemetry in production, regardless of the environment setting, if you use the runtime package and run your pre-built app with the shared runtime by doing:

dotnet my.dll

This is also confirmed as the intention from the documentation

Telemetry isn't enabled when using the dotnet command itself, with no command attached:

  • dotnet
  • dotnet [path-to-app]

Equally a self-contained deployment will produce an executable file that requires nothing to be installed on the target machine and will not send any telemetry when running that executable.

The sdk does send telemetry; this is also confirmed as the intention from the documentation

Telemetry is enabled when using the .NET Core CLI commands, such as:

  • dotnet build
  • dotnet pack
  • dotnet restore
  • dotnet run

The first install/run at command line that prints the message about DOTNET_CLI_TELEMETRY_OPTOUT uses the FirstTimeUseNoticeSentinel which does not send any telemetry if it also is printing the message; so you are warned prior to it sending telemetry.

It does not send any telemetry after that if DOTNET_CLI_TELEMETRY_OPTOUT is set to one of true, 1, yes (case insensitive)
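The opt-out check described above can be sketched roughly as follows. This is an illustrative Python sketch and not the CLI's actual code; the function name `telemetry_opted_out` is hypothetical, and the accepted values (true, 1, yes, compared case-insensitively) are taken only from the description in this comment.

```python
import os

# Values treated as "opted out", per the comment above (compared case-insensitively).
OPT_OUT_VALUES = {"true", "1", "yes"}

def telemetry_opted_out(environ=None):
    """Return True if DOTNET_CLI_TELEMETRY_OPTOUT holds an opt-out value.

    `environ` is any mapping; it defaults to the real process environment.
    """
    env = os.environ if environ is None else environ
    value = env.get("DOTNET_CLI_TELEMETRY_OPTOUT", "")
    return value.lower() in OPT_OUT_VALUES

if __name__ == "__main__":
    print(telemetry_opted_out({"DOTNET_CLI_TELEMETRY_OPTOUT": "TRUE"}))
    print(telemetry_opted_out({"DOTNET_CLI_TELEMETRY_OPTOUT": "0"}))
    print(telemetry_opted_out({}))
```

Note that under this reading an unset variable, or any other value such as 0, leaves telemetry enabled.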

The HIPAA example @sushihangover gave compiles programs in production so it is using the full sdk rather than just the runtime.

The workarounds to the telemetry when using the sdk (build/run etc) as I understand them are:

Not sure what the Windows desktop installer does first time; there is an IsDotnetBeingInvokedFromNativeInstaller path which does something different

tl;dr

If you receive the message

Welcome to .NET Core!
---------------------
Learn more about .NET Core: https://aka.ms/dotnet-docs
Use 'dotnet --help' to see available commands or visit: https://aka.ms/dotnet-cli-docs

Telemetry
---------
The .NET Core tools collect usage data in order to help us improve your experience. The data is anonymous and doesn't include command-line arguments. The data is collected by Microsoft and shared with the community. You can opt-out of telemetry by setting the DOTNET_CLI_TELEMETRY_OPTOUT environment variable to '1' or 'true' using your favorite shell.

Read more about .NET Core CLI Tools telemetry: https://aka.ms/dotnet-cli-telemetry

That use will not report any telemetry; but it will add a local sentinel file to mark that it has warned you
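The first-run behaviour described above (print the notice, write a sentinel file, send nothing on that run) can be sketched as below. This is a Python illustration under stated assumptions, not the CLI's implementation: the function name, the notice text, and the sentinel file name are all hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical short notice text; the real message is quoted in the comment above.
NOTICE = "Telemetry: set DOTNET_CLI_TELEMETRY_OPTOUT to '1' or 'true' to opt out."

def should_send_telemetry(sentinel: Path, opted_out: bool) -> bool:
    """Decide whether this invocation may send telemetry.

    First run (no sentinel yet): show the notice, create the sentinel,
    and send nothing. Later runs: send only if the user has not opted out.
    """
    if not sentinel.exists():
        print(NOTICE)
        sentinel.parent.mkdir(parents=True, exist_ok=True)
        sentinel.touch()
        return False
    return not opted_out

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sentinel = Path(tmp) / "first-use.sentinel"  # hypothetical file name
        print(should_send_telemetry(sentinel, opted_out=False))  # first run
        print(should_send_telemetry(sentinel, opted_out=False))  # later run
        print(should_send_telemetry(sentinel, opted_out=True))   # opted out
```

In this sketch the sentinel only records that the notice was shown; it does not store the user's choice, which still comes from the environment variable.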

Running your app directly will not send any telemetry e.g.

dotnet [path-to-app] rather than doing dotnet run

This is what you should be doing, generally, in production anyway, rather than using the sdk commands

dotaheor commented 6 years ago

@benaadams can you confirm whether setting the environment variable takes effect for the currently open terminal session or not?

In my honest opinion, the first time the cli is run, it should present a [Y/n] choice to the user, instead of asking users to set an environment variable. If you ask users to set an environment variable, it will only take effect the next time a terminal is opened.

And the user's choice should be saved in a .dotnet.conf-like file instead. That file should be the same file that records whether or not the user has made a choice. Later, the user may delete the environment variable, but the .dotnet.conf file would still record that the user has made a choice.

benaadams commented 6 years ago

@dotaheor for bash in linux for current session

export DOTNET_CLI_TELEMETRY_OPTOUT="1" and add same line to ~/.bashrc for future sessions; or whatever's appropriate for shell you use (~/.profile for terminals on Ubuntu desktop or equivalent for your distro)

Windows cmd
Current session: set DOTNET_CLI_TELEMETRY_OPTOUT=1
Future sessions: This PC -> Properties -> Advanced -> Environment Variables -> New User variable (Or System Variable)
Or IT admin via group policies etc

Don't know how macOS works

I assume environment variables are an easy thing to explain and get working cross platform. User home directory is more complex to explain cross platform - could always do both though

dotaheor commented 6 years ago

@benaadams

for bash in linux for current session

export DOTNET_CLI_TELEMETRY_OPTOUT="1" and add same line to ~/.bashrc for future sessions; or whatever's appropriate for shell you use ~/.profile for terminals on Ubuntu desktop or equivalent for your distro

I know this. But this doesn't take effect in terminals that are already open.
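The point about already open terminals follows from how environments work: a process receives a copy of its environment when it starts, so a later change elsewhere does not reach it. A minimal Python sketch of that behaviour, where a child process stands in for the open terminal (the helper names are hypothetical):

```python
import os
import subprocess
import sys

VAR = "DOTNET_CLI_TELEMETRY_OPTOUT"

# Child program: wait for a line on stdin, then report what it sees for VAR.
CHILD_CODE = (
    "import os, sys; sys.stdin.readline(); "
    f"print(os.environ.get({VAR!r}, '<unset>'))"
)

def _spawn(env):
    """Start a child Python process with the given environment mapping."""
    return subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        env=env, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )

def already_running_process_value():
    """The variable is set only after the child has started."""
    env = dict(os.environ)
    env.pop(VAR, None)
    proc = _spawn(env)      # plays the role of the already open terminal
    env[VAR] = "1"          # the "export" happens after it started
    out, _ = proc.communicate("go\n")
    return out.strip()

def new_process_value():
    """The variable is set before the child starts, so it is inherited."""
    env = dict(os.environ)
    env[VAR] = "1"
    proc = _spawn(env)
    out, _ = proc.communicate("go\n")
    return out.strip()

if __name__ == "__main__":
    print("already running:", already_running_process_value())
    print("started afterwards:", new_process_value())
```

This is why adding the line to ~/.bashrc only helps sessions opened later, and why the export has to be repeated in each terminal that is already open.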

I assume environment variables are an easy thing to explain and get working cross platform. User home directory is more complex to explain cross platform - could always do both though

There is already a file that records whether or not the user has made a choice, so it is not a bad idea to also save the user's choice there.

kspeakman commented 6 years ago

@markrendle I have a question for you. MS used to have opt-in telemetry for nearly all products. But in the last few years it has switched to opt-out for basically everything. Why?

You should also review the whole thread for other reasons why people don't like opt-out. It's like taking pictures and video of me and my home without asking. It's like reading someone's text messages over their shoulder. It's creepy. And your promise of how the info is used today means nothing tomorrow.

kstarikov commented 6 years ago

I don't think I'll be able to use .NET Core at work until telemetry becomes opt-in.

oaiey commented 6 years ago

Thanks @benaadams for the audit.

Regarding the HIPAA violation of @sushihangover: I work for a healthcare business. We have 5-7 development stacks, one of them .NET Core. This issue is mitigated very easily and would never give us any trouble. If the customer jumps off a platform that easy he either does not understand much about software engineering, is ill advised or (worst case) has no idea about how to run a business in a regulated environment.

cznic commented 6 years ago

If the customer jumps off a platform that easy he either does not understand much about software engineering, is ill advised or (worst case) has no idea about how to run a business in a regulated environment.

If a customer jumps on a spyware infested platform that easy she either does not understand much about software engineering, is ill advised or (worst case) has no idea about how to run a business in a regulated environment.

voronoipotato commented 6 years ago

Let's be clear it's an obvious breach of trust, and anyone working in environments with sensitive data should avoid dotnet core arguably indefinitely. @oaiey it is better to be safe than to be sorry, i'm glad you trust microsoft with your wellbeing but I can't always read every check in to make sure they aren't sliding in telemetry that likely increases the attack surface.

@kstarikov Same, and I'm glad I had only played with it at home.

voronoipotato commented 6 years ago

@markrendle "But there is obviously some underlying "telemetry considered harmful" attitude, otherwise there would be no problem with it being on by default."

You have a clear issue understanding consent. Forcing me to drink tea, or trying to pour it in my mouth when I'm sleeping does not mean I object to tea. It means I object to microsoft being slimy and creepy around consent and I will use other products/vendors when they do this. The difference between this and say traction control in a car is it's something I have, that you want. If you want something you have to ask for it, you don't get to just take it.

migueldeicaza commented 6 years ago

@sushihangover On MixPanel, we looked into this, and while we used it a few years ago, we do not see traces of it on the source code, or the binaries. We are puzzled as to where this might be coming from, and wondering if perhaps you have some third-party add-ins installed, or some external tool and perhaps those are using it?

Would love to know how you triggered this.

svick commented 6 years ago

@voronoipotato

When you go to a concert, do you have to consent to being recorded by a clearly visible camera as a face in the crowd? I think the situation here is similar to that.

You can decide that this kind of recording of your actions is not okay for you and not use a product that does it. But I don't think it's as black & white as you make it sound, that it's Microsoft taking something from you without your consent.

ghost commented 6 years ago

Hashed MAC address: a cryptographically (SHA256) anonymous and unique ID for a machine. This metric is not published.

And how many minutes does it take to build a table for MAC or IPv4 addresses (which are used almost everywhere.. 80% of all internet users)?
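The concern about building a table can be illustrated with a small sketch. A MAC address has 48 bits, and the first 24 bits identify the vendor, so for a known vendor prefix only 2^24 (about 16.7 million) suffixes need to be hashed. The sketch below ASSUMES an unsalted SHA-256 over the upper-case, colon-separated text form of the address; the thread does not state how the CLI actually encodes or hashes the MAC, and the vendor prefix used here is made up.

```python
import hashlib

def mac_hash(mac: str) -> str:
    # ASSUMPTION: unsalted SHA-256 over the text form of the address.
    return hashlib.sha256(mac.encode("ascii")).hexdigest()

def format_mac(oui: int, suffix: int) -> str:
    """Build 'AA:BB:CC:DD:EE:FF' from a 24-bit vendor prefix and a 24-bit suffix."""
    raw = ((oui << 24) | suffix).to_bytes(6, "big")
    return ":".join(f"{b:02X}" for b in raw)

def recover_mac(target_hash: str, oui: int, limit: int = 2**24):
    """Try suffixes 0..limit-1 for one vendor prefix; return the match or None."""
    for suffix in range(limit):
        candidate = format_mac(oui, suffix)
        if mac_hash(candidate) == target_hash:
            return candidate
    return None

if __name__ == "__main__":
    oui = 0x001122                        # hypothetical vendor prefix
    secret = format_mac(oui, 0x00ABCD)    # the address we pretend not to know
    print(recover_mac(mac_hash(secret), oui))
    print(f"all MACs: {2**48:,} candidates; one vendor: {2**24:,} candidates")
```

Whether such a search is practical against the real telemetry data depends on details not given in this thread, such as the exact input format and whether any salt is used.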

Three octet IP address used to determine geographical location†

That is enough to get the subnet... and then target the machine with the help of the MAC...
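To make the "three octet" point concrete, here is a sketch that assumes the truncation simply zeroes the last octet of an IPv4 address (the thread does not say how the CLI does it). The function names are hypothetical and the sample address is from a documentation range.

```python
import ipaddress

def truncate_to_three_octets(ip: str) -> str:
    """Keep the first three octets of an IPv4 address and zero the last one."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

def addresses_sharing_prefix(ip: str) -> int:
    """Number of IPv4 addresses that share the same first three octets."""
    return ipaddress.ip_network(f"{ip}/24", strict=False).num_addresses

if __name__ == "__main__":
    sample = "203.0.113.77"  # documentation-range address, used as an example
    print(truncate_to_three_octets(sample))
    print(addresses_sharing_prefix(sample))
```

Under that assumption the stored value identifies a /24 block of 256 addresses, which is the subnet the comment refers to.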

Microsoft is kidding us... :)

When you go to a concert, do you have to consent to being recorded by a clearly visible camera as a face in the crowd?

You can compare a concert with a platform in which software developers and product vendors worldwide will invest millions/billions of dollars... hmm.

I think when you are at a concert you know where you are... even when you are in the street you know where you are. And I personally do not like that.. but it happens.. look here:

https://www.youtube.com/watch?v=Doqg1eCQieo

But .net core is a tool... and even a small breach of privacy is a drawback of this tool. It is just a fact. What do you want to say with your posts? That privacy is not important? I have a different opinion, and big data examples confirm my thoughts very well.

oaiey commented 6 years ago

@voronoipotato trust is a valid point. However, I trust their commercial interest a lot that they do not do bullshit. A serious HIPAA violation would be e.g. access to patient data. But we are not talking about that here.

Also regarding trust: whom should I trust? The Node platform with npm and their packages with anonymous authors and missing public repos? The Java platform with their installer? That leaves C++ and doing everything on my own with high costs. But that is financially not reasonable for e.g. a genome analyzer. For an insulin pump, yes.

It is not about trust. It is about being unprofessional and delivering without knowing what you deliver. In its current state, the telemetry in the .NET Core SDK is well known and understood.

ghost commented 6 years ago

So, it is the case that after a year+ of this going on, that literally nothing of substance has come from gathering this telemetry (go ahead and argue with me on that one, but I will vehemently disagree and preemptively argue that you are just adorning something useless with bells; polishing a turd, if you will). I think it is time for it to go as it is just a leveraging point against dotnet, and does nothing to accomplish the real objective: wide-spread adoption of the CLI. In fact, it can be argued that it is now inhibiting this outcome.

Maybe other CLIs do it, and maybe there is a vibe of irrationality with regards to the arguments waving the banner of privacy (and we know Microsoft is not a big fan of privacy, if you want this opinion to change then demonstration by action is a great move), but no matter: It isn't actually accomplishing anything other than to distract people from real discussions about real possible features for the dotnet CLI. Or maybe that is the objective in itself: No such thing as bad press? That's just stupid PR if that's the case. Unless the objective is to lower your public perception so much so that no matter what step you take next there is only up; this is also a pretty distressing idea to consider.

Anyways, if developers come to you with their opinions, you should be more than eager to oblige, and you should be operating from the perspective to adopt as many users as possible. Microsoft has not needed telemetry in the past to produce good products like Visual Studio (maybe I am wrong?) and so I struggle to see why you need it now. Good experiences are not built from telemetry, they are intuited by experienced designers or developed thoughtfully and iteratively. See the Tesla models, or the iPods, or even your very own Surface Book.

I don't generally think that telemetry is harmful, but in this instance I do believe that it is harmful to the perception of the CLI. Additionally, there are people literally telling you: "I wanted to use this, but I could not because of the telemetry feature"; what a stupid reason to lose a customer over. There are many different avenues customers can be driven through to attract the same information that is being gathered here, normally opt-in related which is actually what you want in your telemetry anyways. Because, let's be honest: you don't actually care about the normal behavior, you want to hear about the joy points, and the pain points, and such responses are only elicited in the extrema. So long as you facilitate the reporting of bugs (does the CLI gather bug reports?), or opinion (does the CLI have opinion gathering mechanisms?) or so long as you peruse the internet/read GitHub, you are going to be OK.

Maybe now is a good time to be reminded: You do not exist without your customers. Spend a few cycles building things that will cultivate trust and open communication. Think outside the box for a bit, the "me too" behaviors are disheartening.

olejorgensen commented 6 years ago

Windows 10 is the Miss Piggy Edition of Windows - under all that lipstick it's still just Windows. Kind of fitting that .NET Core also is the Miss Piggy Edition of .NET's

chrisjsmith commented 6 years ago

@oaiey I think the JVM is clearly the future. You don't need the Oracle installer. It's mature, cross platform, has a wealth of mature libraries, tools and integrations and the build and deployment infrastructure is done and dusted. You don't have to use Java if you don't want to either. It was the birth place of a lot of the design methodologies and bits of infrastructure that are crudely copied on the CLR.

I'm really wishing I took the blue pill back in 2002 and went down the Java route at this point. The red pill was a bad choice.

To be clear, a lot of people are looking at HIPAA compliance as "for use in medical devices". This isn't the case. PII and control of data is the problem. Telemetry is fundamentally incompatible with that as it decreases the signal-to-noise ratio.

Another analogy: You don't design a secure system with the starting point of a sieve and try and fill all the holes with environment variables. You start with something that is locked down and open what you need. The problem is that telemetry uses the same protocols and ports as the legitimate and controlled data which adds noise and risk because every communication needs to be understood. That control has to happen in every part of the environment from the workstations the developers use, through to the build infrastructure, through to the production environment. And that is why Windows 10 and this sudden increase in telemetry in developer tooling is an absolute nightmare for us.

markrendle commented 6 years ago

Interesting aside: I just installed Firefox Developer Edition, from noted privacy stalwarts Mozilla.

Here's the privacy notice you get when you first run it:

[image: screenshot of the Firefox Developer Edition privacy notice]

markrendle commented 6 years ago

Unrelated to my project research, but another thing occurred to me.

In the UK (where I live) we are changing to an opt-out model for organ donation. The rationale is that most people think that organ donation is a positive thing and are not opposed to it, but never get around to opting-in for it. By making it opt-out, the (estimated) minority who—for whatever reason—want to hold on to their organs after they die are able to make that choice, but most people won't do that and many lives will be saved as a result.

It's something that is being done for the good of the many, but still provides an opt-out mechanism for the few who are opposed to participating.

kstarikov commented 6 years ago

But it looks like 'few' prefer opt-out telemetry in dotnet while 'many' want it to be opt-in or nonexistent.

markrendle commented 6 years ago

@kstarikov Can you link me to your statistical source on that?

dazinator commented 6 years ago

Can you link me to your statistical source on that?

@markrendle - Perhaps Microsoft could gather some telemetry for you to use as a statistical source on this question. The installer and cli can have its message changed - "those that would prefer an opt-in for telemetry in future should add an environment variable 'I-PREFER-OPTIN' but leave telemetry enabled." Then the telemetry can call home with the value of that variable - and voilà you have a statistical source as to those users opposed :-)

dazinator commented 6 years ago

@markrendle - because the only answer for unknown questions is obviously more telemetry at this point :stuck_out_tongue: