LuaLS / lua-language-server

A language server that offers Lua language support - programmed in Lua
https://luals.github.io
MIT License
3.3k stars 306 forks source link

Silent opt-out telemetry is harmful #462

Closed clason closed 3 years ago

clason commented 3 years ago

Lua-language-server has a telemetry function that is enabled by default.

While I understand that telemetry is useful and that making it opt-in will significantly limit that usefulness (because almost no-one will actively enable it), in times of increasing concern for privacy and data autonomy, this kind of silent telemetry (enabled by default without notification, no prominent mention in the README) will needlessly harm users' trust in the project. (To be clear, I'm not accusing you of bad faith here! I'm just describing how people react to such a feature in general.)

I therefore suggest the following: If the telemetry.enable config key is not set (to either true or false), lua-language server should print a message on startup saying

  1. that the project uses telemetry,
  2. where to find information about the collected data (what and where),
  3. how to disable it (or enable it to remove the notice).

In addition, it might be better to put the whole block https://github.com/sumneko/lua-language-server/blob/6ba92a465c2686fe421fc398251d45c5d4381061/script/service/telemetry.lua#L62-L87 behind a if config.config.telemetry.enable guard so the timers aren't even run if the setting is not enabled.

phmarek commented 3 years ago

I vote for disabling telemetry be default; yes, you can print a message on first start, but enabling unless the user explicitly changes config files is the wrong way to go.

Yes, that may make the function useless - IMO it shouldn't be there at all!

runiq commented 3 years ago

I'd like to reiterate a few points from #463 here:

For everything else, I defer to the OP :)

Edit: To be even more clear, this will be a GDPR violation until it is made opt-in by default, as far as I understand the law.

teto commented 3 years ago

thanks for the great software but a bit mad at this being enabled by default. Care to explain the advantages of telemetry ? compared to bug reports let's say.

sumneko commented 3 years ago

While I understand that telemetry is useful and that making it opt-in will significantly limit that usefulness (because almost no-one will actively enable it), in times of increasing concern for privacy and data autonomy, this kind of silent telemetry (enabled by default without notification, no prominent mention in the README) will needlessly harm users' trust in the project. (To be clear, I'm not accusing you of bad faith here! I'm just describing how people react to such a feature in general.)

I will mention it in README. Currently it has been mentioned in changelog and setting

sumneko commented 3 years ago

I therefore suggest the following: If the telemetry.enable config key is not set (to either true or false), lua-language server should print a message on startup saying

  1. that the project uses telemetry,
  2. where to find information about the collected data (what and where),
  3. how to disable it (or enable it to remove the notice).

I inspected some other software at the time. The telemetry function of VSCode is turned on by default. I just confirmed that he only mentioned this in the "I have read and agreed" of the giant during installation, and there has been no runtime prompt since then. . Many other language services with telemetry capabilities do not even provide the option to turn off. I don’t know how you found out that I have telemetry capabilities to initiate a protest, I hope it’s not because of the shutdown option I kindly provided.

sumneko commented 3 years ago

In addition, it might be better to put the whole block

https://github.com/sumneko/lua-language-server/blob/6ba92a465c2686fe421fc398251d45c5d4381061/script/service/telemetry.lua#L62-L87

behind a if config.config.telemetry.enable guard so the timers aren't even run if the setting is not enabled.

This is an easy way to support immediate application of telemetry option changes without restarting the language service. I'm lazy, and I don't want to write so much code to support perfect timer management or pop up to notify users that they need to restart to take effect.

runiq commented 3 years ago

I inspected some other software at the time. The telemetry function of VSCode is turned on by default. I just confirmed that he only mentioned this in the "I have read and agreed" of the giant during installation, and there has been no runtime prompt since then.

I agree that's disingenious and par for the course for MS, but they do tell the user up front what they're doing, and that's still strictly better than just silently enabling things: If I read the TOS (I know, BIG if), and then decide I don't want to use the software because it phones home, I can do that.

That's also why I suggested to put a notice right at the end of the installation routine :)

phmarek commented 3 years ago

This is an easy way to support immediate application of telemetry option changes without restarting the language service. I'm lazy, and I don't want to write so much code to support perfect timer management or pop up to notify users that they need to restart to take effect.

But it also needlessly eats resources.

Please just take the telemetry out again, and provide a simple way to provide data - like the :HealthCheck function in nvim, for example.

clason commented 3 years ago

I inspected some other software at the time. The telemetry function of VSCode is turned on by default. I just confirmed that he only mentioned this in the "I have read and agreed" of the giant during installation, and there has been no runtime prompt since then.

The EULA does make a difference, legally -- and I would caution copying commercial projects with a big legal department that can handle any lawsuits arising from this. I need to repeat what @runiq mentioned: The way telemetry is implemented now is illegal in Europe (it violates the GDPR). While you of course can't be sued about it, it does make it illegal for me to use it in the context of my work.

Many other language services with telemetry capabilities do not even provide the option to turn off. I don’t know how you found out that I have telemetry capabilities to initiate a protest, I hope it’s not because of the shutdown option I kindly provided.

No, it was because a firewall flagged suspicious activity.

sumneko commented 3 years ago

I vote for disabling telemetry be default; yes, you can print a message on first start, but enabling unless the user explicitly changes config files is the wrong way to go.

Yes, that may make the function useless - IMO it shouldn't be there at all!

I can change it to a pop-up window to inform you of this during runtime, but I insist on enabling the telemetry function by default. The telemetry function is very useful, and it means a lot to me. Let me talk about the error reporting function first. I fixed about 30 runtime errors through the error log. You can find a lot of FIX a lot of runtime errors in the change log. The reason for my initial development of this project is simple: I am a user of Lua, but VSCode does not have a relatively powerful Lua language service, so I decided to make one of my own and only use it for myself and my colleagues in the company. As the project gradually iterated, I found that the project became the mainstream lua language service in VSCode. Now that many people are using it, I continue to develop and iterate, and this will last for 3 years. During these 3 years, I have not received any income (but a friend once said on Steam that he would give me a game, but the game type did not meet my preferences, so I refused). No one has participated in the long-term contribution (I used Chinese to submit at the beginning, but some people suggested that I change to English so that others can participate, and I did it). The only thing that supports my continued development is that I think many people are using it. My company has at least 50 people using it, and there are many companies that use lua to develop games (now the mobile game is a development boom), so I assume there are 10,000 people using it. Until I counted the number of people online through the telemetry function, there were only 1,000 people! I was really frustrated for a while, but I finally decided to continue to update and set a goal: 10,000 people use it. Currently you can find online statistics at http://119.45.194.183/. When I was writing this text, there were 3,600 people. That's why I say that he means a lot to me.

clason commented 3 years ago

I understand; that's why I didn't insist on removing telemetry or turning it off by default. It's the "silent" part I (and the GDPR) have problems with.

And let me say that it's an excellent server and your support is really amazing! (Otherwise I wouldn't have bothered with opening an issue and just stopped using it.)

sumneko commented 3 years ago

I'd like to reiterate a few points from #463 here:

  • Right now, the way telemetry handling is done is a GDPR violation
  • A message at the end of the installation routine is probably a good idea

For everything else, I defer to the OP :)

Edit: To be even more clear, this will be a GDPR violation until it is made opt-in by default, as far as I understand the law.

The installation process is implemented by the client, and I cannot insert instructions. What I can do is to tell you this through a pop-up notification every time you start the service, until you choose not to remind.

clason commented 3 years ago

The installation process is implemented by the client, and I cannot insert instructions. What I can do is to tell you this through a pop-up notification every time you start the service, until you choose not to remind.

That is exactly what I am proposing: If the telemetry.enable setting is not set by the user (either to true or false), send a notification. I'd be happy to help with finding a clear and concise wording for this notification.

sumneko commented 3 years ago

thanks for the great software but a bit mad at this being enabled by default. Care to explain the advantages of telemetry ? compared to bug reports let's say.

Although I am not willing to admit it, Lua is really not a mainstream programming language. At least in China, 80% of Lua users are game planners rather than programmers. They use lua to write data configurations, simple scripts, and damage calculation formulas. This means that they will not give any feedback when they encounter problems: "Github, what is that? How is it in plain English? Why do I have to register for an account to report problems?".

sumneko commented 3 years ago

But it also needlessly eats resources.

Please just take the telemetry out again, and provide a simple way to provide data - like the :HealthCheck function in nvim, for example.

You are right, meaningless cpu consumption is really not supposed to be. But please take a look at the issues, there are dozens of urgent issues that are more serious than this one. If you want, you can submit a PR to help me solve this problem, thank you.

sumneko commented 3 years ago

The EULA does make a difference, legally -- and I would caution copying commercial projects with a big legal department that can handle any lawsuits arising from this. I need to repeat what @runiq mentioned: The way telemetry is implemented now is illegal in Europe (it violates the GDPR). While you of course can't be sued about it, it does make it illegal for me to use it in the context of my work.

I am sorry that I only learned about GDPR for the first time today. Please help me to confirm whether the following practices are appropriate:

  1. Keep telemetry turned on by default
  2. After 30 minutes of uploading data for the first time, ensure that users can disable it in time
  3. Add telemetry related content in the readme.
  4. Inform the user of telemetry related content at each startup, until the user chooses not to remind or disable telemetry.

Telemetry related content includes:

  1. What data will be collected (token used to count online people, your operating system platform, compiler version, C++ runtime version, language client name, language server error log)
  2. View the website of the collected data (http://119.45.194.183/)
  3. Code for collecting data (https://github.com/sumneko/lua-language-server/blob/master/script/service/telemetry.lua)
  4. Code for processing data (https://github.com/sumneko/lua-telemetry-server/tree/master/method)
clason commented 3 years ago

I am not a lawyer, so this cannot be reliable advice. But by my understanding:

  1. is OK
  2. is not OK -- you need to be able to opt-out before data is transmitted
  3. is OK
  4. is OK (and make it blocking if possible to avoid sending unwanted data)

The telemetry-related content looks good to me, but I want to point out that the collected data (in particular, the error messages) may include confidential information (file names, variables?) unless you are sure that they only contain information about lua-language-server and not user code.

(I understand your pain -- privacy is hard since it requires balancing fundamentally opposed legitimate interests. There is no perfect solution here...)

sumneko commented 3 years ago

The telemetry-related content looks good to me, but I want to point out that the collected data (in particular, the error messages) may include confidential information (file names, variables?) unless you are sure that they only contain information about lua-language-server and not user code.

The error log only contains the stack of the language service itself, and does not contain user code. The only possible problem is that user name and other information may appear in the location of the language service file. I am considering modifying the require function to only record the relative path of the file.

clason commented 3 years ago

Also, I would recommend looking how the Julia Language server handles this; see, e.g., https://github.com/julia-vscode/julia-vscode/wiki/Privacy-Policy

There was a long and very heated debate about this when they started doing it...

runiq commented 3 years ago

If there's a LOUD error message at startup that tells me to enable/disable telemetry or your langserver will refuse to work, I will breathe a sigh of relief, enable telemetry, and go on with my life. You're doing hard work here, no need for me to make it even harder.

My only issue was with the GDPR thing and telemetry being enabled by default without it being mentioned anywhere—if that is kept, I am not legally allowed to use your langserver at my work. :(

b0kker commented 3 years ago

It was brought to my attention by the IT guy who manages our corporate firewall that the domain name used by this extension for telemetry triggered the child pornography filters. (moe-loli.love)

Glaived commented 3 years ago

It was brought to my attention by the IT guy who manages our corporate firewall that the domain name used by this extension for telemetry triggered the child pornography filters. (moe-loli.love)

Funny. But certainly not interesting, it is very easy to say that without proof and disclaiming any responsibility. As a reminder, a domain name does not make the content of it, and a domain name can be redeemed. If you have evidence, thank you for sending it to us.

sumneko commented 3 years ago

It was brought to my attention by the IT guy who manages our corporate firewall that the domain name used by this extension for telemetry triggered the child pornography filters. (moe-loli.love)

Im sorry to hear this. I use this domain name just for funny, as .love is very cheap, loli is a commendatory term in my country, and as you see I like anime. If it does cause you trouble, I will change this domain name.

clason commented 3 years ago

Im sorry to hear this. I use this domain name just for funny, as .love is very cheap, loli is a commendatory term in my country, and as you see I like anime. If it does cause you trouble, I will change this domain name.

If I may add my opinion as the OP: I assumed that the name was chosen for fun and no harm intended, but the optics are indeed not good, at least in Europe/US. At best, it makes the project look much less serious than it is; at worst, it does trigger connotations that are now seen as at the very least distasteful. (Public opinion has changed since the 90s/2000s when this was just a harmless subculture meme. Just to be clear: I'm not saying you cannot use these terms ever; just that I would counsel against using them outside their "natural" fandom setting.)

If you are tied to the .love domain because of funding, I would recommend just removing the domain name completely from the code and always connecting to the raw IP. This will forestall any questions.

sumneko commented 3 years ago

Should be resolved.

clason commented 3 years ago

Thank you! I really appreciate that you took this seriously and worked so hard to resolve it quickly. I know privacy issues are annoying and much less interesting to deal with than new features.

serg3295 commented 3 years ago

The policy of collecting personal data by IT giants Microsoft and Google has led to users will turn off any telemetry if they have the opportunity to do so. And nobody will read the Privacy policy at all.

It is see, that as a result of the change in the opt-in policy, the users counter is decreased dramatically. In order for the counter to reflect the number of real active users, I suggest the following:

If the setting Lua.telemetry.enable is set to false, display a pop-up message about the need to read the privacy policy every 15 minutes.

In this case, there will be no GDPR violation and the user will have a trial period of 15 minutes to understand how an excellent this extension is, read more the privacy policy, and set Lua.telemetry enable to true. 😄

Suggested changes in Private policy: The paragraph "If you choose to enable telemetry, ..." move from Opt-in Policy section to the Overview one.

The sections could arrange in the following order -- Overview -Collected data - Use of collected data - Opt-in Policy - Disable Telemetry.

Sylveondev commented 3 years ago

yikes

kevinhwang91 commented 2 years ago

I was really frustrated for a while, but I finally decided to continue to update and set a goal: 10,000 people use it. Currently you can find online statistics at http://119.45.194.183/. When I was writing this text, there were 3,600 people. That's why I say that he means a lot to me.

太不容易了,这个项目做得是真的好,甚至不输主流ls。要是没有这个项目我真的不想用lua写nvim配置和插件。 使用人数肯定远远大于3600的,只是大部分人都关闭了没被统计(虽然我也是关了,狗头)

martin-braun commented 8 months ago

I ended up here, because the server tried to connect to moe-moe.love, formerly known as moe-loli.love. I don't have to tell you that those domain names are very questionable, @sumneko or whoever came up with these.

I don't want to do wild accusations, but based on your profile picture there is a good chance that I talk to the right person, or please cc the right person in regards of this. Personally, I don't care any orientations or fetishes so long nobody gets hurt. Some might argue when it comes to "moe" or "loli", I'm not here to argue, I literally don't care.

What I do actually care is: Imagine trying to explain a system administrator these domain entries in their firewall logs. It's just unacceptable in a business environment.

On top of that, I haven't enabled any telemetry, yet the language server tries to call home. I'm dissatisfied things like these popping up. That's highly unprofessional.

EDIT: I do realize that things have been removed in the latest version and coc-lua downloads a too old version via coc.nvim. You see bad decisions can have longer impacts.