Telemetry - Githubissues

lonix1 commented 1 year ago

The VSCode addon download page states this:

All your code stays local – the model runs right on your computer, so there’s no need to transmit code to a remote server for custom model training

But the docs state this:

We capture some anonymized usage and error-reporting data to help improve IntelliCode.

And this:

The extracted data is transmitted over HTTPS to the IntelliCode service. The service then uses machine-learning algorithms to train a model for your code.

So which is it?

This is an incredibly sensitive topic, so having different sources state different things is really worrisome. In our company, if there is even an off chance that some tool is leaking our proprietary code, then that tool is banned - presumably it's the same at every company not working on open source.

Please give us a full explanation, and peace of mind.

(PS: those links were copied from other issues in this repo. Please advise whether they still apply.)

drewbitt commented 1 year ago

This seems fine to me. It states code does not leave your machine. Usage data/telemetry does. That is not your code.

And for the latter, Team models is an entirely different IntelliCode service, so you should narrow your question down to Team models. The main IntelliCode telemetry docs do not contradict themselves.

github-actions[bot] commented 1 year ago

Automatically marked uncategorized issue as product feedback

vivlimmsft commented 7 months ago

And this:

The extracted data is transmitted over HTTPS to the IntelliCode service. The service then uses machine-learning algorithms to train a model for your code.

This part only applies to the 'team completion model/custom model' training feature that was in Visual Studio and is no longer operational. The data mentioned there was only extracted if you chose to train a model.

All your code stays local – the model runs right on your computer, so there’s no need to transmit code to a remote server for custom model training

I am not actually sure why the listing for IntelliCode for C# Dev Kit mentions custom models at all. Model training has never been a feature of that extension, and I can see how that would be confusing.

All of the IntelliCode extensions use static model files that are either directly bundled with the extension or are (currently) downloaded from our web service. Those models don't change at all during operation - they do not 'learn' from your code.

We capture some anonymized usage and error-reporting data to help improve IntelliCode.

This is accurate. To elaborate a little: the usage data never includes any part of your code (e.g. symbol names), only events like "a suggestion was shown", "a suggestion was accepted", "the extension started", or "an exception was thrown inside of the extension when trying to show a suggestion".
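
To make the distinction concrete, here is a rough sketch of what a code-free usage event could look like. This is purely illustrative and is not IntelliCode's actual telemetry schema: the `EventName`, `UsageEvent`, and `serialize` names and the field layout are all hypothetical, chosen only to mirror the event types listed above.

```python
import json
from dataclasses import dataclass
from enum import Enum


class EventName(Enum):
    # Hypothetical identifiers mirroring the events described in the comment.
    SUGGESTION_SHOWN = "suggestion_shown"
    SUGGESTION_ACCEPTED = "suggestion_accepted"
    EXTENSION_STARTED = "extension_started"
    SUGGESTION_EXCEPTION = "suggestion_exception"


@dataclass(frozen=True)
class UsageEvent:
    # Only an event name and a timestamp: there is no field that could
    # carry source text, file paths, or symbol names.
    name: EventName
    timestamp_ms: int


def serialize(event: UsageEvent) -> str:
    """Serialize an event to JSON containing only the two declared fields."""
    return json.dumps(
        {"name": event.name.value, "timestamp_ms": event.timestamp_ms},
        sort_keys=True,
    )
```

The point of the sketch is that the event type has a fixed set of names and no free-form field, so nothing from the user's code can end up in the payload by construction.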

Hopefully my answer helps provide some peace of mind. Sorry about the delayed response.

lonix1 commented 7 months ago

Thanks for that elaboration. It would be beneficial if the various docs were updated to give everyone peace of mind. I'm sure I'm not the only one who worries about such things.

MicrosoftDocs / intellicode

Telemetry #474