Chainlit / chainlit

Build Conversational AI in minutes ⚡️
https://docs.chainlit.io
Apache License 2.0

remove telemetry token from the code #1329

Open · EBazarov opened 2 months ago

EBazarov commented 2 months ago

https://github.com/Chainlit/chainlit/blob/74636a990eb989068bfcb7a5b03122cc356cb10a/backend/chainlit/telemetry.py#L65

dokterbob commented 1 month ago

@EBazarov Wow, thanks for catching this! Considerable privacy leak, I'll prioritise this!

dokterbob commented 1 month ago

Looking at the related code, it seems no private data is logged. 😅

https://github.com/Chainlit/chainlit/blob/74636a990eb989068bfcb7a5b03122cc356cb10a/backend/chainlit/telemetry.py#L87

The telemetry gathers only performance traces and takes care to obfuscate clients' host names (a single iteration of SHA-256; perhaps we should use a key-stretching function like PBKDF2, scrypt, Argon2 or bcrypt).
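To illustrate the hashing concern: a single SHA-256 pass is deterministic and cheap, so a small input space like host names can be brute-forced; key stretching raises that cost considerably. This is a minimal sketch using only the standard library — the function names are hypothetical and this is not Chainlit's actual telemetry code.

```python
import hashlib

def obfuscate_fast(hostname: str) -> str:
    # Single SHA-256 iteration, roughly what the linked telemetry code does.
    # Cheap to compute, so the small hostname space can be brute-forced.
    return hashlib.sha256(hostname.encode()).hexdigest()

def obfuscate_slow(hostname: str, salt: bytes = b"chainlit-telemetry") -> str:
    # Key-stretched alternative: PBKDF2 with many iterations makes each
    # guess far more expensive. A per-install random salt (rather than this
    # illustrative constant) would also prevent cross-install correlation.
    return hashlib.pbkdf2_hmac("sha256", hostname.encode(), salt, 600_000).hex()
```

Both variants still yield a stable pseudonymous identifier per host; the difference is purely how expensive it is for an observer to reverse the mapping by guessing host names.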

It is currently enabled by default in the config, but can easily be disabled: https://github.com/Chainlit/chainlit/blob/main/backend/chainlit/config.py#L56
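For anyone landing on this issue: disabling it should look roughly like the following in the project's `.chainlit/config.toml` (key name taken from the linked config module; double-check against your generated config file).

```toml
[project]
# Opt out of anonymous telemetry collection.
enable_telemetry = false
```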

As the underlying framework is OpenTelemetry (Uptrace is just a wrapper/implementation of it), it may make sense to switch to the default OTLP exporter and honour the standard environment variables.

This would let implementers run their own telemetry, while we default to sending anonymous stats to Literal AI and keep it similarly easy to disable.
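A sketch of what honouring the standard OpenTelemetry variables could look like. `OTEL_SDK_DISABLED` and `OTEL_EXPORTER_OTLP_ENDPOINT` are real variables from the OpenTelemetry specification; the function name and default URL are placeholders, not Chainlit or Literal AI endpoints.

```python
import os

# Placeholder for an anonymous default collector; not a real endpoint.
DEFAULT_ENDPOINT = "https://telemetry.example.invalid"

def resolve_otlp_endpoint() -> "str | None":
    # OTEL_SDK_DISABLED=true is OpenTelemetry's standard opt-out switch.
    if os.environ.get("OTEL_SDK_DISABLED", "").lower() == "true":
        return None
    # Implementers point telemetry at their own collector via the standard
    # OTLP exporter variable; otherwise fall back to the anonymous default.
    return os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", DEFAULT_ENDPOINT)
```

The appeal of this approach is that implementers need no Chainlit-specific configuration at all: any OTLP-aware deployment tooling already sets these variables.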

Perhaps another feature would be to add an interactive prompt to `chainlit init`, where the user is asked about anonymous data collection, as is common in other FOSS packages.
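Such a prompt could be as small as the sketch below. The function name is hypothetical and nothing like it exists in Chainlit today; taking the input function as a parameter just keeps it testable.

```python
def ask_telemetry_consent(input_fn=input) -> bool:
    # Ask once during init; empty answer defaults to opting in,
    # mirroring the current enabled-by-default behaviour.
    answer = input_fn(
        "Share anonymous usage statistics to help improve Chainlit? [Y/n] "
    ).strip().lower()
    return answer in ("", "y", "yes")
```

The result would then be written out as the `enable_telemetry` value in the generated config, so the choice is recorded explicitly rather than implied by a default.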