langflow-ai / langflow

⛓️ Langflow is a visual framework for building multi-agent and RAG applications. It's open-source, Python-powered, fully customizable, model and vector store agnostic.
http://www.langflow.org
MIT License

[SECURITY][BUG] Change the way that the secrets are handled #1145

Closed mik0w closed 3 months ago

mik0w commented 7 months ago

Context

Hello, I am working on some MLOps/LLMOps security research, and I'm looking for ways in which LLMOps software can leak users' third-party API keys, such as OpenAI API keys or SERP API keys.

Unfortunately, Langflow, if deployed incorrectly, also leaks API keys. You'll find examples below. (I am not a Langflow user, although I use Langchain. I found the exposed instances through Shodan. I know that Langflow can be configured properly and that it's partly the user's fault if their keys get exposed, but in my opinion keys should be stored securely by default.)

Describe the bug

When a user creates a component in a flow, let's say LLMs>ChatOpenAI, they are asked to provide the OpenAI API key. Once the flow configuration is saved, the user can go back to the flow and view the OpenAI API key they provided previously. You may be thinking "OK, that's not a big deal", but I believe it is: plenty of people run misconfigured Langflow instances (without properly configured authentication), so their OpenAI API keys, SERP API keys, etc. are exposed to the Internet. The screenshots I attach come from systems that are publicly exposed to the Internet.

Right now, according to my Shodan dorks, there are more than 30 misconfigured Langflow instances exposed to the Internet, from which I was able to scrape a few OpenAI API keys, SERP API keys, etc. (of course I am not going to use them anywhere).

Browser and Version

To Reproduce

Steps to reproduce the behavior:

  1. Go to any misconfigured Langflow instance (you can find them on Shodan.io with the http.title:"langflow" dork). Alternatively, use your local Langflow instance without properly configured authentication.
  2. Click on any collection previously created by a user of the system (or create a new collection).
  3. Scroll down to any component with a field like "OpenAI API Key" (or create a new OpenAI component, enter the key, and then refresh the page).
  4. Use the "eye" button to display the API key.
  5. The key is leaked.

Screenshots

This screenshot comes from a random misconfigured instance of Langflow that I found online (which of course means that the key belongs to some other user and I shouldn't be able to just access it):

[Screenshot: bug]

Suggested solution

Once keys are entered into Langflow, there should be no possibility of displaying them, and the backend should not return them to the frontend. If users need those keys in some external solution, they should retrieve them from the source system.
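A minimal sketch of the suggested behavior, not Langflow's actual code: before a flow is serialized for the frontend, blank the `value` of any field flagged `"password": true`. The function name `redact_secrets` and the empty-string placeholder are hypothetical, and the field layout (`password`, `value`) is an assumption about how the flows API represents secret fields.

```python
from typing import Any

REDACTED = ""  # hypothetical placeholder sent to the frontend instead of the secret


def redact_secrets(node: Any) -> Any:
    """Return a copy of `node` with the value of every password field blanked."""
    if isinstance(node, dict):
        # Rebuild the dict so the stored flow is never mutated.
        cleaned = {key: redact_secrets(child) for key, child in node.items()}
        # Assumption: a secret field is marked with "password": true.
        if cleaned.get("password") is True and "value" in cleaned:
            cleaned["value"] = REDACTED
        return cleaned
    if isinstance(node, list):
        return [redact_secrets(child) for child in node]
    return node  # scalars are returned unchanged
```

With something like this in place, the backend would still hold the key for its own calls, while the browser would only ever receive an empty value for that field.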

Additional context

For some reason, this kind of bug is fairly characteristic of LLMOps/MLOps software. I've found plenty of similar bugs in other LLMOps/MLOps systems recently: https://hackstery.com/2023/10/13/no-one-is-prefect-is-your-mlops-infrastructure-leaking-secrets/

ogabrielluiz commented 7 months ago

Hey @mik0w

Thanks for the thorough report.

That is definitely something we are aware of.

The way it works at the moment, the API key does not come back from the backend. It is sent from the browser to the backend and used there.

Centralized API key storage is on our roadmap and should be coming soon.

Since we have CustomComponents, we have to design around DX too, which is one of the reasons it hasn't come out yet.

Again, I really appreciate your effort. Please, let us know if you find any other issues.

mik0w commented 7 months ago

> The way it works at the moment, the API key does not come back from the backend. It is sent from the browser to the backend and used there.

Yup, but once it is sent to the backend, it should stay there, and I shouldn't be able to just casually view others' API keys.

Here's an example of the request I sent to some totally random Langflow instance:

GET /api/v1/flows/ HTTP/1.1
Host: xxxxxx:3000
Accept: application/json, text/plain, */*
Authorization: Bearer eyxxxxxxxxxxmU
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.6045.123 Safari/537.36
Referer: http://xxxxxx:3000/
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: access_tkn_lflw=eyJxxxxxx; refresh_tkn_lflw=auto
Connection: close

And here comes the response (I filtered it to make it shorter and I anonymized the key):

HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
date: Thu, 16 Nov 2023 17:52:37 GMT
server: uvicorn
content-length: 57204
content-type: application/json
connection: close

[...]
"openai_api_key":{"required":false,"placeholder":"","show":true,"multiline":false,

"value":"sk-6xxxxxxxxxxxxx0R",

"password":true,"name":"openai_api_key","display_name":"OpenAI API Key","advanced":false,"dynamic":false,"info":"","type":"str","list":false},
[...]

So, as you can see, the keys are being returned from the server, even though I believe there's no reason for that.
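For anyone who wants to check their own instance, here is a rough sketch that scans a parsed JSON response for password fields that still carry a value. It assumes only the field layout visible in the excerpt above (`password`, `value`, `name`); the helper name `find_exposed_secrets` and the sample payload are made up for illustration, and the real response is nested differently and much larger.

```python
import json
from typing import Any, Iterator


def find_exposed_secrets(node: Any) -> Iterator[str]:
    """Yield the name of every password field that carries a non-empty value."""
    if isinstance(node, dict):
        if node.get("password") is True and node.get("value"):
            yield str(node.get("name", "<unnamed>"))
        for child in node.values():
            yield from find_exposed_secrets(child)
    elif isinstance(node, list):
        for child in node:
            yield from find_exposed_secrets(child)


# Made-up sample shaped like the excerpt above; not real data.
sample = json.loads(
    '[{"data": {"openai_api_key": {"password": true, "value": "sk-placeholder",'
    ' "name": "openai_api_key"}, "temperature": {"password": false,'
    ' "value": "0.7", "name": "temperature"}}}]'
)
exposed = list(find_exposed_secrets(sample))
print(exposed)  # prints ['openai_api_key']
```

An empty list would suggest that no secret values are being sent back to the client, at least for the fields marked as passwords.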

Anyway, thank you for your quick response :)