Possible Leakage of Private Info in Semantic Cache Requests

Dear GPTCache Team, we are a security research group. We have used GPTCache for a while and are impressed by its design and speed, but as we studied it further, concerns about its security arose. We recently found that the semantic differences introduced by private attributes are large enough to be recognized, so some private info can be inferred from the timing differences in the responses returned by GPTCache. Assuming the attacker has already guessed the private info correctly, we found that cache hits and misses can be identified with an accuracy (TPR) of 85.3% and an FPR of 3.2% in a single trial.

Description

GPTCache uses semantic similarity to judge whether an incoming request has a semantic neighbor in its dataset (i.e. the cache store and vector store), and returns the previously stored answer if the similarity exceeds the threshold. However, private info in the request, such as names, email addresses, and phone numbers, can greatly change its semantics. Consider two different sentences A and B sent as requests, where A comes from the victim and B from the attacker. If the private info is the same and the other components of the requests differ but are semantically similar, the similarity between the two sentences is likely to still exceed the threshold. Therefore, when B is sent after A, GPTCache returns the same cached answer to the attacker almost instantly. The hit/miss timing difference is pronounced enough for the attacker to tell whether their guess was correct.
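The mechanism described above can be illustrated with a minimal, self-contained sketch. It does not use GPTCache itself: `embed` is a toy bag-of-words stand-in for a real sentence encoder, and `ToySemanticCache`, `slow_llm`, `timed_query`, the 0.9 threshold, and the 50 ms backend delay are all hypothetical choices made for illustration only.

```python
import math
import time
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding (assumption: stands in for a real encoder)."""
    return Counter(text.lower().replace(".", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToySemanticCache:
    """Hypothetical cache shared by all users, keyed by embedding similarity."""
    def __init__(self, threshold, backend):
        self.threshold = threshold
        self.backend = backend
        self.entries = []  # list of (embedding, answer)

    def query(self, text):
        vec = embed(text)
        for stored_vec, answer in self.entries:
            if cosine(vec, stored_vec) >= self.threshold:
                return answer  # cache hit: no backend call
        answer = self.backend(text)  # cache miss: slow path
        self.entries.append((vec, answer))
        return answer

def slow_llm(text):
    """Simulated LLM call; the 50 ms delay is an arbitrary assumption."""
    time.sleep(0.05)
    return "answer to: " + text

def timed_query(cache, text):
    """Return the wall-clock latency of one request."""
    start = time.perf_counter()
    cache.query(text)
    return time.perf_counter() - start

cache = ToySemanticCache(threshold=0.9, backend=slow_llm)
# Victim request A populates the shared cache.
cache.query("Compose a meeting agenda for the treatment plan for Tom with diabetes")
# Attacker request B: different wording, private info guessed correctly or not.
hit_time = timed_query(cache, "Formulate a meeting agenda for the treatment plan for Tom with diabetes")
miss_time = timed_query(cache, "Formulate a meeting agenda for the treatment plan for Bob with asthma")
print(f"correct guess: {hit_time:.4f}s, wrong guess: {miss_time:.4f}s")
```

In this toy setup the correct guess is served from the cache while the wrong guess falls through to the simulated backend, so latency alone separates the two cases.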

Threat model

We assume an application built with the LangChain framework that uses GPTCache as its backend. The application is accessible to both the victim and the attacker. Requests from different users are converted into embeddings and stored in the same dataset. We use ONNX to calculate the semantic similarity between sentences, and we assume the victim might ask the same question in different forms.

Environment

GPTCache: v0.1.43

Attack steps

Select a template that might include private info. Text in brackets [ ] is treated as private info.

A: Compose a meeting agenda for an interdisciplinary team discussing the treatment plan for [Name] with [medical condition].

Use gpt-3.5-turbo to generate many semantically similar templates while keeping the private info placeholders unchanged.

Formulate an agenda for an interdisciplinary team meeting regarding the treatment plan for [Name] with [medical condition].
...

Fill in the bracketed parts of the original template and the generated templates with concrete private info. Using the original sentence as the base, keep only the generated sentences whose similarity to it, as computed via ONNX, exceeds the threshold.
```
Formulate an agenda for an interdisciplinary team meeting regarding the treatment plan for Tom with diabetes.
... etc.
```
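The fill-and-filter step can be sketched as follows. This is only an illustration: `fill`, `filter_similar`, and the 0.6 threshold are hypothetical, and the token-set Jaccard score is a simple stand-in for the ONNX embedding similarity used in the actual experiments.

```python
def fill(template, name, condition):
    """Substitute concrete private info into the bracketed placeholders."""
    return template.replace("[Name]", name).replace("[medical condition]", condition)

def jaccard(a, b):
    """Token-set overlap; an assumed stand-in for embedding similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def filter_similar(base, candidates, threshold, similarity=jaccard):
    """Keep candidates whose similarity to the base sentence meets the threshold."""
    return [c for c in candidates if similarity(base, c) >= threshold]

base_template = ("Compose a meeting agenda for an interdisciplinary team discussing "
                 "the treatment plan for [Name] with [medical condition].")
generated_templates = [
    ("Formulate an agenda for an interdisciplinary team meeting regarding "
     "the treatment plan for [Name] with [medical condition]."),
    "Write a poem about the sea.",  # unrelated sentence, expected to be dropped
]

base = fill(base_template, "Tom", "diabetes")
candidates = [fill(t, "Tom", "diabetes") for t in generated_templates]
kept = filter_similar(base, candidates, threshold=0.6)
print(kept)
```

Passing a different `similarity` callable lets the same filter run on real embedding scores instead of the toy metric.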
Use part of the generated sentences above as the questions that the victim might ask, and the rest as the attacker's sentences. We treat the questions that the victim might ask as the positive group, and calculate the semantic similarity between the attack sentences and the positive group. To form the negative group, we change the private 'Name' and 'Medical Condition' values.
We draw ROC curves to demonstrate the semantic differences.
We use orthogonal attack sentences (so that they do not interfere with each other) to calculate the TPR and FPR.
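As a rough sketch of how TPR, FPR, and ROC points can be derived from the two groups of similarity scores: the helper names `rates` and `roc_points` are hypothetical, and the scores below are made-up illustrative values, not our experimental data.

```python
def rates(pos_scores, neg_scores, threshold):
    """Return (TPR, FPR) when a score >= threshold is classified as a cache hit."""
    tp = sum(s >= threshold for s in pos_scores)
    fp = sum(s >= threshold for s in neg_scores)
    return tp / len(pos_scores), fp / len(neg_scores)

def roc_points(pos_scores, neg_scores):
    """Sweep every observed score as a threshold and collect (FPR, TPR) pairs."""
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = []
    for t in thresholds:
        tpr, fpr = rates(pos_scores, neg_scores, t)
        points.append((fpr, tpr))
    return points

# Illustrative scores only (assumption): similarity of attack sentences to the
# positive group (same private info) and to the negative group (different info).
pos = [0.9, 0.8, 0.6, 0.95]
neg = [0.5, 0.85, 0.3, 0.4]

tpr, fpr = rates(pos, neg, threshold=0.7)
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
print(roc_points(pos, neg))
```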

Results:

We use the template "Compose a meeting agenda for an interdisciplinary team discussing the treatment plan for [Name] with [medical condition]." to conduct our experiments.
- Both the 'name' and the 'medical condition' in the negative group differ from the positive group.
- Either the 'name' or the 'medical condition' in the negative group differs from the positive group.

The two graphs above show that the private info introduces semantic differences significant enough to be distinguished, and these differences can therefore be used to judge whether a guess is right or not.

Attacks

For the positive group (where the template is semantically similar and the private info is the same), the following table shows the attack accuracy for different numbers of attack trials.

| #Trials | TPR (%) | FPR (%) |
| --- | --- | --- |
| 1 | 85.3 | 3.2 |
| 2 | 91.9 | 3.3 |

Possible Mitigation:

- Reject input containing private info: when a request involves sensitive info, the cache could refuse it.
- Anonymize the request before embedding: sensitive information can be replaced with anonymized identifiers. In a small test, we selectively de-identified private attributes, including names, IP addresses, and phone numbers, so as not to remove any information that the LLM needs. This raises the difficulty of conducting the above attacks, while the latency it adds is negligible.
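A minimal sketch of the anonymization idea, assuming simple regular expressions and a caller-supplied list of known names. The `anonymize` function, the patterns, and the placeholder tokens are all hypothetical and are not what we tested. A real deployment would likely need a proper PII or named-entity detector.

```python
import re

# Assumed, deliberately simple patterns; they will miss many real-world formats.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PHONE = re.compile(r"\+?\d[\d ().-]{7,}\d")

def anonymize(text, known_names=()):
    """Replace private attributes with placeholder identifiers before embedding."""
    text = EMAIL.sub("<EMAIL>", text)
    text = IPV4.sub("<IP>", text)      # run before PHONE so IPs are not taken as phones
    text = PHONE.sub("<PHONE>", text)
    for name in known_names:           # hypothetical name list instead of NER
        text = re.sub(r"\b" + re.escape(name) + r"\b", "<NAME>", text)
    return text

request = "Contact Tom at tom@example.com or 555-123-4567 from 192.168.0.1"
cache_key_text = anonymize(request, known_names=["Tom"])
print(cache_key_text)
```

In this sketch only the text used to compute the embedding is anonymized. How the original request is forwarded to the LLM on a miss is left open.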

Detailed information will be provided soon in our paper. We look forward to your reply!
