Open mhupfauer opened 7 months ago
@kenhuangus, are you able to take this? tyia!
Yes, I will investigate and, if needed, incorporate this attack vector (glitch tokens in RAG-based LLM apps).
@mhupfauer Thanks, Markus Hupfauer, for the contribution. I will incorporate the following text from Mark.
Description [...] An additional Denial of Service method involves glitch tokens: unique, problematic strings of characters that disrupt model processing, resulting in partial or complete failure to produce coherent responses. This vulnerability is magnified as RAG systems increasingly source data from dynamic internal resources such as collaboration tools and document management systems. Attackers can exploit this by inserting glitch tokens into these sources, thereby triggering a Denial of Service by compromising the model's functionality. Common Examples of Vulnerability [...]
@kenhuangus Thanks for merging my proposal!
Thank you as well.
@kenhuangus: I think there was a slight copy-paste error. The headline "Example Attack Scenario" now appears twice in the document. Not the entire chapter, just the headline :)
Steps to Reproduce
What happens?
2_0_vulns/LLM04_ModelDoS.md does not cover DoS through glitch tokens injected into the prompt by a RAG solution. If a malicious actor can add broadly relevant content to the RAG database, and that content contains glitch tokens for the model in use, they can effectively run a denial-of-service attack against all users of the LLM. The main issue is that there is no clear indication of which token caused the model to glitch, so there is no obvious way to remediate such issues automatically. Enterprises build large RAG databases from (mostly) user-generated content (e.g. Confluence, SharePoint, ...) and update this content frequently. A malicious actor could therefore easily introduce new content, including but not limited to glitch tokens, that effectively causes a Denial of Service for all end users of the application.
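To make the remediation problem concrete, here is a minimal sketch of one possible partial safeguard: screening retrieved chunks against a denylist before they reach the prompt. Everything in it is an assumption for illustration. `Chunk`, `screen_chunks`, `KNOWN_GLITCH_TOKENS`, and the document IDs are hypothetical names, and the denylist entries are placeholders ("SolidGoldMagikarp" is a publicly reported example for GPT-2/GPT-3 era tokenizers, and the real list would depend on the model in use). Plain substring matching only catches strings that are already known, so it does not solve the "which token caused the glitch" problem described above.

```python
from dataclasses import dataclass

# Hypothetical denylist. Glitch tokens are model-specific, so these entries
# are placeholders and would have to be maintained per deployed model.
KNOWN_GLITCH_TOKENS = frozenset({"SolidGoldMagikarp", "<placeholder-glitch-token>"})


@dataclass(frozen=True)
class Chunk:
    """A retrieved RAG chunk (hypothetical structure)."""
    doc_id: str
    text: str


def find_glitch_tokens(text, denylist=KNOWN_GLITCH_TOKENS):
    """Return the denylisted strings that occur in text, sorted for stable output."""
    return sorted(tok for tok in denylist if tok in text)


def screen_chunks(chunks, denylist=KNOWN_GLITCH_TOKENS):
    """Split retrieved chunks into (clean, quarantined).

    Each quarantined entry is a (chunk, hits) pair so the offending source
    document can be traced and reviewed.
    """
    clean, quarantined = [], []
    for chunk in chunks:
        hits = find_glitch_tokens(chunk.text, denylist)
        if hits:
            quarantined.append((chunk, hits))
        else:
            clean.append(chunk)
    return clean, quarantined


retrieved = [
    Chunk("wiki/onboarding", "Welcome to the team."),
    Chunk("wiki/poisoned", "Holiday policy SolidGoldMagikarp applies to all staff."),
]
clean, quarantined = screen_chunks(retrieved)
```

Only `clean` would be passed on to prompt construction. Recording the `doc_id` of quarantined chunks at least points to the source document that needs review, which is the piece of information the attack otherwise hides.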
What were you expecting to happen?
…
Any logs, error output, etc?
Any other comments?
A "normal" glitch token attack doesn't pose a significant threat, as it only renders the current user session / context unusable. Through a poisoned RAG database, however, a malicious user can inject these tokens into some, many, most, or even all conversations, thus causing a broad service outage.
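The difference in blast radius can be illustrated with a toy simulation. This is only a sketch under strong assumptions: the retriever ranks documents by shared words rather than embeddings, and the corpus, queries, document IDs, and `<GLITCH>` marker are all made up for the example. It shows how a single document stuffed with broadly relevant keywords can end up in the retrieved context of unrelated queries.

```python
def retrieve(query, corpus, k=2):
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus.items(),
        key=lambda item: len(query_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


def affected_fraction(queries, corpus, poisoned_id, k=2):
    """Fraction of queries whose top-k retrieved context includes the poisoned document."""
    hits = sum(poisoned_id in retrieve(q, corpus, k) for q in queries)
    return hits / len(queries)


# Hypothetical corpus: three legitimate pages plus one attacker-written page
# that mentions many popular topics and carries a glitch-token marker.
corpus = {
    "hr/vacation": "vacation policy employees get 25 days of vacation",
    "it/vpn": "reset your vpn password in the self-service portal",
    "fin/expenses": "submit expenses through the finance tool",
    "poisoned": "vpn password vacation policy expenses onboarding <GLITCH>",
}
queries = [
    "how do I reset my vpn password",
    "what is the vacation policy",
    "how do I submit expenses",
]

share = affected_fraction(queries, corpus, "poisoned")
print(f"{share:.0%} of queries retrieve the poisoned document")
```

In this toy setup every query pulls in the poisoned page alongside the legitimate one, whereas a glitch token typed directly into a prompt would affect only the session of the user who typed it.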
Sources talking about the issue
What versions of hardware and software are you using?
Operating System: … Browser: …