ScottLogic / prompt-injection

Application which investigates defensive measures against prompt injection attacks on an LLM, with a focus on the exposure of external tools.
MIT License

Transformed blocked messages not showing correctly #744

Closed gsproston-scottlogic closed 8 months ago

gsproston-scottlogic commented 10 months ago

Bug report

Description

If a user message is both transformed and blocked, then the transformed message will not show at all after a refresh or after switching levels.

Reproduction steps

Steps to reproduce the behaviour:

  1. Go to sandbox mode
  2. Activate XML tagging
  3. Activate character limit defence
  4. Send a long chat message which will trigger the character limit defence
  5. Wait for the reply
  6. See that there's the original user message, the transformed user message, and the bot blocked message
  7. Refresh the page OR switch levels
  8. See that the transformed user message is no longer shown

Expected behaviour

The transformed message is still shown after the refresh or switching levels.

Screenshots

Before refreshing or switching levels:

image

After:

image

Additional context

Could be related to #741

Acceptance criteria

GIVEN a message has been transformed (e.g. the user activates XML tagging and sends a message)
AND the message has been blocked
WHEN the page is refreshed
THEN the transformed message is still shown (below the original message)

GIVEN a message has been transformed (e.g. the user activates XML tagging and sends a message)
AND the message has been blocked
WHEN the user switches to another level and then back to the original level
THEN the transformed message is still shown (below the original message)
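
To make the acceptance criteria concrete, here is a minimal sketch of the expected behaviour. It is not taken from the codebase: `HistoryEntry`, `recordExchange`, `loadHistory`, the in-memory `historyByLevel` store, and the example tag and block reason are all hypothetical names and values chosen for illustration. The issue does not state the root cause, so this only models the rule that a transformed message should be stored even when the message is also blocked.

```typescript
// Hypothetical model of per-level chat history. All names here are
// illustrative assumptions, not the project's actual types or functions.
type HistoryEntry =
  | { kind: "user"; text: string }
  | { kind: "userTransformed"; text: string }
  | { kind: "botBlocked"; reason: string };

// Stand-in for whatever store survives a page refresh or a level switch.
const historyByLevel = new Map<string, HistoryEntry[]>();

function recordExchange(
  level: string,
  original: string,
  transformed: string | null,
  blockedReason: string | null
): void {
  const history = historyByLevel.get(level) ?? [];
  history.push({ kind: "user", text: original });
  // Store the transformed message regardless of whether the message was blocked.
  if (transformed !== null) {
    history.push({ kind: "userTransformed", text: transformed });
  }
  if (blockedReason !== null) {
    history.push({ kind: "botBlocked", reason: blockedReason });
  }
  historyByLevel.set(level, history);
}

// Simulates what the UI reloads after a refresh or after switching back to a level.
function loadHistory(level: string): HistoryEntry[] {
  return [...(historyByLevel.get(level) ?? [])];
}

// A transformed AND blocked message in sandbox mode (placeholder tag and reason).
recordExchange(
  "sandbox",
  "a very long message",
  "<tagged>a very long message</tagged>",
  "character limit exceeded"
);

// Switch to another level (empty here), then back to the original level.
const otherLevelKinds = loadHistory("level1").map((e) => e.kind);
const sandboxKinds = loadHistory("sandbox").map((e) => e.kind);
```

With this model, the reloaded sandbox history contains the original message, then the transformed message, then the blocked message, which is the ordering both criteria describe.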

chriswilty commented 9 months ago

Think this is still a problem after #705

pmarsh-scottlogic commented 9 months ago

moving this to in review, because I've accidentally found and fixed it in #803