The `vigil-server.py` script is overloaded with configuration parsing and scanner setup. I want to abstract this away so the API is less complicated, and move towards Vigil being usable as a Python library instead of being reliant on the API server.

I've added a `vigil/vigil.py` script with a central `Vigil()` class. Now you just import that class and pass it a config file, and everything is handled. This also paves the way for creating more complex detection pipelines:
```python
from vigil.vigil import Vigil

vigil = Vigil.from_config('conf/openai.conf')

# scan an incoming prompt
scan1 = vigil.input_scanner.perform_scan(
    input_prompt="prompt goes here"
)

if 'Potential prompt injection detected' in scan1['messages']:
    take_some_action()

# scan a prompt/response pair
vigil.output_scanner.perform_scan(
    input_text="prompt goes here",
    input_resp="response goes here"
)
```
```python
# add a canary token to the prompt before sending it to the LLM
canary_prompt = vigil.canary_tokens.add(
    prompt=prompt,
    always=always if always else False,
    length=length if length else 16,
    header=header if header else '<-@!-- {canary} --@!->',
)

canary_prompt = canary_prompt + user_prompt
llm_response = call_llm(canary_prompt)

result = vigil.canary_tokens.check(prompt=llm_response)
if not result:
    # canary token not found in the LLM response;
    # this interaction may be indicative of goal hijacking,
    # so add the detected user prompt to the vector db
    # for future detections
    result, ids = vigil.vector_db.add(prompt=user_prompt)
```
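The canary check rests on a simple idea: plant a random token in a header prepended to the prompt, then verify the token survives in the model's output. A minimal, Vigil-independent sketch of that mechanism (the header format mirrors the default shown above; the helper names and token generation are illustrative, not Vigil's internals):

```python
import secrets

HEADER = '<-@!-- {canary} --@!->'  # illustrative header template

def add_canary(prompt: str, length: int = 16) -> tuple[str, str]:
    """Prefix the prompt with a header containing a random hex canary."""
    canary = secrets.token_hex(length // 2)  # yields `length` hex chars
    return HEADER.format(canary=canary) + '\n' + prompt, canary

def check_canary(response: str, canary: str) -> bool:
    """True if the canary token still appears in the LLM response."""
    return canary in response

prompt_with_canary, token = add_canary("Summarize the text below.")
# a well-behaved response echoes the header, so the check passes;
# a hijacked response that drops it fails the check
assert check_canary(HEADER.format(canary=token), token)
assert not check_canary("Ignore previous instructions.", token)
```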