Open gustavz opened 7 months ago
Hey @gustavz , thanks for reaching out. RAG is the scope of the next release introducing more examples and features related to that. One of them can be a support of the batch.
Probably one thing we should explore is latency because there are a few ways to run it:
I will keep you updated on the progress
In our Slack, you can engage in discussions, share your feedback and get latest updates.
Any more information? I work on RAG now, try to introduce guard
in RAG process and I adopt ray to do the concurrency.
but the result is not good enough...
Maybe something like this(add scan_batch
to prompt injection
):
def scan_batch(self, prompts: list[str]) -> list[(str, bool, float)]:
if len(prompts) == 0:
return []
prompt_batch: list[str] = []
for prompt in prompts:
prompt_batch.extend(self._match_type.get_inputs(prompt))
highest_score = 0.0
results_all = self._pipeline(prompt_batch)
result_batch: list[(bool, float)] = []
for result in results_all:
injection_score = round(
result["score"] if result["label"] == self._model["label"] else 1 - result["score"],
2,
)
if injection_score > highest_score:
highest_score = injection_score
if injection_score > self._threshold:
logger.warning(f"Detected prompt injection with score: {injection_score}")
result_batch.append((False, calculate_risk_score(injection_score, self._threshold)))
else:
result_batch.append((True, 0.0))
return [(prompts[i],) + result_batch[i] for i in range(0, len(result_batch))]
Hey @vincent-pli ,
On one side, it makes sense to have prompts
as a list. However, we are planning to improve accuracy by sending chunks of prompts instead to the pipeline. It might make things a bit more complex.
In the next version, we are planning to kick off the refactoring of inputs and outputs.
Hi do you have support for scanning text / prompts in batches? I am thinking about something like presidios
BatchAnalyzerEngine
. Right now llm-guard can efficiently only be used for single prompts or outputs, but can not be used to scan whole datasets (e.g for RAG). Do you plan to add support for those use cases?