sxflynn / TeacherGPT

A proposed GPT chatbot for teachers that uses retrieval-augmentation to answer questions about their students.
MIT License

Create an obfuscation agent to protect sensitive student PII #11

Open sxflynn opened 4 months ago

sxflynn commented 4 months ago

In production, no real student names or other PII should be sent to third-party LLM API endpoints. Functionality needs to be built that detects real names, maps each one to a Faker-generated name and substitutes it in the prompt, then filters the LLM response and uses the map to re-insert the real names into the locally rendered response.

Here is a simplified workflow of what happens now and what should happen.

Now

  1. User: "What homework does Johnny need to complete tonight?"
  2. Application: Search database for students with the first name Johnny -> returns Johnny Jones and all info
  3. Application: Search homework API for Johnny Jones's homework assignments -> Returns Math and Reading assignment
  4. Application: Prompt LLM to answer the question and provide all of Johnny Jones's context.
  5. LLM Server -> Johnny Jones has Math and Reading homework tonight

Proposed enhancement

  1. User: "What homework does Johnny need to complete tonight?"
  2. Application: Search database for students with the first name Johnny -> returns Johnny Jones and all info
  3. Application: Search homework API for Johnny Jones's homework assignments -> Returns Math and Reading assignment
  4. Application: Map Johnny Jones to Faker() created name Matthew McReed.
  5. Application: Prompt LLM to answer the question and provide all of Matthew McReed's context.
  6. LLM Server -> Matthew McReed has Math and Reading homework tonight
  7. Application: Use the map to replace Matthew McReed with Johnny Jones in the response -> Johnny Jones has Math and Reading homework tonight
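
The name swap in steps 4 through 7 could be sketched roughly as follows. This is only an illustration under assumptions: the helper names `pseudonymize` and `restore` are hypothetical, the name map is hard-coded instead of being filled by Faker, and the LLM response is simulated rather than fetched from a real endpoint.

```python
import re
from typing import Dict


def pseudonymize(text: str, name_map: Dict[str, str]) -> str:
    """Replace every key of name_map found in text with its mapped value."""
    if not name_map:
        return text
    # Try longer names first so a full name wins over a shorter overlapping one.
    names = sorted(name_map, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda match: name_map[match.group(0)], text)


def restore(text: str, name_map: Dict[str, str]) -> str:
    """Invert the map and put the real names back into the LLM response."""
    fake_to_real = {fake: real for real, fake in name_map.items()}
    return pseudonymize(text, fake_to_real)


# Step 4: map the real name to a stand-in (hard-coded here, Faker in the proposal).
name_map = {"Johnny Jones": "Matthew McReed"}

# Step 5: the prompt that leaves the application contains only the stand-in.
prompt = (
    "Johnny Jones has Math and Reading assignments. "
    "What homework does Johnny Jones need to complete tonight?"
)
outbound_prompt = pseudonymize(prompt, name_map)

# Step 6: simulated LLM response (assumption, no real API call is made).
llm_response = "Matthew McReed has Math and Reading homework tonight"

# Step 7: the real name is re-inserted before rendering locally.
rendered = restore(llm_response, name_map)
print(outbound_prompt)
print(rendered)
```

A plain string map like this only catches exact matches of the full name, so nicknames or a first name used on its own would still need separate handling.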

A middleware-style class needs to sit between the server retrieval logic and the LLM call. It should detect potential PII with an NLP model such as spaCy, map each piece of PII to a Faker-generated name, and then swap the Faker names back to the original PII once the response is returned from the LLM server.
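
One possible shape for such a class is sketched below. Everything here is an assumption made for illustration: the class name `ObfuscationMiddleware`, its `ask` method, and the three injected callables are hypothetical, and the detector, name generator, and LLM client are standard-library stubs standing in for spaCy, Faker, and the real API call.

```python
import re
from typing import Callable, Dict, List


class ObfuscationMiddleware:
    """Hypothetical wrapper that keeps real names out of an LLM call."""

    def __init__(
        self,
        detect_names: Callable[[str], List[str]],  # stand-in for an NER model
        make_fake_name: Callable[[], str],         # stand-in for a Faker generator
        call_llm: Callable[[str], str],            # stand-in for the LLM client
    ) -> None:
        self.detect_names = detect_names
        self.make_fake_name = make_fake_name
        self.call_llm = call_llm
        self.real_to_fake: Dict[str, str] = {}

    @staticmethod
    def _swap(text: str, mapping: Dict[str, str]) -> str:
        if not mapping:
            return text
        names = sorted(mapping, key=len, reverse=True)
        pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")
        return pattern.sub(lambda match: mapping[match.group(0)], text)

    def ask(self, prompt: str) -> str:
        # Assign a fake name to each newly detected real name.
        for name in self.detect_names(prompt):
            if name not in self.real_to_fake:
                self.real_to_fake[name] = self.make_fake_name()
        # Only the obfuscated prompt is passed to the LLM callable.
        response = self.call_llm(self._swap(prompt, self.real_to_fake))
        # Re-insert the real names into the response before returning it.
        fake_to_real = {fake: real for real, fake in self.real_to_fake.items()}
        return self._swap(response, fake_to_real)


# --- Stubs used only for this sketch ---
ROSTER = ["Johnny Jones"]
fake_pool = iter(["Matthew McReed"])
seen_by_llm: List[str] = []


def roster_detector(text: str) -> List[str]:
    return [name for name in ROSTER if name in text]


def stub_llm(prompt: str) -> str:
    seen_by_llm.append(prompt)  # record what would have left the application
    return "Matthew McReed has Math and Reading homework tonight"


middleware = ObfuscationMiddleware(roster_detector, lambda: next(fake_pool), stub_llm)
answer = middleware.ask(
    "Johnny Jones has Math and Reading assignments. "
    "What homework does Johnny Jones need to complete tonight?"
)
print(answer)
```

Injecting the three callables keeps the class testable without network access. In the real application the detector might wrap a spaCy pipeline and keep entities labeled `PERSON`, and `make_fake_name` might be `Faker().name`, though neither choice is settled in this issue. A production version would also need to ensure that a generated fake name never collides with a real student's name.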