himynamesdave opened this issue 2 weeks ago
txt2stix compares the relationships (JSON files) returned by all models: it only keeps relationships found by >= 2 models (when >= 2 models are specified), or by 1 model (when only 1 model is specified).
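That consensus step could be sketched roughly like this. A minimal sketch, assuming relationships can be reduced to `(source, relationship_type, target)` tuples; txt2stix's real JSON structure may differ:

```python
from collections import Counter

def consensus_relationships(model_outputs: list[list[tuple]]) -> list[tuple]:
    """Keep relationships reported by >= 2 models,
    or by 1 model if only one model was specified."""
    threshold = 2 if len(model_outputs) >= 2 else 1
    counts = Counter()
    for output in model_outputs:
        counts.update(set(output))  # de-duplicate within a single model
    return [rel for rel, n in counts.items() if n >= threshold]
```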
What happens if 2 models return the same pair but with different `relationship_type`s?
Good question.
We don't want this process to be too rigid.
Maybe we need to introduce more steps to resolve doubt, e.g. where there is no consensus, go back and check with all models.
e.g.
I have another analyst arguing that <XXX> is actually a <XXX> relationship_type.
I am not sure who is correct.
Can you please review the text again and confirm your choice.
The above could be good for extractions too, e.g. when only one model reports an extraction, you can check whether the other models missed it.
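The re-check prompt above could be templated. A hypothetical helper (the wording follows the example in this thread; the function name and signature are made up):

```python
def recheck_prompt(source: str, target: str, type_a: str, type_b: str) -> str:
    """Build a prompt asking a model to re-confirm a disputed relationship_type."""
    return (
        f"I have another analyst arguing that the relationship between "
        f"{source} and {target} is actually '{type_b}', not '{type_a}'. "
        "I am not sure who is correct. "
        "Can you please review the text again, and confirm your choice."
    )
```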
Do you have any suggestions @fqrious ?
You know every subsequent call to the API costs the tokens already used plus the current prompt's tokens (used_tokens + current_prompt_tokens).
When you reply to ChatGPT (or any LLM), you're actually sending your entire chat history back to it.
What this means is that if there are 100 of these checks, it will, at the very least, use 100x as many tokens as if we just queried once.
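The growth is actually worse than linear, because each follow-up resends the whole history. A quick illustration (reply tokens ignored for simplicity):

```python
def total_tokens(prompt_sizes: list[int]) -> int:
    """Total prompt tokens sent when every call resends the full history:
    call i costs sum(prompt_sizes[:i+1]) tokens."""
    history = 0
    total = 0
    for size in prompt_sizes:
        history += size   # history grows with every message
        total += history  # each call pays for the whole history so far
    return total

# 100 follow-ups of 50 tokens each cost 50 * (1 + 2 + ... + 100) = 252,500
# tokens, versus 5,000 for 100 independent single-shot queries.
```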
Oh, I didn't realise that! Maybe we cap it at a maximum of 4 checks: the first prompt for extractions (then a check), then the first prompt for relationships (then a check)?
We could also expose some parameters (e.g. temperature) for the user to tweak the responses.
This is a nice article in the security domain we might be able to take some learnings from?
We currently support OpenAI.
We also only wait for one response and use that.
We can try and tune out hallucinations, but the reality is we really need to get the AIs to "check each other's work".
We should also add support for
Users can set the API keys in the env file, as they do for OpenAI now.
We should also remove the model from the env file, and let the user pass it as a flag:

`--ai_models`

which accepts a comma-separated list, e.g. `openai:gpt-4o`, `openai:gpt-4o-mini`, `gemini-1_5-flash` ... The user must input at least one, but can add as many as they want.
The more they add, the more accurate the output should become, as it will be an amalgamation of data returned from multiple models.
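Parsing that flag value could look something like this. A sketch only: the `--ai_models` name comes from this thread, but the `provider:model` splitting logic is an assumption:

```python
def parse_ai_models(value: str) -> list[tuple[str, str]]:
    """Split an --ai_models value like 'openai:gpt-4o,openai:gpt-4o-mini'
    into (provider, model) pairs. At least one entry is required."""
    pairs = []
    for item in value.split(","):
        provider, _, model = item.strip().partition(":")
        if not model:
            raise ValueError(f"expected provider:model, got {item!r}")
        pairs.append((provider, model))
    if not pairs:
        raise ValueError("at least one model is required")
    return pairs
```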
The pipeline should work like this:
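Pieced together from this thread, the flow might be sketched as below (capped at 4 LLM rounds: extractions, check, relationships, check). The `extract()`/`relate()` model interface is hypothetical:

```python
from collections import Counter

def run_pipeline(text: str, models: list) -> list:
    """Prompt every model, then keep only relationships reported by
    >= 2 models (or 1 if only one model was specified).
    The re-check rounds are omitted here for brevity."""
    threshold = 2 if len(models) >= 2 else 1

    extraction_sets = [set(m.extract(text)) for m in models]    # round 1
    relationship_sets = [set(m.relate(text)) for m in models]   # round 3

    counts = Counter()
    for rels in relationship_sets:
        counts.update(rels)
    return [rel for rel, n in counts.items() if n >= threshold]
```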
We should also consider enhancements made since we built this, such as OpenAI structured outputs:
https://openai.com/index/introducing-structured-outputs-in-the-api/