Closed siddhartha-gadgil closed 1 year ago
@0art0 The keywords need to be regenerated on the branch defdocs
to include "kind" from data/safe-prompts.json
on the same branch.
There is regression in some cases, probably because theorem types have more information than definition types. Before using this we may wish to have a penalty for kind not being "theorem"
Using both pairs and triples is now part of the code: de73787aa5aad4915f So only very routine stuff is needed before merging:
The routine stuff is done in 343e2b5. Only testing remains before merging
A lot has changed since the issue was made with the switch to OpenAI embeddings making a lot of the above obsolete. When we next test we should also test with mixed collection of prompts.
With OpenAI prompts it may be best to add to the loaded (and pickled) array a Boolean whether something is a Prop, and compute the embeddings for everything. Then the distance can give a penalty for not being a Prop, or simply exclude non-props.
docprompts.sc
we should not filter by "kind" but make this a field.parsed_thms.txt
usingdatagen
.elabdatagen
updateelab_prompts.txt
safe_prompts.json
.