Open luan-xiaokun opened 8 months ago
Hi @luan-xiaokun, thank you for your attention to our work! Indeed all the vector stores are pre-initialized and stored in the data/initialize_skills/skill
directory. To customize the initial skill library, you may check the code in lego_prover/env/chroma_worker.py
.
Here is a simple example:
chroma_worker = ChromaWorker("path/to/save/library", resume=False) # set resume equals False will initialize a empty library
data = "Your new problem statement"
error, output = chroma_worker.valid_problem_add_text(data)
After the vector store is built, you may replace the ../../data/initialize_skills/skill
in line 40 with your own path/to/save/library
Hi @wiio12, thanks for your help! I followed your instructions, here is what I did and what I found:
lego_prover/env/chroma_worker.py
to ensure that we are building an empty skill library from scratch
https://github.com/wiio12/LEGO-Prover/blob/357672c7751cd0c84aff6bf72a3d1bf97614e81d/lego_prover/env/chroma_worker.py#L35-L40lego_prover/env/chroma_worker.py
to the following, where true_skill.json
only contains a trivial fact that True
holds.
if __name__ == "__main__":
ckpt_path = "my_checkpoints"
resume = False
chroma_worker = ChromaWorker(ckpt_path, resume)
print("Chroma worker is ready.")
with open("../../my_checkpoints/true_skill.json") as json_file:
action, data = json.load(json_file)
assert action == "problem_add_text"
error, output = chroma_worker.valid_problem_add_text(data)
print("error:", error)
print("output:", output)
my_checkpoints
.lego_prover/env/chroma_worker.py
to my checkpoints folder, so that the Chroma worker will use my custom skill library when proving things.After that, I found that my custom skill library located in my_checkpoints/skill
does not have a skills.json
file, resulting in an index out of range error in the evolver process (skill_manager.skills
is an empty dictionary since I don't have a skills.json
file in my_checkpoints/skill
) LEGO seems to be working despite the error, but I'm not sure if the Chroma worker is still working.
https://github.com/wiio12/LEGO-Prover/blob/357672c7751cd0c84aff6bf72a3d1bf97614e81d/lego_prover/evolver.py#L321-L326
If I copy the skills.json
file in data/initialize_skills/skill
to my skill library, it will complain that vectordb is not synced with skill.json, so it seems like that to create my own custom skill library, I still need to craft a skills.json
file, correct?
Hi @luan-xiaokun, what you have done is correct, and there are two ways of making this work:
As you said, you need to craft a skill.json
file, with some basic skills. following the format of the existing skills.json
file. I am not sure if only doing this will eliminate all the errors since other things in the skill library are not initialized.
A simpler way I can think of is this: since you only want to change the problem to solve with LEGO-Prover, you can only replace the valid_problem_vectordb
in data/initialize_skills/skill
with valid_problem_vectordb
you have crafted in my_checkpoints
. This way you won't need to craft the skills.json
file.
I might provide a script to craft the initial skill library (hopefully by the end of this week).
Thanks for your quick response @wiio12! Looking forward to your skill lib initialization script!
Hi, thanks for open releasing the code of LEGO-Prover. I notice that the problem vector store is initialized with miniF2F problem statements and is divided into two parts: test and validation. Does this mean if I want to use LEGO-Prover to prove things for my own project, I need to customize the problem vector store accordingly?
I tried replacing the validation set of miniF2F with my own samples and running LEGO-Prover with the miniF2F problem vector store, but the evolver complains that it cannot find the retrieved problem statement (which is a miniF2F problem of course). Can you describe how to customize the initial problem vector store? Thanks!