dwhalen / holophrasm

Holophrasm: a neural Automated Theorem Prover for higher-order logic
MIT License
51 stars 13 forks source link

Holophrasm: a purely neural automated theorem prover for Metamath https://arxiv.org/abs/1608.02644

To use the program, start by downloading the release, which includes the binary files for the language model and the trained model weights. Then, from the root directory run

python run_script_ordered.py python write_all_proofs.py

run_script_ordered.py will attempt to search for proofs for all the Metamath propositions in the test set, starting with the (expected) easiest proofs first. Reasonable parameters for running the script are: timeout: 5 num_passes: 10000 search beam width: 20 hyp bonus: 0 Proofs are saved to searcher/proofs as soon as they are found. The proofs that I found during my search are in searcher/proofs_baseline.

write_all_proofs.py will generate a modified_set.mm module file, which consists of the proofs from set.mm, with the found proofs replaced. This can then be fed into Metamath and verified using the commands

READ modified_set.mm VERIFY PROOF *

There's a known bug involving dtrucor ignoring the disjointness conditions, which has been fixed with a temporary hack. In case the bug applies more generally, you should not assume any proof is valid until metamath verifies modified_set.mm.

#############################################

If you want to train from scratch, you can do the following. I recommend having at least 64GB of RAM.

Step 0: (OS X only) deal with Accelerate and forking bug ######### export VECLIB_MAXIMUM_THREADS=1 #########

Step 1: train gen and pred. Neither of the training scripts exit automatically, so you will need to keyboard interrupt them at some point, preferably after the training rate decays to a sufficiently low value. All of these scripts may be run in parallel. ######### python script_gen.py # Train the generative network python script_pred.py # Train the prediction network python save_language_model.py # Save a copy of the language model somewhere #########

Step 2: set up the files for the nn interface ######### mkdir searcher cp weights/gen/train.parameters searcher/gen.parameters cp weights/gen/train.weights searcher/gen.weights cp weights/pred/train.parameters searcher/pred.parameters cp weights/pred/train.weights searcher/pred.weights #########

Step 3: construct the payout data set and train the payout network The first script takes days, but is highly parallelizable. Chunks may be run on different machines, and multiprocessing can be enabled by changing the "withpool.Pool(None)" to "withpool.Pool(n)", where n is the desired number of processes. WARNING: multiprocessing support may crash Windows machines. WARNING: if generate_payout_data_script.py does not run correctly, it may leave files payoutdata* of size 0. These need to be deleted before rerunning the script. ######### python generate_payout_data_script.py python script_payout.py cp weights/pred/train.parameters searcher/payout.parameters cp weights/pred/train.weights searcher/payout.weights #########

Step 5: run the proof search (hyp bonus should be set to 0.0 for consistency with the paper) The search will be much faster if you set "multi=True" in the searcher.run call, but this may break on Windows. ######### python run_script_ordered.py #########

Step 6: verify the proofs ######### PYTHON CODE import write_proof write_proof.reset() write_proof.write_all_known_proofs() #########

Step 7: copy the modified_set.mm into the metamath folder and use the following commands to verify the proofs ######### METAMATH COMMANDS ERASE READ modified_set.mm VERIFY PROOF * #########