wenkokke opened this issue 4 years ago
Alex, do you think you could take a look?
@wenkokke, in the meantime, you might try the (currently undocumented) CLI @bagnalla built:

```
cd NNCert
python3 cli.py
```

The CLI's dependencies are listed in `NNCert/cli_deps.txt` (though without version numbers; perhaps @bagnalla can add them).
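Until the version numbers are added, a pinned dependencies file could look like the sketch below. The pins are an assumption inferred from the pip installs reported later in this thread, not the actual contents of `cli_deps.txt`:

```
tensorflow==1.6.1
scikit-learn
gitpython
inquirer
```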
Using `cli.py` works for me, with the following steps:

```
conda create --name NNCert python=3.5
conda activate NNCert
pip install tensorflow==1.6.1
pip install sklearn
pip install gitpython
pip install inquirer
python cli.py
```

(from `MLCert/NNCert/`)

It was initially rather unclear to me that the output of `cli.py` was asking me to make a choice, rather than giving feedback on progress.
I'm also having some trouble with `coq_makefile`, as it seems like it doesn't process the files in the correct order w.r.t. dependencies? It seems to succeed on the second run. Will report back when it finishes.

```
Extracting model...
Makefile:17: Makefile.conf: No such file or directory
COQDEP VFILES
coq_makefile -R ../ MLCert -arg '-w none' net.v out.v bitnet.v kernel.v print.v config.v -o Makefile
Warning: ../ (used in -R or -Q) is not a subdirectory of the current directory
COQC net.v
COQC out.v
File "./out.v", line 8, characters 19-25:
Error: Unable to locate library bitnet.
make[1]: *** [out.vo] Error 1
make: *** [all] Error 2
```
```
File "./empiricalloss.v", line 14, characters 19-25:
Error: Unable to locate library bitnet.
```
Extracting a Coq model for EMNIST, with 10,000 nodes in the hidden layer, does not seem to be practical. Extraction ran for two days, and memory usage went up to 42GB, before I cancelled it. If you have had the same experience, @ejgallego has offered to help you optimise the generated Coq code, and has made some suggestions:
> Thanks for the details, after a quick peek at MLCert I think it should be possible to optimize memory consumption by an order of magnitude. I'd be happy to help with it, of course, if you folks think it'd be worth the effort.
>
> Main things to try would be: use a native float and vector representation, profile Coq extraction, and see where it is not being eager enough.
I'm guessing that extraction might not be your priority, however, as the networks imported to Coq can already be run using the TensorFlow architecture, and extracting the verified code to Haskell may not offer any additional benefits.
Thanks for your help, Wen! It would certainly be interesting to do the port to native vectors and floats (Coq 8.11), as suggested by @ejgallego. We're not actively working on MLCert at the moment, however, so I can't promise this will be done soon (or at all).
Hi folks, indeed I believe it should be possible to improve the efficiency of extraction quite a bit; just ping us or open an issue at https://github.com/coq/coq/issues and we'll be happy to try to help.
I'm currently trying to replicate the extraction from an EMNIST network to MLCert. This is as far as I've gotten trying to get the code in NNCert to work, patching things up as I go:

```
conda create --name NNCert python=3.5
conda activate NNCert
pip install tensorflow==1.6.1
```

- In `MLCert/NNCert/tf/extract_emnist.py`: changed `emnist_load_data()` to `emnist_load_data(<path/to/emnist>)`, with the EMNIST data in `<path/to/emnist>/emnist`
- Ran `python extract_emnist.py` (from `MLCert/NNCert/tf`)
- Following `make_dataset` in `MLCert/NNCert/tf/dataset_params.py`, in `MLCert/NNCert/tf/pca.py`: changed `_, _, _, _, _, load_data = ..` to `_, load_data = ..`
- Ran `python pca.py` (from `MLCert/NNCert/tf`)
- Renamed `<path/to/emnist>/emnist/{train,test,validation}_reduced.pkl` to `<path/to/emnist>/emnist/{train,test,validation}_pca.pkl`
- In `MLCert/NNCert/tf/Makefile`: changed `train.py` to `main.py`, and `eval.py` to `main.py`
- In `MLCert/NNCert/tf/main.py`: changed `hidden_sizes = map(..)` to `hidden_sizes = list(map(..))`
- Ran `make` (from `MLCert/NNCert/tf`)

:sob::sob::sob:
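For what it's worth, the two Python-level patches above come down to generic Python 3 behaviour rather than anything NNCert-specific. A minimal sketch, using toy stand-in values rather than the actual NNCert code:

```python
# 1. In Python 3, map() returns a lazy iterator, not a list, so code
#    that indexes or takes len() of the result needs an explicit list():
hidden_sizes = list(map(int, ["10", "20"]))
print(hidden_sizes[0])  # 10 -- indexing works on the list

# 2. Tuple unpacking must match the callee's return arity exactly.
#    make_dataset here is a toy stand-in: it returns two values, so
#    unpacking six of them raises a ValueError.
def make_dataset():
    return "params", "load_data"

try:
    _, _, _, _, _, load_data = make_dataset()
except ValueError as e:
    print(e)  # not enough values to unpack (expected 6, got 2)

_, load_data = make_dataset()  # matches the actual return arity
print(load_data)  # load_data
```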