Add benchmark scripts for `pyjuice`, with settings for both `torch.compile` enabled and disabled. Currently, only HCLT is benchmarked, across different latent sizes.
The reported "compile" time currently measures the first pass. The time for building the circuit is not yet measured, because structure learning is coupled with it in the same procedure. We NEED to decide how to design this.
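For reference, the "first pass as compile time" measurement can be sketched as below. This is a minimal illustration, not the benchmark script itself: `time_first_and_steady`, `LazyModel`, and the `sync` parameter are hypothetical names, and `LazyModel` only stands in for a model that pays a one-time cost on its first call. For a GPU run, one would presumably pass `torch.cuda.synchronize` as `sync` so that asynchronous kernels are included in the timing.

```python
import time


def time_first_and_steady(fn, *args, n_steady=5, sync=None):
    """Return (first_call_seconds, mean_steady_seconds) for fn(*args).

    `sync` is an optional zero-argument callable run after each call,
    e.g. a device synchronization function (assumption, not required).
    """
    def timed_call():
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()
        return time.perf_counter() - start

    # The first call includes any one-time cost (here: the "compile" time).
    first = timed_call()
    # Later calls approximate the steady-state cost per pass.
    steady = [timed_call() for _ in range(n_steady)]
    return first, sum(steady) / len(steady)


class LazyModel:
    """Hypothetical stand-in: expensive one-time setup on the first call."""

    def __init__(self):
        self.ready = False
        self.calls = 0

    def __call__(self, x):
        if not self.ready:
            time.sleep(0.05)  # simulated one-time compilation cost
            self.ready = True
        self.calls += 1
        return x * 2


model = LazyModel()
first, steady = time_first_and_steady(model, 3, n_steady=5)
print(f"first pass: {first:.4f}s, steady state: {steady:.6f}s")
```

Subtracting the steady-state time from the first-pass time gives a rough estimate of the one-time overhead alone, which may be a more useful number to report than the raw first pass.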
Another TODO is to double-check memory consumption, to make sure garbage collection behaves appropriately.
Also, global compilation does not work at the moment because of the complicated backward procedures and hooks. (Assuming there is NO solution for this.)
Closes #2