Hi authors,
Thank you for publishing this impressive work along with the source code. I am trying to run the latency benchmark by following the instructions in the README, but I am running into errors.
I found that DejaVu/Dejavu/benchmarks/benchmark_generation_opt.py loads the model parameters from a single file named "full.pt", which differs from the accuracy benchmark, which uses files such as pytorch_0.pt, pytorch_1.pt, and so on.
After completing the accuracy benchmark (perplexity on C4), I could not find the "full.pt" file. Is there an instruction for generating "full.pt" that I may have missed? Any guidance would be greatly appreciated.