flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving
https://flashinfer.ai
Apache License 2.0

feat: warmup for jit kernel tests #629

Closed yzh119 closed 4 days ago

yzh119 commented 4 days ago

Unit tests are currently slow in FlashInfer's JIT mode because each kernel is compiled only the first time it runs; that compilation is blocking, and multiple ops are not compiled in parallel. This PR adds a warmup pre-hook to the kernel unit tests so that all necessary kernels are compiled before the tests run in JIT mode, which greatly speeds up the test suite.
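As a rough illustration of the warmup idea (not the code in this PR), the sketch below builds a list of (func, args) specs concurrently before any test would run. `fake_compile` and `warmup` are hypothetical names, and the sleep stands in for a real compiler invocation; in a pytest suite something like this would presumably be called from a fixture that runs before the tests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_compile(name, delay=0.05):
    # Hypothetical stand-in for a blocking JIT build step.
    time.sleep(delay)
    return f"{name}.so"


def warmup(specs, max_workers=4):
    """Run every (func, args) spec concurrently; return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(func, *args) for func, args in specs]
        # result() blocks until the build finishes and re-raises any build error.
        return [future.result() for future in futures]


specs = [(fake_compile, (f"kernel_{i}",)) for i in range(4)]
start = time.perf_counter()
modules = warmup(specs)
elapsed = time.perf_counter() - start
print(modules, f"built in {elapsed:.2f}s")
```

Because the builds overlap, the total wall time is closer to the slowest single build than to the sum of all builds, which is where the speedup for the test suite would come from.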

This PR also fixes several issues with #628:

  1. use os.makedirs(..., exist_ok=True), which is safe when several threads create the same directory, instead of relying on a prior os.path.exists check
  2. change the signature of parallel_load_modules to take a list of (jit_module_creation_func, args) tuples instead of lambda functions, because a lambda captures variables by reference rather than by value (late binding), which can cause unexpected errors.
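Both fixes can be reproduced in a few lines of plain Python. This is a minimal sketch with hypothetical names (`make_module`, the `cache/ops` path), not the FlashInfer code: it shows lambdas created in a loop all seeing the final loop value, (func, args) tuples preserving each value, and `os.makedirs` tolerating a directory that already exists.

```python
import os
import tempfile

names = ["a", "b", "c"]

# Late binding: each lambda looks up `name` when it is called, not when it is
# created, so every call sees the last value the loop variable took.
lambdas = [lambda: name for name in names]
late = [f() for f in lambdas]


def make_module(name):
    # Hypothetical stand-in for a JIT module creation function.
    return name


# Storing (func, args) evaluates the argument when the tuple is built.
specs = [(make_module, (name,)) for name in names]
eager = [func(*args) for func, args in specs]

print(late)   # ['c', 'c', 'c']
print(eager)  # ['a', 'b', 'c']

# exist_ok=True removes the check-then-create race of os.path.exists + makedirs:
# a second creator (another thread or process) does not raise FileExistsError.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "cache", "ops")
    os.makedirs(target, exist_ok=True)
    os.makedirs(target, exist_ok=True)  # already exists, no error
    created = os.path.isdir(target)
```

Binding the value with a default argument (`lambda name=name: ...`) or `functools.partial` would also avoid the late-binding problem; passing explicit tuples simply keeps the arguments visible to the caller.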