neuralmagic / nm-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://nm-vllm.readthedocs.io

[ CI ] Fan Out Strategy #325

Closed robertgshaw2-neuralmagic closed 3 months ago

robertgshaw2-neuralmagic commented 4 months ago

SUMMARY:

ADVANTAGES:

DISADVANTAGES:

FOLLOW UP PR:

dbarbuzzi commented 4 months ago

Can we update the test job’s name property to include something dynamic that is specific to each instance (e.g., inputs.test_directory), so the jobs can be told apart in the GitHub UI list? This has to happen at the job level in the last workflow that is called; for example, the TEST job in .github/workflows/nm-test.yml could have something like name: TEST (${{ inputs.test_directory }}).
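
A minimal sketch of that naming suggestion follows. Only the file path, the TEST job ID, inputs.test_directory, and the name expression come from the comment above; the input declaration details, the runner label, and the test command are placeholders assumed for illustration and may not match the real workflow.

```yaml
# .github/workflows/nm-test.yml (sketch only; everything except the TEST job,
# inputs.test_directory, and the name expression is an assumption)
on:
  workflow_call:
    inputs:
      test_directory:
        description: "Directory of tests this instance runs"
        type: string
        required: true

jobs:
  TEST:
    # Evaluated per call, so each fanned-out instance gets its own label
    name: TEST (${{ inputs.test_directory }})
    runs-on: ubuntu-latest  # placeholder runner label
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: pytest "${{ inputs.test_directory }}"  # placeholder test command
```

Because the name is set on the job inside the called workflow, every caller that passes a different test_directory should show up as a distinct entry in the job list.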

Also, they all have separate test runs in Testmo; is that the desired result, or would we want to maintain the previous behavior of having them consolidated into a single run? If using a single run, we could still submit results individually since we're already submitting results as threads, which is appropriate for the new approach.
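
One way the single-run option could look is sketched below, assuming the Testmo CLI's automation:run:create, automation:run:submit-thread, and automation:run:complete commands. The job names, secret names, project ID, run name, matrix entries, and result paths are all hypothetical, and the real fan-out may pass the run ID through reusable-workflow inputs rather than a matrix.

```yaml
# Sketch only: one Testmo run shared by all fanned-out test jobs.
# Names, secrets, project ID, and paths below are assumptions.
name: testmo-single-run-sketch
on: workflow_dispatch

env:
  TESTMO_URL: ${{ secrets.TESTMO_URL }}      # hypothetical secret name
  TESTMO_TOKEN: ${{ secrets.TESTMO_TOKEN }}  # hypothetical secret name

jobs:
  testmo-create:
    runs-on: ubuntu-latest
    outputs:
      run_id: ${{ steps.create.outputs.run_id }}
    steps:
      - run: npm install --no-save @testmo/testmo-cli
      - id: create
        # Create the run once and expose its ID to the downstream jobs
        run: |
          RUN_ID=$(npx testmo automation:run:create \
            --instance "$TESTMO_URL" \
            --project-id 1 \
            --name "nm-vllm tests" \
            --source "github-actions")
          echo "run_id=$RUN_ID" >> "$GITHUB_OUTPUT"

  test:
    needs: testmo-create
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        test_directory: [tests/example_a, tests/example_b]  # placeholders
    steps:
      - uses: actions/checkout@v4
      - run: npm install --no-save @testmo/testmo-cli
      # ... run the tests here and write JUnit XML into results/ ...
      - name: Submit this job's results as one thread of the shared run
        if: always()
        run: |
          npx testmo automation:run:submit-thread \
            --instance "$TESTMO_URL" \
            --run-id "${{ needs.testmo-create.outputs.run_id }}" \
            --results results/*.xml

  testmo-complete:
    needs: [testmo-create, test]
    # Close the run even if some test jobs failed
    if: ${{ always() && needs.testmo-create.result == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: npm install --no-save @testmo/testmo-cli
      - run: |
          npx testmo automation:run:complete \
            --instance "$TESTMO_URL" \
            --run-id "${{ needs.testmo-create.outputs.run_id }}"
```

The create and complete steps bracket the fan-out, so each test job only has to submit its own thread; this keeps the per-job submission that already exists while consolidating everything under one run.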

robertgshaw2-neuralmagic commented 4 months ago

@dbarbuzzi

It would be better if these could all be part of a single run. Ideally, we would also add the lm-eval tests to that run, since they are not currently tracked in Testmo at all. Is this something you could take on?

I think we should do this as part of a separate PR, though.