microsoft / hummingbird

Hummingbird compiles trained ML models into tensor computation for faster inference.

Performance benchmark as part of CI/CD #271

Open scnakandala opened 4 years ago

scnakandala commented 4 years ago

I was wondering whether we should have some kind of performance-based testing as part of the CI/CD pipelines. Any thoughts?

interesaaat commented 4 years ago

I was thinking about something similar today. It would be good to have a benchmark refresh every time a new commit goes in. Perhaps we can store the benchmark numbers in the same location as the documentation.

The only problem I see is that if we add benchmarks to the CI/CD, the pipelines will take forever. Maybe instead of the full benchmark we have in the paper, we could run just some performance tests as you are suggesting, like a couple of tests per operator or something like that.
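For the sake of discussion, a per-operator test could look something like the sketch below. Everything here is illustrative rather than settled: the model, data sizes, and the `MAX_PREDICT_SECONDS` threshold are placeholders that would need calibration on the CI machines.

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestClassifier

from hummingbird.ml import convert

# Illustrative threshold only; a real test would calibrate this per CI runner.
MAX_PREDICT_SECONDS = 1.0


def test_random_forest_predict_latency():
    # Small synthetic dataset so the test stays fast in CI.
    X = np.random.rand(10000, 28).astype(np.float32)
    y = np.random.randint(2, size=10000)
    skl_model = RandomForestClassifier(n_estimators=10, max_depth=8).fit(X, y)

    # Convert with Hummingbird and time a single batch prediction.
    hb_model = convert(skl_model, "pytorch")
    start = time.perf_counter()
    hb_model.predict(X)
    elapsed = time.perf_counter() - start

    assert elapsed < MAX_PREDICT_SECONDS, f"predict took {elapsed:.3f}s"
```

The same pattern would repeat for a handful of operators (tree ensembles, linear models, etc.), keeping each test small so the pipeline stays fast.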

ksaur commented 4 years ago

I love the idea of performance-based tests, because it would be great to be auto-notified if there is a regression (especially when new versions of pytorch, etc, change things unexpectedly).

But the CI/CD is already crawling. If we do add something to the pipeline, it would have to be minimal. Matteo's idea of only a couple of tests could work!
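As a rough sketch of what the auto-notification could look like, the pipeline could compare each run against a stored baseline and fail when a measurement drifts too far. The file name, tolerance, and helper below are all hypothetical:

```python
import json
import pathlib

# Hypothetical baseline file committed alongside the benchmark scripts.
BASELINE_PATH = pathlib.Path("benchmarks/baseline.json")
TOLERANCE = 1.25  # fail if a test gets more than 25% slower than its baseline


def check_against_baseline(results: dict) -> list:
    """Return the names of benchmarks that regressed past the tolerance.

    `results` maps benchmark name -> measured seconds for the current commit.
    """
    if not BASELINE_PATH.exists():
        # No baseline yet (e.g. first run); nothing to compare against.
        return []
    baseline = json.loads(BASELINE_PATH.read_text())
    regressions = []
    for name, seconds in results.items():
        reference = baseline.get(name)
        if reference is not None and seconds > reference * TOLERANCE:
            regressions.append(name)
    return regressions


if __name__ == "__main__":
    # Example: results produced by a (hypothetical) benchmark runner.
    current = {"rf_predict": 0.42, "lgbm_predict": 0.37}
    failed = check_against_baseline(current)
    if failed:
        raise SystemExit(f"Performance regressions detected: {failed}")
```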

Regarding benchmarks in general, one thought that crossed my mind is to post the artifact evaluation scripts we are polishing up for OSDI (make them available, but not as part of the pipeline directly).

interesaaat commented 4 years ago

Yes, we can put all the scripts into a benchmark directory. Do you know how other projects deal with performance regressions in a (semi-)automatic way?
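One pattern some other Python projects rely on is the pytest-benchmark plugin, which records timing statistics per run and can compare runs against each other. A minimal sketch, assuming the plugin is installed (the test name and model are again illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

from hummingbird.ml import convert


def test_rf_predict_benchmark(benchmark):
    # The `benchmark` fixture (from pytest-benchmark) times the callable it is
    # given, repeating it to gather statistics for later comparison.
    X = np.random.rand(10000, 28).astype(np.float32)
    y = np.random.randint(2, size=10000)
    skl_model = RandomForestClassifier(n_estimators=10).fit(X, y)
    hb_model = convert(skl_model, "pytorch")

    benchmark(hb_model.predict, X)
```

Saved runs can then be compared from the command line, e.g. with `--benchmark-autosave` and `--benchmark-compare-fail=mean:10%`; the exact flags are worth double-checking against whatever plugin version ends up pinned in CI.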

ksaur commented 4 years ago

Maybe we can use something from this GitHub blog post, which shows runtimes. It's not a perfect fit for what we're looking for, but it does provide some useful features.

interesaaat commented 4 years ago

I am starting by pushing the code for the benchmarks. The tasks to complete this PR will be: