johnsmith0031 / alpaca_lora_4bit

MIT License
533 stars 84 forks source link

Fixed triton `do_bench` non-None quantiles/percentiles default (issue 101) #102

Closed alex4321 closed 1 year ago

alex4321 commented 1 year ago

Hi. I made a bugfix for the issue I mentioned here: https://github.com/johnsmith0031/alpaca_lora_4bit/issues/101

TL;DR; despite the latest github triton do_bench function code have quantiles/percentiles setted to None by default - the versions which is currently available through pypi (up to 2.0.0.post1) have percentiles=(0.5, 0.2, 0.8) which causes the following code to break:

    def _bench(self, *args, config, **meta):
            ...
        try:
            # In testings using only 40 reps seems to be close enough and it appears to be what PyTorch uses
            # PyTorch also sets fast_flush to True, but I didn't see any speedup so I'll leave the default
            return triton.testing.do_bench(kernel_call, rep=40)
        except triton.compiler.OutOfResources:
            return float('inf')
...
                timings = {config: self._bench(*args, config=config, **kwargs)
                            for config in pruned_configs}
                bench_end = time.time()
                self.bench_time = bench_end - bench_start
                self.cache[key] = builtins.min(timings, key=timings.get)

because instead of float numbers the timings dict is filled with tuples (for 0.5/0.2/0.8 percentile).

alex4321 commented 1 year ago

@winglian what do you think? I don't know is that reasonable at all to make such a patch if the new triton version are going to work correctly. But not the currently published ones.

winglian commented 1 year ago

Sounds good to me. It's a small enough fix that's a good workaround until triton is ready

alex4321 commented 1 year ago

Than I guess I may tag @johnsmith0031 to merge it into pip branch? (Don't know how github right system works with PRs when we're talking about a branches within one repository instead of forks)

johnsmith0031 commented 1 year ago

Thanks! I'll merge it into both main and pip branch.