Open chsasank opened 6 months ago
Hello, thank you for your interest in the auto-tuner tool. The auto-tuner runs a diverse list of tiled GEMM configurations on the underlying hardware and reports, for a given input matrix size, the average time spent and the average FLOPS achieved per GEMM configuration. We can definitely try to help with this error and see if we can reproduce the issue on our end. Could you please share the cmake command you used to build the library? Thanks.
The instructions to build are detailed in #498 (the same one you commented recently on)
Hello @chsasank, sorry for taking so long. We can confirm the issue with the auto-tuner, and we will look into it in the future.
Thanks! Looking forward to seeing this fixed.
Hi, I have been trying to autotune so that I get good performance on an Intel Arc 770. However, after building the auto-tuner and running it, I see no progress, and ultimately it fails as follows:
I restarted tuning with smaller numbers: `./tune_nn 1024 1024 1024 4 strided`. Unfortunately, it fails for that too, with the same error. I wish the tuner showed some sort of progress, or the results it is seeing for each kernel. That would have been educational, if nothing else. Happy to send PRs that show progress and add optimized configurations, if you can handhold me a bit :)
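To make the progress request concrete, here is a minimal Python sketch of the kind of per-configuration reporting I have in mind. It is not the auto-tuner's actual code: `tune`, `run_config`, and `gemm_gflops` are hypothetical names, and the throughput uses the standard 2·M·N·K operation count for a single GEMM, ignoring any batching.

```python
import time


def gemm_gflops(m, n, k, seconds):
    """Throughput of one M x N x K GEMM in GFLOP/s (2*M*N*K operations)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return 2.0 * m * n * k / seconds / 1e9


def tune(configs, run_config, m, n, k):
    """Run every configuration, printing one progress line per configuration.

    `run_config` is a hypothetical callable that executes a single GEMM
    configuration and raises an exception if that configuration fails.
    """
    results = []
    for i, cfg in enumerate(configs, start=1):
        try:
            start = time.perf_counter()
            run_config(cfg)
            elapsed = time.perf_counter() - start
            gflops = gemm_gflops(m, n, k, elapsed)
            results.append((cfg, elapsed, gflops))
            status = f"{elapsed * 1e3:.3f} ms, {gflops:.1f} GFLOP/s"
        except Exception as exc:
            # Report the failure and keep going instead of aborting the run.
            status = f"failed: {exc}"
        print(f"[{i}/{len(configs)}] {cfg}: {status}", flush=True)
    return results
```

With output like this, a failing configuration would show up by name in the log, and the remaining configurations would still be measured.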