cp2k / dbcsr

DBCSR: Distributed Block Compressed Sparse Row matrix library
https://cp2k.github.io/dbcsr/
GNU General Public License v2.0

Merged tuned parameters from V100 into A100-parameters #656

Closed: hfp closed this 1 year ago

hfp commented 1 year ago

This complements https://github.com/cp2k/cp2k/pull/2655.

alazzaro commented 1 year ago

No, this is not what the autotuning was meant for. Merging the V100 parameters into the A100 set means that people will not run autotuning on their side, which is the whole idea behind the procedure and is well documented. I would propose a different solution: provide a generic kernel for every kernel that has no tuned parameters, and emit a warning such as "kernel not found, please autotune it".
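
To make the proposal concrete, here is a minimal sketch of such a fallback lookup. It is not DBCSR code: the table layout, the parameter fields, and the names `TUNED_PARAMS`, `GENERIC_PARAMS`, and `select_kernel_params` are all hypothetical and only illustrate the idea of "generic kernel plus warning".

```python
import warnings

# Hypothetical table of autotuned parameters, keyed by block sizes (m, n, k).
# The entry and its field names are made up for illustration only.
TUNED_PARAMS = {
    (23, 23, 23): {"algorithm": "medium", "threads": 96, "grouping": 16},
}

# Hypothetical conservative defaults used when no tuned entry exists.
GENERIC_PARAMS = {"algorithm": "generic", "threads": 64, "grouping": 8}


def select_kernel_params(m, n, k, tuned=TUNED_PARAMS):
    """Return tuned parameters for (m, n, k), or the generic fallback with a warning."""
    params = tuned.get((m, n, k))
    if params is not None:
        return params
    # No tuned entry: tell the user to autotune instead of silently
    # reusing parameters from another GPU.
    warnings.warn(
        f"kernel not found for (m, n, k) = ({m}, {n}, {k}), please autotune it",
        RuntimeWarning,
        stacklevel=2,
    )
    return GENERIC_PARAMS
```

With this shape, a missing triplet still runs (on the generic parameters), but the user is told each time that autotuning is the intended fix.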

hfp commented 1 year ago

Indeed, this does not contribute tuned parameters; it just reuses the (mostly predicted) parameters from V100. I will close the PR. I may contribute A100 tuned parameters for the OpenCL backend.

hfp commented 1 year ago

Note: as far as CP2K goes (https://github.com/cp2k/cp2k/pull/2655), A100 systems used V100 parameters exclusively.