Closed wickedfoo closed 6 days ago
This pull request was exported from Phabricator. Differential Revision: D65459723
I noticed this PR is very similar to #4014. Is this a duplicate of that?
This pull request has been merged in facebookresearch/faiss@eaab46c870a3e597833c648cab18b5de2147eb2f.
Summary:
This diff adds support for bfloat16 vector/query data types with the GPU brute-force k-nearest neighbor function (`bfKnn`). The change is largely just plumbing the new data type through the template hierarchy (so distances can be computed in bfloat16).
Of note, by design, all final distance results are produced in float32 regardless of input data type (float32, float16, bfloat16). This is because the true nearest neighbors in many data sets often differ in distance by only ~1000 float32 ULPs, so computing distances at lower precision can make distinct neighbors appear falsely equivalent. This seems to be one area where lossy compression/quantization throughout does not work as well (and is also why `CUBLAS_MATH_DISALLOW_REDUCED_PRECISION_REDUCTION` is set in `StandardGpuResources.cpp`). However, given that there is native bf16 x bf16 = fp32 tensor core support on Ampere+ architectures, the matrix multiplication itself should
WARNING: The one thing this diff does not yet handle properly is header inclusion / compilation for GPUs older than Ampere. This will need to be fixed before landing, so that compiling with an older CUDA SDK or compiling for the Volta architecture will simply error out at runtime with a lack-of-support error instead of failing to compile (?).
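To illustrate the precision argument above, here is a minimal sketch in plain Python. It is not FAISS code. It emulates bfloat16 rounding by keeping the top 16 bits of a float32 bit pattern (round-to-nearest-even), and the helper names `f32_bits`, `bits_f32`, and `to_bfloat16` are hypothetical. It assumes finite, positive inputs and does not handle NaN or overflow. The point is that two distances 1000 float32 ULPs apart are distinct in float32 but collapse to the same value in bfloat16, since one bfloat16 ULP spans 65536 float32 ULPs.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the bit pattern of x after rounding to IEEE-754 float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Interpret a 32-bit pattern as an IEEE-754 float32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def to_bfloat16(x: float) -> float:
    """Emulate rounding to bfloat16 (round-to-nearest-even).

    Illustrative only: assumes a finite input, no NaN/overflow handling.
    """
    b = f32_bits(x)
    # bfloat16 keeps the upper 16 bits of float32; bias the value first
    # so that truncation rounds to nearest, ties to even.
    rounding_bias = 0x7FFF + ((b >> 16) & 1)
    b = (b + rounding_bias) & 0xFFFF0000
    return bits_f32(b)


# Two hypothetical distances that differ by exactly 1000 float32 ULPs.
d_near = 1.0
d_far = bits_f32(f32_bits(d_near) + 1000)

ulp_gap = f32_bits(d_far) - f32_bits(d_near)
distinct_in_f32 = d_near < d_far
tie_in_bf16 = to_bfloat16(d_near) == to_bfloat16(d_far)

print(f"float32 ULP gap:      {ulp_gap}")
print(f"distinct in float32:  {distinct_in_f32}")
print(f"tie after bfloat16:   {tie_in_bf16}")
```

In this emulation the ordering of the two candidates is recoverable only from the float32 values, which is consistent with keeping the final distances in float32 even when the inputs are bfloat16.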
Differential Revision: D65459723