XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
1.89k stars · 376 forks
Issues (newest first)
#7510 · Copybara import of the project: · copybara-service[bot] · opened 6 hours ago · 0 comments
#7509 · Copybara import of the project: · copybara-service[bot] · closed 7 hours ago · 0 comments
#7508 · Fix suggestions for Hexagon to accommodate recent changes · ejparkqc · opened 10 hours ago · 2 comments
#7507 · Initialize input, filter, and bias for convolution 2d tests · copybara-service[bot] · opened 10 hours ago · 0 comments
#7506 · Enable 7x16 F32-GEMM for avx512 · copybara-service[bot] · opened 12 hours ago · 0 comments
#7505 · Add f16->qu8 vcvt microkernels · copybara-service[bot] · opened 15 hours ago · 0 comments
#7504 · Changes to `batch_matrix_multiply_bench`: · copybara-service[bot] · closed 20 hours ago · 0 comments
#7503 · QS8-PACKW AVX2 using vpmaddubsw · copybara-service[bot] · closed 6 hours ago · 0 comments
#7502 · Fix `bazel-linux-aarch64-gcc13` workflow and resolve accompanying build errors. · copybara-service[bot] · closed 10 hours ago · 0 comments
#7501 · Copybara import of the project: · copybara-service[bot] · opened 3 days ago · 0 comments
#7500 · Convolution bias does not need to be explicitly converted to `f16`. · copybara-service[bot] · closed 3 days ago · 0 comments
#7499 · Fix `bazel-linux-aarch64-gcc13` workflow. · copybara-service[bot] · opened 3 days ago · 0 comments
#7498 · Handle `f16` GEMM weights and biases when converting to `f16`. · copybara-service[bot] · closed 3 days ago · 0 comments
#7497 · Optimize QS8 GIO packing using AVXVNNI instruction · xujuntwt95329 · opened 3 days ago · 0 comments
#7496 · Copybara import of the project: · copybara-service[bot] · closed 4 days ago · 0 comments
#7495 · Copybara import of the project: · copybara-service[bot] · opened 4 days ago · 1 comment
#7494 · Improve confusing and less accurate computation of quantization parameters · copybara-service[bot] · closed 3 days ago · 0 comments
#7493 · F32-IGEMM AVX512 generate up to 16x64 · copybara-service[bot] · opened 4 days ago · 0 comments
#7492 · Remove unused header · copybara-service[bot] · closed 4 days ago · 0 comments
#7491 · Speculative fix for #7489 · copybara-service[bot] · closed 5 days ago · 1 comment
#7490 · Add a builder for aarch64 under gcc-13 in addition to clang-18 (helps address https://github.com/google/XNNPACK/issues/7489) · copybara-service[bot] · closed 4 days ago · 0 comments
#7489 · ARM build on gcc-13 failed with src/reference/unary-elementwise.cc:125:14: error: invalid ‘static_cast’ from type ‘xnn_bfloat16’ to type ‘_Float16’ · xwang233 · opened 5 days ago · 2 comments
#7486 · Enable X32-GIO-PACKW AVX microkernel · copybara-service[bot] · closed 6 days ago · 0 comments
#7485 · X32-GIO-PACKW for SSE41, Neon, WAsmSIMD and HVX · copybara-service[bot] · opened 6 days ago · 0 comments
#7484 · Remove unnecessary ssevnni microkernels · copybara-service[bot] · closed 3 days ago · 0 comments
#7483 · Allow calling `fully-connected` and `(depthwise|de)convolution` operators with `f16` weights. · copybara-service[bot] · closed 6 days ago · 0 comments
#7482 · Increase XNN_EXTRA_QUANTIZATION_PARAMS to match taller kernels · copybara-service[bot] · closed 6 days ago · 0 comments
#7481 · Convert inputs to VNNI kernels to unsigned int8 during the conversion from float. · copybara-service[bot] · opened 6 days ago · 0 comments
#7480 · Prevent dequantizing/requantizing `f16` to `f32` and back. · copybara-service[bot] · closed 5 days ago · 0 comments
#7479 · Add test and bench infrastructure and scalar microkernel for x8/qs8 GIO pack · xujuntwt95329 · closed 4 days ago · 0 comments
#7478 · Don't call `memcpy` for what is likely just a small number of bytes in `xnn_compute_hmp_grouped_gemm`. · copybara-service[bot] · closed 1 week ago · 0 comments
#7477 · Run xnngen to generate microkernels · copybara-service[bot] · closed 1 week ago · 0 comments
#7476 · Enable RNDNU16 microkernels for Aarch64 in gemm-config · copybara-service[bot] · closed 6 days ago · 0 comments
#7475 · Add qu8-igemm-4x16-minmax-rndnu16-asm-aarch64-neon-mlal-lane microkernels · copybara-service[bot] · closed 1 week ago · 0 comments
#7474 · In `static-slice` op, `sizes` should be a `size_t`. · copybara-service[bot] · closed 1 week ago · 0 comments
#7473 · Some outputs of even split may be optimized away since they are unused. · copybara-service[bot] · closed 1 week ago · 0 comments
#7472 · Add qu8-gemm-4x16-minmax-rndnu16-asm-aarch64-neon-mlal-lane microkernel. · copybara-service[bot] · closed 1 week ago · 0 comments
#7471 · Slice only the tokens which are needed for the next stage of the LLM pipeline. · copybara-service[bot] · closed 6 days ago · 0 comments
#7470 · Static slice takes negative offsets. · copybara-service[bot] · closed 1 week ago · 0 comments
#7469 · Add rndnu16 version of xnn_qu8_igemm_minmax_rndnu_ukernel_1x16__neon_mlal_lane. · copybara-service[bot] · closed 1 week ago · 0 comments
#7468 · Both RNDNU and RNDNU16 now do the same thing, · copybara-service[bot] · closed 1 week ago · 0 comments
#7467 · Enable xnn_x32_packw_gemm_gio avx512 microkernels · copybara-service[bot] · closed 1 week ago · 0 comments
#7466 · Update cpuinfo dependency · copybara-service[bot] · closed 1 week ago · 0 comments
#7465 · Replace post increment + add with a single add · copybara-service[bot] · closed 1 week ago · 0 comments
#7464 · Remove JIT generated references from NEON microkernels · copybara-service[bot] · closed 1 week ago · 0 comments
#7463 · Run generator scripts to update source · copybara-service[bot] · closed 1 week ago · 0 comments
#7462 · Implement qs8 x8c4 pack using avxvnni · kylo5aby · opened 1 week ago · 2 comments
#7461 · At present, integer types in XNNPACK can be ambiguous as to whether they are a quantized type or not. This change adds a templated wrapper to the C++ code, `quantized<T>`, to make it clear to both the reader and the type system whether or not an integer is 'plain' vs quantized. · copybara-service[bot] · closed 1 week ago · 0 comments
#7460 · QS8-QC4W packw scalar microkernels · copybara-service[bot] · closed 13 hours ago · 0 comments
#7459 · Fix bytes calculation in x8-packw benchmark · copybara-service[bot] · closed 1 week ago · 0 comments