Introduction
The TVM community has worked since the last release to deliver the following exciting new improvements! The main tags appear as the section headings below.
Please visit the full listing of commits for a complete view: v0.18.dev0...v0.18.0.rc0.
Community
RFCs
A new RFC introduces the Android Neural Networks API (NNAPI) as a backend for BYOC. NNAPI is a graph-level neural network inference API provided by the Android runtime. Prior to this RFC, TVM on Android mobile devices relied mainly on OpenCL for GPU acceleration. The RFC adds a new codegen and a runtime via the BYOC framework, which enables execution on custom accelerators from SoC vendors on mobile devices.
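As a rough sketch of the flow this enables: partition a Relax module so that NNAPI-supported subgraphs are offloaded, then compile the remainder for the phone's CPU. The `partition_for_nnapi` helper and its import path below follow the RFC's reference implementation and may differ in your TVM build.

```python
# Minimal sketch of the NNAPI BYOC flow; `partition_for_nnapi` and its
# import path are assumptions based on the RFC's reference implementation.
import tvm
from tvm import relax
from tvm.relax.backend.contrib.nnapi import partition_for_nnapi


def build_for_android(mod: tvm.IRModule):
    # Mark NNAPI-supported subgraphs for offload; unsupported operators
    # stay in the main module and are compiled by TVM itself.
    mod = partition_for_nnapi(mod)
    # Lower the offloaded subgraphs to external runtime modules.
    mod = relax.transform.RunCodegen()(mod)
    # Compile the remaining graph for an Android AArch64 CPU target.
    target = tvm.target.Target("llvm -mtriple=aarch64-linux-android")
    return relax.build(mod, target)
```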
BYOC
BugFix
Use _convert_torch_tensor_to_relax() where possible
Update layer_norm converter to support immutable_list for normalized_shape
Remove tvm. prefix from image name when ./docker/build.sh is used
#17301 - [TE][CreatePrimFunc] Fix create reduce block with spatial iter dependent init value
#17284 - [Support] Fix the Read/Write of socket stream
#17302 - [Codegen][WebGPU] LetNode common subexpr override
#17260 - [WINDOWS] Compiler options for non x86 targets
#17249 - [IR] Handle NaN in StructuralEqual and StructuralHash
#17257 - [FFI] Re-introduce the boxed primitive values
#17265 - [CompileBugfix][contrib] meet 'base64.h: No such file or directory' and '‘tvm::runtime::vm::AllocatorType’ has not been declared' while compiling
#17214 - Replacing unary ops with LookUpTable and Take op to improve performance
#17250 - [WebGPU] Fix unexpected device lost error when intentional dispose
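Of these, #17249 changes observable behavior of the IR utilities; a small sketch of what it touches (results printed rather than asserted, since they depend on this fix):

```python
# Exercising NaN handling in StructuralEqual / StructuralHash (#17249).
import tvm
from tvm import tir

a = tir.FloatImm("float32", float("nan"))
b = tir.FloatImm("float32", float("nan"))

# IEEE-754 comparison makes NaN != NaN, which previously tripped up
# structural comparison; equality and hashing now treat NaN consistently.
print(tvm.ir.structural_equal(a, b))
print(tvm.ir.structural_hash(a) == tvm.ir.structural_hash(b))
```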
CI
Update image tag to 20240917-153130-9f281758
Fix the unity/pr-head step
Disco
Dlight
Docker
Docs
Use torch.export instead of fx.symbolic_trace for tutorial
Frontend
Add support for torch.export.ExportedProgram in Relax PyTorch Frontend
Support torch.nn.functional.scaled_dot_product_attention
Support torch.ops.aten.sym_size.int
Support torch.nn.functional.conv*
Support aten::tile
Support torch.nn.functional.max_pool2d
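Together with the Docs change above, these additions move the PyTorch frontend toward torch.export. A minimal import sketch, assuming the from_exported_program entry point added with ExportedProgram support (verify the exact path in your build):

```python
# Sketch: import a PyTorch model into Relax via torch.export.
# `from_exported_program` is assumed from the ExportedProgram support
# listed above; check the exact import path in your TVM version.
import torch
from torch.export import export
from tvm.relax.frontend.torch import from_exported_program


class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(16, 4)

    def forward(self, x):
        # linear and relu are among the ops the importer understands
        return torch.nn.functional.relu(self.fc(x))


# torch.export produces an ExportedProgram, replacing fx.symbolic_trace.
exported = export(MLP().eval(), (torch.randn(1, 16),))
mod = from_exported_program(exported)  # -> tvm.IRModule (Relax)
mod.show()
```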
Hexagon
LLVM
MetaSchedule
Replace xgboost.rabit with xgboost.collective because it's deprecated
OpenCL & CLML
ROCm
Relax
Add _attention_sequence_prefill function to …
R.call_tir
Add is_group argument in IPC AllReduce
Relay
Runtime
TIR
Add is_vector method to the DataType class and update usages across the codebase
Simplify x==x expressions for all dtypes
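A small illustration of the second item via the arithmetic analyzer (the exact printed form may vary by build):

```python
# Sketch: the arith analyzer folding a self-comparison.
import tvm
from tvm import tir

ana = tvm.arith.Analyzer()
x = tir.Var("x", "int32")
# x == x now simplifies for all dtypes, not just a subset.
print(ana.simplify(tir.EQ(x, x)))
```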
TOPI
TVMScript
cuda & cutlass & tensorrt
web
Misc
#17246 - [Cleanup] Remove using namespace tvm::runtime from headers
#17278 - [Codegen] Emit tir::Let as var assignment explicitly
Replicate distutils.util.strtobool()
Add packaging to python/gen_requirements.py
Use packaging.version.parse instead of distutils.version.LooseVersion
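The distutils-related items track its removal from the standard library in Python 3.12; a one-liner showing the version-parsing replacement:

```python
# packaging.version.parse replaces distutils.version.LooseVersion,
# which disappeared along with distutils in Python 3.12.
from packaging.version import parse as parse_version

# PEP 440 ordering: release candidates sort before the final release.
assert parse_version("0.18.0.rc0") < parse_version("0.18.0")
```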