intel / neural-speed

An innovative library for efficient LLM inference via low-bit quantization
https://github.com/intel/neural-speed
Apache License 2.0

Xetla support 2024.2 #309

Closed sunjiweiswift closed 4 months ago

sunjiweiswift commented 4 months ago

Type of Change

feature or bug fix or documentation or others -- Update compiler API to 2024.2

API changed or not -- Not

Description

1. Update prefetch 1D API
2. Update atomic API
3. Update gather/scatter API
4. Update 1 fp16 load/store
5. Unify all length units to bytes instead of elems (common.hpp)

2D load/store/prefetch is not updated in this PR; it will be updated in a subsequent PR.
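To illustrate what unifying length units to bytes instead of elements can look like, here is a minimal sketch. The helper names `elems_to_bytes` and `checked_bytes_to_elems` and the `fp16_storage` type are hypothetical, chosen for illustration only; they are not taken from common.hpp or from the XeTLA API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of XeTLA): element count -> byte length.
template <typename T>
constexpr std::size_t elems_to_bytes(std::size_t num_elems) {
  return num_elems * sizeof(T);
}

// Hypothetical helper: byte length -> element count. Asserts that the byte
// length is a whole number of elements of type T.
template <typename T>
inline std::size_t checked_bytes_to_elems(std::size_t num_bytes) {
  assert(num_bytes % sizeof(T) == 0 &&
         "byte length must be a whole number of elements");
  return num_bytes / sizeof(T);
}

// Stand-in for a 16-bit floating-point element (2 bytes of storage).
struct fp16_storage {
  std::uint16_t bits;
};

// The same element count maps to different byte lengths per element type,
// which is why a single unit has to be agreed on across the code base.
static_assert(sizeof(fp16_storage) == 2, "fp16 stand-in must be 2 bytes");
static_assert(elems_to_bytes<fp16_storage>(64) == 128, "64 fp16 elems = 128 bytes");
static_assert(elems_to_bytes<float>(64) == 256, "64 fp32 elems = 256 bytes");
```

With lengths expressed in bytes, a size or offset means the same thing regardless of the element type; converting back to an element count is only needed at the point where typed data is actually accessed.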

detail description Issues: xxx

Expected Behavior & Potential Risk

the expected behavior that triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

airMeng commented 4 months ago

IPEX will only upgrade to oneAPI 2024.2 after next week, so let's hold off on this.

sunjiweiswift commented 4 months ago

https://github.com/intel/neural-speed/pull/320 syncs with IPEX (prefetch modification).