JuliaGPU / KernelAbstractions.jl

Heterogeneous programming in Julia
MIT License

How (if possible) can we get rid of SparseArrays and StaticArrays? #506

Open avik-pal opened 2 months ago

avik-pal commented 2 months ago
julia> @time_imports using KernelAbstractions
      9.8 ms  UnsafeAtomics
      6.4 ms  Atomix
               ┌ 0.8 ms SuiteSparse_jll.__init__() 
    116.4 ms  SuiteSparse_jll 96.92% compilation time
               ┌ 8.2 ms SparseArrays.CHOLMOD.__init__() 95.65% compilation time
    156.1 ms  SparseArrays 5.03% compilation time
      0.7 ms  StaticArraysCore
    182.0 ms  StaticArrays
      0.4 ms  Adapt
      0.2 ms  AdaptStaticArraysExt
      1.5 ms  CEnum
      0.2 ms  LazyArtifacts
               ┌ 1.8 ms LLVMExtra_jll.__init__() 
      2.6 ms  LLVMExtra_jll
               ┌ 0.2 ms LLVM.__init__() 
     31.2 ms  LLVM
      2.9 ms  UnsafeAtomicsLLVM
     12.1 ms  KernelAbstractions
      0.3 ms  Statistics → SparseArraysExt
      0.3 ms  StaticArrays → StaticArraysStatisticsExt

On Julia 1.11, SparseArrays and StaticArrays are the two packages that account for the bulk of the load time. From the code, I could see StaticArrays being used for the CPU shared memory implementation. I am not sure where exactly SparseArrays is being used. Is there a way we can move these two dependencies to package extensions?

I can create a PR, but I need some help figuring out where these packages are being used :sweat_smile: (and whether this change is welcome at all).

vchuravy commented 2 months ago

SparseArrays can be moved to an extension; it was introduced in https://github.com/JuliaGPU/KernelAbstractions.jl/pull/269

StaticArrays is going to be much harder. We need a reliable implementation of a stack-allocated array, and StaticArrays is the closest thing we have. This is core functionality.

avik-pal commented 2 months ago

I will get the first part (SparseArrays) done then, since it is a sizeable part of the load time.
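
For reference, moving a hard dependency to a package extension (available since Julia 1.9) is mostly a Project.toml change. The fragment below is only a rough sketch of what that could look like for SparseArrays, not the actual change: the extension name `SparseArraysExt` and the file path `ext/SparseArraysExt.jl` are assumptions, and whatever code currently depends on SparseArrays would have to move into that file.

```toml
# Project.toml (fragment, illustrative only)

# SparseArrays is dropped from [deps] and listed as a weak dependency,
# so it is no longer loaded by `using KernelAbstractions` on its own.
[weakdeps]
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"

# The extension module is loaded automatically once the user also loads
# SparseArrays. "SparseArraysExt" is a hypothetical name; by convention
# its code would live in ext/SparseArraysExt.jl.
[extensions]
SparseArraysExt = "SparseArrays"
```

With a layout like this, users who never load SparseArrays would not pay for SuiteSparse_jll and SparseArrays at import time.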

avik-pal commented 2 months ago

For StaticArrays, could we do the following once a breaking release is tagged?

  1. Use StaticArraysCore for the MArray type.
  2. If users want to do operations on SharedMemory on the CPU, they must load StaticArrays manually.