Closed mkitti closed 2 years ago
Thanks @mkitti for your interest in CellArrays.jl. In all the packages we develop and use we employ the following approach to deal with conditional usage of GPU packages: https://discourse.julialang.org/t/how-to-add-cuda-as-optional-dependency/73077/7?u=samo Might this solution suit you as well?
Splitting packages into multiple brings considerable overhead and is therefore not a solution we would like to take.
I think your proposed approach works better for terminal or near-terminal applications, but this would still complicate the situation for libraries. The dependency tree would quickly explode if many packages did this.
What is the overhead that you see in splitting packages?
> I think your proposed approach works better for terminal or near-terminal applications, but this would still complicate the situation for libraries.
As noted in the link shared above, we are using this approach successfully in the package/library ImplicitGlobalGrid.jl: https://github.com/eth-cscs/ImplicitGlobalGrid.jl
You can find more information on the usage of CUDA.functional and on conditional CUDA usage in the CUDA.jl documentation: https://cuda.juliagpu.org/stable/installation/conditional/
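For reference, the conditional-usage pattern described in the CUDA.jl documentation looks roughly like this (a minimal sketch; the `double_all` function is a hypothetical example, not from either package):

```julia
using CUDA

# CUDA.jl can always be loaded, even on machines without a GPU;
# CUDA.functional() reports at runtime whether a usable GPU and
# driver are actually present.
function double_all(data::AbstractArray)
    if CUDA.functional()
        d = CuArray(data)       # upload and compute on the GPU
        return Array(d .* 2)    # download the result
    else
        return data .* 2        # plain CPU fallback
    end
end
```

The point of the pattern is that the package keeps a hard dependency on CUDA.jl, but only exercises the GPU code paths when `CUDA.functional()` returns `true`.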
> The dependency tree would quickly explode if many packages did this.
This is only done for packages that allow targeting different architectures, mostly GPUs so far (CUDA.jl, AMDGPU.jl, ...), meaning not for arbitrary packages, and not for many of them. Moreover, these packages share a large amount of infrastructure and dependencies (e.g. GPUCompiler.jl). From my perspective, this is the current best practice for dealing with GPU package dependencies within the Julia community, and as a result, we will stick to it for all our packages.
> Splitting packages into multiple brings considerable overhead and is therefore not a solution we would like to take.
Could you explain what you mean by overhead here? As far as I can tell, the proposed split would not need Requires.jl or anything of that nature. Other than moving code around to different files within the repository, the only "new" statement, apart from some module declarations, would be `using CellArraysCore`. At this point, I'm really just trying to understand your statement.
The complexity I'm trying to avoid already occurs upon adding CellArrays to my environment, before I've had the chance to check CUDA.functional or its AMDGPU.jl equivalent.
Invoking the following, which creates a clean Julia depot and adds CellArrays, results in a 491 MiB directory, mostly due to artifacts. The machine I'm using happens to have an AMD GPU.
```
$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("CellArrays")'
$ du -hcs /tmp/tmp.emWOWXVxGL/*
436M    /tmp/tmp.emWOWXVxGL/artifacts
31M     /tmp/tmp.emWOWXVxGL/compiled
32K     /tmp/tmp.emWOWXVxGL/environments
20K     /tmp/tmp.emWOWXVxGL/logs
19M     /tmp/tmp.emWOWXVxGL/packages
5.0M    /tmp/tmp.emWOWXVxGL/registries
24K     /tmp/tmp.emWOWXVxGL/scratchspaces
491M    total
```
Doing the same but adding StaticArrays.jl results in an 11 MiB directory.
```
$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("StaticArrays")'
$ du -hcs /tmp/tmp.fwHkTVKfqe/*
4.1M    /tmp/tmp.fwHkTVKfqe/compiled
16K     /tmp/tmp.fwHkTVKfqe/environments
8.0K    /tmp/tmp.fwHkTVKfqe/logs
1.4M    /tmp/tmp.fwHkTVKfqe/packages
5.0M    /tmp/tmp.fwHkTVKfqe/registries
16K     /tmp/tmp.fwHkTVKfqe/scratchspaces
11M     total
```
I do not see how CUDA.functional or similar would address the 480 MiB difference, but perhaps the AMD GPU stack is not yet as advanced, or as lazy, in downloading artifacts?
The main culprit here is definitely AMDGPU.jl:
```
$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("AMDGPU")'
$ du -hcs /tmp/tmp.Z8AJtoM41D/*
436M    /tmp/tmp.Z8AJtoM41D/artifacts
11M     /tmp/tmp.Z8AJtoM41D/compiled
32K     /tmp/tmp.Z8AJtoM41D/environments
20K     /tmp/tmp.Z8AJtoM41D/logs
12M     /tmp/tmp.Z8AJtoM41D/packages
5.0M    /tmp/tmp.Z8AJtoM41D/registries
24K     /tmp/tmp.Z8AJtoM41D/scratchspaces
463M    total
```

```
$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("CUDA")'
$ du -hcs /tmp/tmp.g8sFY4rzgm/*
6.1M    /tmp/tmp.g8sFY4rzgm/artifacts
24M     /tmp/tmp.g8sFY4rzgm/compiled
24K     /tmp/tmp.g8sFY4rzgm/environments
12K     /tmp/tmp.g8sFY4rzgm/logs
14M     /tmp/tmp.g8sFY4rzgm/packages
5.0M    /tmp/tmp.g8sFY4rzgm/registries
16K     /tmp/tmp.g8sFY4rzgm/scratchspaces
48M     total
```
Thanks for investigating this. As you found that it is the AMDGPU.jl package that uses up most of the space, please open an issue there instead. I am therefore closing this one.
@mkitti : UPDATE: as we are moving to extensions with all our packages, we have removed the GPU dependencies here completely (there doesn't seem to be any good way currently to export type aliases from extensions). See here: https://github.com/omlins/CellArrays.jl/pull/26
This also completely solves the issue you brought up here.
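For context, the package-extension mechanism (available since Julia 1.9) that the linked PR moves to is declared in a package's Project.toml roughly like this (a hedged sketch; the extension module name is illustrative, and the UUID shown is CUDA.jl's registered UUID to the best of my knowledge):

```toml
[weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"

[extensions]
# This extension module is compiled and loaded automatically only
# when the user also loads CUDA.jl in the same session; the package
# itself no longer hard-depends on CUDA.jl, so a CPU-only user never
# downloads the GPU artifacts.
CellArraysCUDAExt = "CUDA"
```

The caveat mentioned above (no good way to export type aliases from an extension) stems from the fact that the extension module is loaded after the parent package, so names it defines are not visible from the parent package's own namespace at definition time.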
The CellArray concept is a useful one even for just CPU-based applications. It also potentially serves as a nice abstraction of chunked arrays as used in several storage types (HDF5, Zarr). It would be useful to adopt CellArrays without having to inherit the GPU dependency. Would it be possible to split the package into two pieces?

1. CellArrays.jl or BaseCellArrays.jl, which does not depend on CUDA.jl or AMDGPU.jl
2. GPUCellArrays.jl, which does depend on CUDA.jl or AMDGPU.jl
One could use subdirectory packages to keep the packages in this repository.
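A subdirectory-package layout along these lines could look something like the following (a sketch only; the names match the hypothetical split proposed above and are not from the actual repository):

```
CellArrays.jl/                 # repository root: the base package, no GPU deps
├── Project.toml
├── src/
│   └── CellArrays.jl
└── lib/
    └── GPUCellArrays/         # registered separately as a subdirectory package
        ├── Project.toml       # depends on CellArrays, CUDA.jl, AMDGPU.jl
        └── src/
            └── GPUCellArrays.jl
```

Julia's General registry supports registering a package that lives in a subdirectory of a repository, so both packages could be versioned and released from the one repo.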