omlins / CellArrays.jl

Arrays containing logical cells of small arrays or structs
BSD 3-Clause "New" or "Revised" License

Split into two packages with and without GPU dependencies #16

Closed mkitti closed 2 years ago

mkitti commented 2 years ago

The CellArray concept is a useful one even for purely CPU-based applications. It also potentially serves as a nice abstraction of chunked arrays as used in several storage formats (HDF5, Zarr). It would be useful to adopt CellArrays without having to inherit the GPU dependency. Would it be possible to split the package into two pieces?

  1. A package, CellArrays.jl or BaseCellArrays.jl that does not depend on CUDA.jl or AMDGPU.jl
  2. A package GPUCellArrays.jl that does depend on CUDA.jl or AMDGPU.jl

One could use subdirectory packages to keep the packages in this repository.
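For illustration, subdirectory packages can be installed directly via Pkg's subdir keyword (available since Julia 1.5). The repository layout below is a hypothetical sketch of the proposed split, not the current structure of CellArrays.jl:

```julia
# Hypothetical: installing a GPU-dependent subdirectory package, assuming the
# split were laid out as proposed. The subdirectory path is illustrative.
using Pkg
Pkg.add(url="https://github.com/omlins/CellArrays.jl", subdir="lib/GPUCellArrays")
```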

omlins commented 2 years ago

Thanks @mkitti for your interest in CellArrays.jl. In all the packages we develop and use we employ the following approach to deal with conditional usage of GPU packages: https://discourse.julialang.org/t/how-to-add-cuda-as-optional-dependency/73077/7?u=samo Might this solution suit you as well?

Splitting packages into multiple brings considerable overhead and is therefore not a solution we would like to take.
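The pattern from the linked Discourse thread boils down to checking at runtime whether a functional GPU is present and falling back to CPU arrays otherwise. A minimal, hedged sketch (the function name zeros_array is illustrative; the try/catch around the using statement is only there so the snippet also runs where CUDA.jl is not installed):

```julia
# Sketch of the runtime-conditional GPU pattern described in the linked
# Discourse post. CUDA.functional() is the real check; the try/catch lets
# this snippet run even in an environment without CUDA.jl.
const use_cuda = try
    @eval using CUDA
    CUDA.functional()
catch
    false
end

# Allocate on the GPU when available, on the CPU otherwise.
zeros_array(T, dims...) = use_cuda ? CUDA.zeros(T, dims...) : zeros(T, dims...)

a = zeros_array(Float64, 4)   # CuArray on a working GPU, Array otherwise
```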

mkitti commented 2 years ago

I think your proposed approach works better for terminal or near-terminal applications, but this would still complicate the situation for libraries. The dependency tree would quickly explode if many packages did this.

What is the overhead that you see in splitting packages?

omlins commented 2 years ago

> I think your proposed approach works better for terminal or near-terminal applications, but this would still complicate the situation for libraries.

As noted in the link shared above, we are using this successfully in the package/library ImplicitGlobalGrid.jl: https://github.com/eth-cscs/ImplicitGlobalGrid.jl. You can find more information on the usage of CUDA.functional and conditional CUDA usage in the CUDA.jl documentation: https://cuda.juliagpu.org/stable/installation/conditional/

> The dependency tree would quickly explode if many packages did this.

This is only done for packages that allow targeting different architectures - mostly GPU packages so far (CUDA.jl, AMDGPU.jl, ...) - meaning not for arbitrary packages, and not for many. Moreover, these packages share a large amount of infrastructure and dependencies (e.g. GPUCompiler.jl). From my perspective, this is the current best practice for dealing with GPU package dependencies within the Julia community, and as a result we will stick to it for all our packages.

mkitti commented 2 years ago

> Splitting packages into multiple brings considerable overhead and is therefore not a solution we would like to take.

Could you explain what you mean by overhead here? As far as I can tell, the proposed split would not need Requires.jl or anything of that nature. Other than moving code around to different files within the repository, the only "new" statement, apart from some module declarations, would be using CellArraysCore. At this point, I'm really just trying to understand your statement.

The complexity I'm trying to avoid already occurs upon adding CellArrays to my environment and before I've had the chance to check CUDA.functional or its AMDGPU.jl equivalent.

Invoking the following, which creates a clean Julia depot and adds CellArrays, results in a 491 MiB directory, mostly due to artifacts. The machine I'm using happens to have an AMD GPU.

$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("CellArrays")'
$ du -hcs /tmp/tmp.emWOWXVxGL/*
436M    /tmp/tmp.emWOWXVxGL/artifacts
31M /tmp/tmp.emWOWXVxGL/compiled
32K /tmp/tmp.emWOWXVxGL/environments
20K /tmp/tmp.emWOWXVxGL/logs
19M /tmp/tmp.emWOWXVxGL/packages
5.0M    /tmp/tmp.emWOWXVxGL/registries
24K /tmp/tmp.emWOWXVxGL/scratchspaces
491M    total

Doing the same but adding StaticArrays.jl results in an 11 MiB directory.

$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("StaticArrays")'
$ du -hcs /tmp/tmp.fwHkTVKfqe/*
4.1M    /tmp/tmp.fwHkTVKfqe/compiled
16K /tmp/tmp.fwHkTVKfqe/environments
8.0K    /tmp/tmp.fwHkTVKfqe/logs
1.4M    /tmp/tmp.fwHkTVKfqe/packages
5.0M    /tmp/tmp.fwHkTVKfqe/registries
16K /tmp/tmp.fwHkTVKfqe/scratchspaces
11M total

I do not see how CUDA.functional or similar would address the 480 MiB difference, but perhaps AMDGPU.jl is not quite as advanced or lazy in downloading artifacts?

mkitti commented 2 years ago

The main culprit here is definitely AMDGPU.jl:

$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("AMDGPU")'
$ du -hcs /tmp/tmp.Z8AJtoM41D/*
436M    /tmp/tmp.Z8AJtoM41D/artifacts
11M /tmp/tmp.Z8AJtoM41D/compiled
32K /tmp/tmp.Z8AJtoM41D/environments
20K /tmp/tmp.Z8AJtoM41D/logs
12M /tmp/tmp.Z8AJtoM41D/packages
5.0M    /tmp/tmp.Z8AJtoM41D/registries
24K /tmp/tmp.Z8AJtoM41D/scratchspaces
463M    total
$ JULIA_DEPOT_PATH=`mktemp -d` julia -e 'using Pkg; Pkg.add("CUDA")'
$ du -hcs /tmp/tmp.g8sFY4rzgm/*
6.1M    /tmp/tmp.g8sFY4rzgm/artifacts
24M /tmp/tmp.g8sFY4rzgm/compiled
24K /tmp/tmp.g8sFY4rzgm/environments
12K /tmp/tmp.g8sFY4rzgm/logs
14M /tmp/tmp.g8sFY4rzgm/packages
5.0M    /tmp/tmp.g8sFY4rzgm/registries
16K /tmp/tmp.g8sFY4rzgm/scratchspaces
48M total
omlins commented 2 years ago

Thanks for investigating this. As you found that it is the AMDGPU.jl package that uses up most of the space, please open an issue there instead. I am therefore closing this one.

omlins commented 8 months ago

@mkitti : UPDATE: as we are moving to extensions with all our packages, we have removed the GPU dependencies here completely (there doesn't seem to be any good way currently to export type aliases from extensions). See here: https://github.com/omlins/CellArrays.jl/pull/26

This also completely solves the issue you brought up here.
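For context, the extensions mechanism mentioned above (Julia ≥ 1.9) declares GPU packages as weak dependencies, so the GPU-specific code only loads when a user also installs and loads the corresponding GPU package. A hedged Project.toml sketch; the extension name is illustrative, not necessarily what the linked PR uses:

```toml
# Illustrative Project.toml fragment for a package extension (Julia >= 1.9).
# The extension name is an assumption; see the PR linked above for the
# actual layout.
[weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"  # CUDA.jl's registered UUID

[extensions]
# ext/CellArraysCUDAExt.jl loads only when the user also loads CUDA.jl
CellArraysCUDAExt = "CUDA"
```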