DArray: Remove the stage cache

Previously, DArray operators (such as map, reduce, and mul) returned lazy wrappers, which were later "staged" to materialize a new DArray. This design was meant to enable optimizations across operations, but few were ever implemented, and the lazy mechanism was removed to make DArray operations simpler to write.

A vestige of that mechanism is the stage cache. It tracked a DAG of DArray operations so that a value referenced multiple times (for example, A = rand(Blocks(4,4), 16, 16); A * A) would be computed only once and its result reused. The cache was left in place because it seemed harmless: operations are now executed immediately, so it never gets a chance to deduplicate anything, and eager execution gives the same semantics anyway.

The problem is that the stage cache uses a WeakKeyDict to reference submitted DArray operators. A WeakKeyDict is known to be slower to free the objects it references, which can cause excessive GC pressure and OOM issues. This PR therefore removes the stage cache so that this can no longer happen. It also makes some changes to ensure that https://github.com/JuliaLang/julia/issues/40626 does not bite us.
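
For readers unfamiliar with the pattern, here is a minimal sketch of a weakly keyed stage cache, using only Base Julia. The names LazyOp, STAGE_CACHE, and stage are hypothetical and chosen for illustration; this is not Dagger's actual implementation, only an assumption about the general shape of such a cache.

```julia
# Hypothetical stand-in for a lazy DArray operator (not a Dagger type).
# It must be mutable, because WeakKeyDict keys need to be finalizable.
mutable struct LazyOp
    f::Function
    args::Tuple
end

# Cache keyed weakly on the operator object. An entry only becomes
# removable after its key is unreachable and the GC has processed it.
const STAGE_CACHE = WeakKeyDict{LazyOp,Any}()

# Materialize an operator, reusing a cached result for the same object.
function stage(op::LazyOp)
    haskey(STAGE_CACHE, op) && return STAGE_CACHE[op]
    # Stage any lazy arguments first; pass plain values through unchanged.
    args = map(a -> a isa LazyOp ? stage(a) : a, op.args)
    result = op.f(args...)
    STAGE_CACHE[op] = result
    return result
end

# Count how often the "expensive" leaf operation actually runs.
calls = Ref(0)
a = LazyOp(() -> (calls[] += 1; ones(4, 4)), ())
aa = LazyOp(*, (a, a))   # references `a` twice, like A * A
result = stage(aa)
```

In this sketch the leaf `a` is evaluated once even though `aa` references it twice, which is the deduplication the stage cache used to provide. With eager execution, `A` is already materialized before `A * A` runs, so the cache has nothing left to deduplicate and only its cleanup cost remains.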

JuliaParallel / Dagger.jl

DArray: Remove the stage cache #472