-
Hi, the following use-case of `mapreduce` doesn't work:
```julia
gradient(randn(10)) do x
y₀ = Float64[]
∑x = 0.0
ys = mapreduce(vcat, x, 1:length(x); init = y₀) do xᵢ…
```
-
Hi all,
I am currently experimenting with the code you provided. Your plot of memory usage for the different batch sizes and max_length values seems to match our training setup perfectly. However…
-
**Checklist**
1. I have searched related issues but cannot get the expected help. ✅
2. I have read the FAQ documentation but cannot get the expected help. ✅
Hi!
Let's say there is a model th…
-
Common errors/issues that aren't explicitly checked for should now be listed here:
- User calls `make_learner()` instead of `make_learner_stack()` with multiple learners.
- Factor levels in shift…
-
**User Statement:**
As a developer, I need to identify and outline the actions required to support log accumulation.
**Details:**
Should begin by investigating L…
-
**User Statement:**
As a customer of vic, I don't want performance to suffer as a result of improved logging.
**Details:**
"This is a mechanism that allows capture debug/trace level data at the …
-
I have a GPU with 4 GB of memory, which can support a batch size of at most 8 images, but I want to train with a batch of at least 16 images. Somewhere on the internet I heard about the concept of gradient accumulation, but I don't know…
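
For illustration, here is a minimal, framework-free sketch of the idea behind gradient accumulation, under some loud assumptions: the 1-D model `y = w * x`, the mean-squared-error loss, and the helper names `grad_mse` and `accumulated_grad` are all hypothetical and not taken from any library mentioned here. It only shows that summing micro-batch gradients, each scaled by `1 / steps`, reproduces the gradient of the full batch. In a real framework the memory saving comes from running the forward and backward pass on one micro-batch at a time and applying the optimizer update once after the last one.

```python
def grad_mse(w, xs, ys):
    """Gradient w.r.t. w of the mean squared error for the toy model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch):
    """Accumulate micro-batch gradients so the result matches one large batch."""
    if len(xs) % micro_batch != 0:
        raise ValueError("batch size must be a multiple of the micro-batch size")
    steps = len(xs) // micro_batch
    total = 0.0
    for i in range(steps):
        lo = i * micro_batch
        hi = lo + micro_batch
        # Each micro-batch gradient is already a mean over its own samples;
        # dividing by `steps` turns the running sum into the full-batch mean.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / steps
    return total


# A "batch" of 16 samples processed as two micro-batches of 8.
xs = [float(i) for i in range(16)]
ys = [3.0 * x + 1.0 for x in xs]
full = grad_mse(0.5, xs, ys)
accum = accumulated_grad(0.5, xs, ys, micro_batch=8)
```

With equal-sized micro-batches, `full` and `accum` agree up to floating-point rounding, so an update computed from `accum` behaves like an update on the batch of 16.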
-
If batch_gpu < batch_size // num_gpus, the accumulated gradient should be normalized by (num_gpus * batch_gpu) // batch_size. The current accumulation implementation does not seem to be normalized, wh…
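
To make the arithmetic concrete, here is a small sketch of the factor described above. The function name `accumulation_plan` is hypothetical, and the reading of the formula is an assumption: the scale is taken as the exact ratio `(num_gpus * batch_gpu) / batch_size`, because integer division would evaluate to 0 whenever more than one accumulation round is needed.

```python
from fractions import Fraction


def accumulation_plan(batch_size, num_gpus, batch_gpu):
    """Return (rounds, scale) for accumulating up to `batch_size` samples.

    rounds: number of accumulation rounds needed per optimizer step.
    scale:  factor applied to the summed per-round mean gradients
            (assumed here to be an exact ratio, not an integer division).
    """
    per_round = num_gpus * batch_gpu  # samples processed in one round
    if batch_size % per_round != 0:
        raise ValueError("batch_size must be a multiple of num_gpus * batch_gpu")
    rounds = batch_size // per_round
    scale = Fraction(per_round, batch_size)  # equals 1 / rounds
    return rounds, scale


rounds, scale = accumulation_plan(batch_size=32, num_gpus=2, batch_gpu=4)
```

For these example values, `rounds` is 4 and `scale` is 1/4, whereas `(2 * 4) // 32` is 0.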
-
How does Cyrill actually calculate values from cone tracing?
Is it simple alpha compositing? (highly doubt)
Is it transmittance accumulation?
@nopjia should read some papers.
-