-
Hi!
I did some benchmarking recently and found that the memory demand is not _quite_ independent of the depth; see Table 3 on the last page of https://arxiv.org/abs/2005.05220
My suspicion is that o…
-
Some ideas for figures to add to the PPT
- [ ] Linear regression, single-layer neural network
- [ ] Multilayer Perceptron with hidden layer
- [ ] Backpropagation
- [ ] Batch Normalization and al…
-
May I ask how to calculate the Hessian of each layer in popular LLMs, such as LLaMA? Or could you suggest a popular repository for Hessian calculation?
Thank you very much for you…
-
Hi, I am trying the provided example (https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/examples/te_llama/tutorial_accelerate_hf_llama_with_te.html) with the Llama 2 model.
As it i…
-
# Title
NODEA - Neural Ordinary Differential Equations Anonymous
# Description
This session is supposed to be an introduction to neural ordinary differential equations (Neural ODEs). As on…
-
### Environment
- **qiskit-addon-obp version**: 0.1.0
- **Python version**: 3.12.1
- **Operating system**: Mac Sonoma 14.3.1 (23D60)
### What is happening and why is it wrong?
If I pass in a C…
-
Have you considered probabilistic programming for error propagation? Automatic differentiation (together with the delta method from statistics) is pretty nice, too -- and popular courtesy of backpropa…
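
To make the suggestion concrete, here is a minimal sketch of delta-method error propagation driven by automatic differentiation. It is only an illustration under stated assumptions: it uses a tiny hand-rolled forward-mode (dual number) differentiator rather than backpropagation, and the names `Dual`, `delta_method`, and `g` are hypothetical, not taken from any existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Dual number value + deriv * eps with eps**2 == 0 (forward-mode AD)."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def delta_method(f, mean, std):
    """First-order (delta method) estimate of the mean and std of f(X).

    Assumes X has the given mean and a small standard deviation, so that
    f(X) is approximately f(mean) + f'(mean) * (X - mean).
    """
    out = f(Dual(mean, 1.0))  # seed derivative dX/dX = 1
    return out.value, abs(out.deriv) * std


def g(x):
    # Hypothetical measurement function, chosen only for illustration.
    return x * x + 3 * x


approx_mean, approx_std = delta_method(g, mean=2.0, std=0.1)
```

Here `g(2.0)` is 10 and its derivative at 2.0 is 7, so the propagated standard deviation is roughly 0.7. The approximation is first-order only; for strongly nonlinear functions or large uncertainties, a probabilistic-programming or sampling approach would likely be more faithful.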
-
I assumed @hasktorch wraps the extensive PyTorch API in some very smart Haskell type-level programming.
However, @arkadiuszbicz hypothesises that HaskTorch is using not PyTorch, but a CI library To…
-
Nice to see there's already an implementation of this!
I just stumbled across [TensorFlow's "stop_gradient" function](https://www.tensorflow.org/api_docs/python/tf/stop_gradient). In the examples o…
-
In `Guided backpropagation`,
Error:
```
KeyError: "The name 'predictions_10/Softmax:0' refers to a Tensor which does not exist. The operation, 'predictions_10/Softmax', does not exist in the grap…