-
# Batch_input and elapsed time per iteration slow down during model training
![WeChat image edit_20240629150957](https://github.com/EleutherAI/gpt-neox/assets/140717408/dae875c7-c01f-47e0-8767-aa8fe53cd476)
…
-
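My personal impression is that full-parameter fine-tuning still gives better results than adapter methods such as LoRA, so why hasn't LOMO caught on? I have already used LOMO to fine-tune a 7B BLOOM model on two 24 GB GPUs, and the whole workflow felt quite smooth. Why can't I find many people using LOMO on any of the platforms I search? It seems strange.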
-
Hi,
Forgive me if this is elementary. I want to compute reaction rates between the open-ring and closed-ring forms of a molecule.
If we take a Z-matrix of the input and output
and pass it throug…
-
In layers.py
```python
def conv2d(inputs, num_outputs, kernel_size, stride,
layer_dict={}, activation_fn=None,
#weights_initializer=tf.random_normal_initializer(0, 0.001),
…
```
-
Hi, I'm trying this visualisation library and ran it on a simple MNIST network.
I'm comparing the activation maximisation used here with the one described here: https://blog.keras.io/how-convolutional-…
-
### Description
I'm experiencing a significant performance difference when using `jax.value_and_grad` with different `argnums` values. Specifically, when setting `argnums=0`, the computation is abo…
-
I am proposing the addition of a new method to our model class, designed to apply constraints to predictions to ensure that the values fall within specified bounds. This functionality would be useful …
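To make the proposal concrete, here is a minimal sketch of what such a method might look like. It assumes the constraint is simple clipping to a lower and upper bound, which the excerpt does not confirm. `BoundedModel`, `apply_constraints`, `lower`, and `upper` are hypothetical names used for illustration, not the project's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class BoundedModel:
    """Hypothetical stand-in for the model class (not the project's real API)."""

    lower: Optional[float] = None  # None means "no lower bound"
    upper: Optional[float] = None  # None means "no upper bound"

    def apply_constraints(self, predictions: Sequence[float]) -> List[float]:
        """Clip each prediction into [lower, upper] (assumed behaviour)."""
        if (
            self.lower is not None
            and self.upper is not None
            and self.lower > self.upper
        ):
            raise ValueError("lower bound must not exceed upper bound")

        clipped = []
        for value in predictions:
            if self.lower is not None:
                value = max(value, self.lower)  # raise values below the bound
            if self.upper is not None:
                value = min(value, self.upper)  # lower values above the bound
            clipped.append(value)
        return clipped


model = BoundedModel(lower=0.0, upper=1.0)
print(model.apply_constraints([-0.2, 0.5, 1.7]))  # [0.0, 0.5, 1.0]
```

Clipping is only one possible reading of "constraints"; a real implementation might instead rescale the outputs or reject out-of-range predictions.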
-
How exactly do you define "maximally activating" a given neuron?
-
Hi!
It looks like `compile()` ignores an _optimizer_ argument when compiling/training a custom model.
When I try this code:
`model %>% compile(optimizer = optimizer_rmsprop())` (766th row in the bo…
-
Hello, first, thank you for this amazing post.
I tried to modify the model to separate the evolution from the representation,
meaning that I have a function that evolves the state and at the end a fu…