-
Hi, thanks for this amazing project. I was trying to fine-tune the LoRA model for Llama3.2 Vision, which works fine and saved an adapter_0.pt. Then I wanted to use this adapter checkpoint for inference i…
-
### 1. System information
#### Converter
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04.2 LTS
- TensorFlow installation (pip package or built from source): pip package (p…
-
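When training knet+upernet on data that was augmented offline with mosaic, the loss became NaN and training crashed.
After looking into it, I found that gradients can change abruptly after passing through CNN + activation layers, and that this is what makes the loss value spike.
The fix concerns all activation functions…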
-
Hi again. I finally found some time to continue with your book. This time I ran into a problem in chapters 10 and 12, where you have the policy and the actor-critic agents (same problem for both). Aft…
-
**Describe the bug**
I think there might be something wrong with the current LISA implementation. There is no difference in training loss, no matter how many layers are active.
Not using LMFlow bu…
-
Does the Binarize() function use STE?
I haven't seen the STE algorithm in this whole project.
-
### Issue Description
I'm training a deep neural network predicting if two records refer to the same entity using LSTM layers. I can't get SHAP to work on the LSTM model, but it does provide values o…
-
Hi,
I am attempting to implement Conservative LRP rules as part of #184 since I need it for a university project. I am running into certain issues with the classifier heads and was hoping you could…
-
An NN with an erf output activation can occasionally output values well beyond the bounds [-1, 1]:
```
from jax import random
from neural_tangents import stax
import neural_tangents as nt
impo…
```
-
Hey John! Here's the curriculum that I've worked on in the past. It's a bit less focused on language models as a sole topic, and more on modern ML from a broad perspective.
- Essential Concepts of …