gradient-activation Search Results

1000+ results
for gradient-activation

Best match

shap/shap #1039

SHAP Values in a neural network with an Embedding layer

Hi, I'm trying to get the SHAP Values from the following neural network: ``` model_ser = Sequential() model_ser.add(Embedding(input_dim=vocabulary_size, output_dim=embedding_size, input_length=…

lrdsouza updated 12 months ago
5
tensorflow/tensorflow #51818

Low performance when using persistent mode GradientTape with…

**System information** - Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes - OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04 …

AlexFuster updated 6 days ago
5
rsachetto/MonoAlg3D_C #67

Incorrect Stimulus Loaded

Hi! Sorry for another question! When executing the batch script, all iterations other than the first one contain only the last stimulus: ``` Stimulus name: stim_plain [stim] configuration: […

1592800-ox updated 6 months ago
10
xijiu9/Train_Transformers_with_INT4 #4

the paper mentioned that all linear ops are quantized into i…

Nice work in this paper, I want to know that: the paper mentioned that all linear ops are quantized into int4, what about **mat-multiply ops in the attention module?** Is the activation gradient in …

brisker updated 1 year ago
1
seung-lab/znn-release #54

Redundant application of mask

Mask is being applied to both output activation and target label in cost functions. Applying mask to the resulting gradient (https://github.com/seung-lab/znn-release/blob/master/python/train.py#L102) …

torms3 updated 8 years ago
2
ollama/ollama #6771

Inconsistent Responses from Identical Models

### What is the issue? I am new to Ollama and have noticed that when I ask a query using Ollama, the model's responses are quite poor. However, if I ask the same query using https://www.llama2.ai/, I…

wahidur028 updated 1 month ago
1
pytorch/ao #844

AO dtype composability tracker

As we start onboarding more dtypes we ideally want them to work in as many different situations as possible so opening this tracker and will update the table as things change. If I should be adding mo…

msaroufim updated 1 month ago
1
deeplearning4j/deeplearning4j #7056

Same output for every input

#### Issue Description Hi there! I use a Neural Network for Deep Q Learning. After training it gives me the same outputs for every input. My Input is an array with a size of 72 in which are eit…

Thrasher091 updated 5 years ago
2
TideDra/VL-RLHF #10

Out of memory when fine-tuning qwen

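Hello, I am using the original code to fine-tune qwen on two A100 80G GPUs. GPU memory usage on both cards is only 919M, but (during data loading?) host memory usage keeps growing until it passes 180G and runs out, and the program terminates. How can this be solved? Training log: ![image](https://github.com/TideDra/VL-RLHF/assets/36758049/09277b55-ea0a-4cfd-875b-792f457441a2…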

delian11 updated 3 months ago
3
tensorflow/tensorflow #50796

ResourceVariable GC bug

Please make sure that this is a bug. As per our [GitHub Policy](https://github.com/tensorflow/tensorflow/blob/master/ISSUES.md), we only address code/doc bugs, performance issues, feature requests a…

fsx950223 updated 10 hours ago
4
