-
(and make sure scheduler is working?)
The point is to decrease the variability of the more ambiguous tasks. Right now the model is really only learning holders very well.
... Ok "very well". …
-
Now that DyNet has switched to variational dropout, a dropout mask needs to be stored so that the same mask can be applied across all time steps. Unfortunately this means that an LSTM cannot be used w…
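The mechanics can be sketched outside DyNet: variational (locked) dropout samples one mask per sequence and reuses it at every time step, instead of resampling per step. The helper below is a hypothetical NumPy illustration, not DyNet's API.

```python
import numpy as np

def variational_dropout_masks(rng, hidden_dim, p, timesteps):
    """Sample ONE dropout mask and reuse it at every time step,
    as in variational (locked) dropout. Hypothetical helper, not DyNet API."""
    keep = 1.0 - p
    # Single mask, sampled once per sequence, scaled for inverted dropout.
    mask = (rng.random(hidden_dim) < keep).astype(np.float64) / keep
    # The same mask object is applied at every step.
    return [mask for _ in range(timesteps)]

rng = np.random.default_rng(0)
masks = variational_dropout_masks(rng, hidden_dim=4, p=0.5, timesteps=3)
# Every step sees the identical mask, which is why the mask must be
# stored once and carried across the whole unrolled sequence.
assert all(np.array_equal(m, masks[0]) for m in masks)
```

Standard (non-variational) dropout would instead draw a fresh mask inside the time-step loop, which is why the two schemes need different plumbing in an LSTM implementation.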
-
**Describe the bug**
I cannot quantize MobileNetV3 from Keras 2 because the hard-swish activation function is implemented as a TFOpLambda.
**System information**
tensorflow version: 2.17
tf_ke…
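For context, hard-swish (the MobileNetV3 activation mentioned above) is x · relu6(x + 3) / 6. A minimal NumPy sketch of the math itself, not the tf.keras graph node that the quantizer rejects:

```python
import numpy as np

def hard_swish(x):
    """hard-swish(x) = x * relu6(x + 3) / 6, the MobileNetV3 activation.
    NumPy sketch only; in the Keras 2 model it appears as a TFOpLambda node."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

x = np.array([-4.0, -3.0, 0.0, 3.0, 6.0])
y = hard_swish(x)
# Saturates to 0 for x <= -3 and approaches the identity for large x.
```

Because the op is emitted as a TFOpLambda rather than a Keras layer, quantization tooling that matches on layer classes cannot annotate it; a common workaround is wrapping the function in a custom layer, but that is outside this sketch.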
-
I am using LoRA to fine-tune the text_config and vision_config in the config, which are llama and clip_vision_model. This is clearly not the same as my expected qwen and siglip vision model. The outputs…
-
Hello HENDRIX-ZT2,
And a bucketload of thanks for the opportunity to install your apps through Python, even if it has been a learning curve; I didn't know until a couple of days ago that you could downl…
-
I got errors for all three evaluations. I solved some small ones myself, such as undefined variables and indentation problems, but I am still stuck on the other errors.
####################…
-
Hello! I have a strange issue with SDXL LoRA training.
I've tried both the newest version of Kohya_ss 23.0.15 (clean install without pip cache) and the oldest version 22.6.2 (clean install, too). T…
-
For each configuration of the network, train the model 10 times with different parameters (10 iterations of the loop in rnnAproach) for 20 epochs with an early-stopping patience of 2.
Each configuration shou…
kren1 updated 9 years ago
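The protocol described above (repeated restarts, a fixed epoch budget, early stopping with patience 2) can be sketched as follows. The helper and the synthetic loss curves are hypothetical stand-ins for the actual rnnAproach training loop:

```python
def train_with_patience(losses, max_epochs=20, patience=2):
    """Stop when the validation loss fails to improve for `patience`
    consecutive epochs. Each entry of `losses` stands in for one epoch of
    training plus validation; hypothetical helper, not the rnnAproach code."""
    best, waited, epochs_run = float("inf"), 0, 0
    for epoch in range(max_epochs):
        loss = losses[epoch]
        epochs_run += 1
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best, epochs_run

# 10 restarts, each with its own initialization (here: its own loss curve).
curves = [[1.0, 0.8, 0.9, 0.95] + [1.0] * 16 for _ in range(10)]
results = [train_with_patience(c) for c in curves]
# Each run stops after epoch 4: best loss 0.8, then two non-improving epochs.
```

The best model per configuration would then be selected from the 10 restarts by its stored best validation loss.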
-
When I use Zero3, in initializing the network, if my Llama is rewritten by inheritance as follows:
```
class FlashLlamaModel(LlamaModel):
    def __init__(self, config: LlamaConfig):
        supe…
-
## Summary
When to_consistent is used to implement pipeline parallelism, different parameters of the same op are placed on different GPUs, so the model cannot run.
![image](https://user-images.githubusercontent.com/38416786/141926421-f91d3221-2faa-46e3-92d0-4b3cfa3995bd.png)
## Code to reproduce bu…