-
(and make sure scheduler is working?)
The point is to decrease the variability of the more ambiguous tasks. Right now the model is really only learning holders very well.
... Ok "very well". …
-
Now that DyNet has switched to variational dropout, a dropout mask needs to be stored so that the same mask can be applied across all time steps. Unfortunately this means that an LSTM cannot be used w…
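The mechanics can be sketched outside DyNet: variational (locked) dropout samples one mask per sequence and reuses it at every time step, instead of resampling per step. The helper below is a hypothetical NumPy illustration, not DyNet's API.

```python
import numpy as np

def variational_dropout_masks(rng, hidden_dim, p, timesteps):
    """Sample ONE dropout mask and reuse it at every time step,
    as in variational (locked) dropout. Hypothetical helper, not DyNet API."""
    keep = 1.0 - p
    # Single mask, sampled once per sequence, scaled for inverted dropout.
    mask = (rng.random(hidden_dim) < keep).astype(np.float64) / keep
    # The same mask object is applied at every step.
    return [mask for _ in range(timesteps)]

rng = np.random.default_rng(0)
masks = variational_dropout_masks(rng, hidden_dim=4, p=0.5, timesteps=3)
# Every step sees the identical mask, which is why the mask must be
# stored once and carried across the whole unrolled sequence.
assert all(np.array_equal(m, masks[0]) for m in masks)
```

Standard (non-variational) dropout would instead draw a fresh mask inside the time-step loop, which is why the two schemes need different plumbing in an LSTM implementation.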
-
**Describe the bug**
I cannot quantize MobileNetV3 from Keras 2 because the hard-swish activation function is implemented as a TFOpLambda.
**System information**
tensorflow version: 2.17
tf_ke…
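For context, hard-swish (the MobileNetV3 activation mentioned above) is x · relu6(x + 3) / 6. A minimal NumPy sketch of the math itself, not the tf.keras graph node that the quantizer rejects:

```python
import numpy as np

def hard_swish(x):
    """hard-swish(x) = x * relu6(x + 3) / 6, the MobileNetV3 activation.
    NumPy sketch only; in the Keras 2 model it appears as a TFOpLambda node."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

x = np.array([-4.0, -3.0, 0.0, 3.0, 6.0])
y = hard_swish(x)
# Saturates to 0 for x <= -3 and approaches the identity for large x.
```

Because the op is emitted as a TFOpLambda rather than a Keras layer, quantization tooling that matches on layer classes cannot annotate it; a common workaround is wrapping the function in a custom layer, but that is outside this sketch.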
-
I am using LoRA to fine-tune the text_config and vision_config in the config, which are llama and clip_vision_model. This is clearly not the same as my expected qwen and siglip vision model. The outputs…
-
Hello HENDRIX-ZT2,
And a bucketload of thanks for the opportunity to install your apps through Python, even if it has been a learning curve; I didn't know until a couple of days ago that you could downl…
-
I got errors for all three evaluations. I solved some small ones myself, such as undefined variables and indentation problems, but I am still stuck on the other errors.
####################…
-
Hello! I have a strange issue with SDXL LoRA training.
I've tried both the newest version of Kohya_ss 23.0.15 (clean install without pip cache) and the oldest version 22.6.2 (clean install, too). T…
-
For each configuration of the network, train the model 10 times with different parameters (10 iterations of the loop in rnnAproach) for 20 epochs with an early-stopping patience of 2.
Each configuration shou…
kren1 updated 9 years ago
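The protocol described above (repeated restarts, a fixed epoch budget, early stopping with patience 2) can be sketched as follows. The helper and the synthetic loss curves are hypothetical stand-ins for the actual rnnAproach training loop:

```python
def train_with_patience(losses, max_epochs=20, patience=2):
    """Stop when the validation loss fails to improve for `patience`
    consecutive epochs. Each entry of `losses` stands in for one epoch of
    training plus validation; hypothetical helper, not the rnnAproach code."""
    best, waited, epochs_run = float("inf"), 0, 0
    for epoch in range(max_epochs):
        loss = losses[epoch]
        epochs_run += 1
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best, epochs_run

# 10 restarts, each with its own initialization (here: its own loss curve).
curves = [[1.0, 0.8, 0.9, 0.95] + [1.0] * 16 for _ in range(10)]
results = [train_with_patience(c) for c in curves]
# Each run stops after epoch 4: best loss 0.8, then two non-improving epochs.
```

The best model per configuration would then be selected from the 10 restarts by its stored best validation loss.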
-
When I use Zero3, in initializing the network, if my Llama is rewritten by inheritance as follows:
```
class FlashLlamaModel(LlamaModel):
    def __init__(self, config: LlamaConfig):
        supe…
-
## Summary
When to_consistent is used to implement pipeline parallelism, different parameters of the same op are placed on different GPUs, so the model cannot run.
![image](https://user-images.githubusercontent.com/38416786/141926421-f91d3221-2faa-46e3-92d0-4b3cfa3995bd.png)
## Code to reproduce bu…