-
Hi @gstoica27,
How are you?
Can you please provide a graph example for stacked linear layers (i.e., an MLP) with activation functions?
Graph sketch: Linear->ReLU->Linear->ReLU->Linear->ReLU (3 linear l…
-
### 🐛 Bug
Today when attempting to upload a LoRA-trained Llama 3.1 70B model (first time I've trained Llama 3.1), I hit the following during the eLoRA merge. Note I used the `cpu_shard` method to u…
-
I found that the scripts in GEMMA do not support GEMMA2. Is there any plan to add support for GEMMA2?
-
Thank you for your incredible work.
Now I'm trying to run the training code on an RTX 3090 (24 GB).
But even with batch size 1, I hit an out-of-memory exception, which is really weird...
The only diff…
-
Hi,
thanks for your nice repo!
Have you tested whether the CNNs are able to find winning tickets on CIFAR-10 and CIFAR-100?
I ran multiple experiments with most of the convolutional architecture…
-
`
2022-10-12 15:43:57.254005: W tensorflow/core/grappler/optimizers/data/slack.cc:103] Could not find a final `prefetch` in the input pipeline to which to introduce slack.
I1012 15:43:57.996680 1404…
-
Hi,
I am running TC-Beta VAE on my data, and I changed my architecture to an MLP encoder and decoder. However, I am getting NaN in the loss function, and it seems I am getting NaNs for log_importance_w…
-
I am 100% sure that I set up the conda env correctly on my system.
I downloaded all the models and gave their paths as described.
However, I cannot run test.py or the Gradio demo.
log, code:
`(SUPIR) milan@milan-…
-
How can I use Causal for training on a custom dataset? Is it possible to use it for video semantic segmentation?
-
I first added BLOOM_INFO to architecture.py:
```python
BLOOM_INFO = StaticTensorNames(
    name="BloomForCausalLM",
    pre_weight_names=["word_embeddings.weight"],
    post_weight_names=["ln_f.we…