-
Hi, thanks for your great work. I have some questions about your code:
1. Which trained model from the original STEGO did you fine-tune for your dataset?
2. Have you used a semi-supervised approach for…
-
@lucidrains
This is an issue I've been having for a while: the cross-attention is very weak at the start of the sequence.
When the transformer starts with no tokens, it will rely on the cross-attention, but un…
-
I tried to run your code to reproduce Table 7, but I got this result.
`run_all.sh`:
```bash
#!/bin/bash
arch=$1
mem_size=$2
dataset=$3
ntask=$4
offline_ep=$5
if [ -z "$arch" ] || [ -z "$mem…
-
I tried adding the output correctness check when converting models, but it hit a runtime error: `mat1 and mat2 must have the same dtype`.
How can I fix this issue?
```bash
python -m python_coreml_stab…
-
Thanks for your contribution to the community; I am really interested in your work. However, I ran into some problems when I tried to reproduce the results in the paper. I hope you can give me some advice.
I …
-
### System Info
- `transformers` version: 4.34.0
- Platform: Linux-5.15.0-86-generic-x86_64-with-glibc2.31
- Python version: 3.11.6
- Huggingface_hub version: 0.17.3
- Safetensors version: 0.4.0
…
-
![144db84703e8ae74b1c77ef09d0deb8](https://github.com/bmaltais/kohya_ss/assets/118010803/057e97f1-d90d-4bf1-afe7-2d02c1836913)
I enabled block lr, even though I enabled it the last time I started the training scrip…
-
![image](https://user-images.githubusercontent.com/30712916/232783869-e86a4666-2da6-485a-b002-584c973e8e83.png)
In this code, you use ResNet34. So are the results in the paper from ResNet34 or ResNet32?
Can yo…
-
Hi, I'm attempting to train an hourglass model and am getting the error below whenever I use more than one stack. Is there a workaround?
Thanks for your help!
Sam
### Training output
INFO:sl…