-
Benchmarking without BEAM is less useful because it might overfit to hand_coded_optimizations. We currently have llama, gpt2 and one matmul that uses BEAM. In the long run we will have default search, but…
-
Due to network issues, I downloaded the gpt2 model locally and changed line 36 of GPT4TS.py in the original code from
`self.gpt2 = GPT2Model.from_pretrained('gpt2', output_attentions=True, output_hidden_states=True)`
to
`self.gpt2 = AutoModelForCausalLM.from_pretrained("/…
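For reference, a minimal sketch of loading a locally downloaded checkpoint without touching the network. It keeps `GPT2Model`, as in the original line, because `AutoModelForCausalLM` resolves a GPT-2 checkpoint to `GPT2LMHeadModel`, which returns logits rather than the bare hidden states. The helper names `missing_model_files` and `load_local_gpt2` are hypothetical, and the list of expected files is an assumption about a typical PyTorch checkpoint directory, not something taken from the issue.

```python
from pathlib import Path


def missing_model_files(model_dir):
    """Return the expected checkpoint files that are absent from model_dir.

    Assumption: a usable directory holds config.json plus one weights file
    named model.safetensors or pytorch_model.bin.
    """
    model_dir = Path(model_dir)
    missing = []
    if not (model_dir / "config.json").is_file():
        missing.append("config.json")
    weight_names = ("model.safetensors", "pytorch_model.bin")
    if not any((model_dir / name).is_file() for name in weight_names):
        missing.append(" or ".join(weight_names))
    return missing


def load_local_gpt2(model_dir):
    """Load GPT2Model from a local directory, failing early if files are missing."""
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    # Imported lazily so the file check above works without transformers installed.
    from transformers import GPT2Model

    return GPT2Model.from_pretrained(
        str(model_dir),
        output_attentions=True,
        output_hidden_states=True,
        local_files_only=True,  # never fall back to downloading from the Hub
    )
```

Calling `load_local_gpt2` with the directory that holds the downloaded files should then behave like the original `from_pretrained('gpt2', ...)` call, minus the download.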
-
sh scripts/predict_finerune_gpt2.sh
2024-02-27 15:20:22.435851: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-p…
-
Hi,
Thanks for your amazing work!
I have a question about the type of GPT2. You have mentioned that you use gpt2 large as your language model (in Section A.1), but I found your code actually load…
-
The main issue with this error is related to network connectivity. Specifically, the problem occurred when the program attempted to download the required file (`/gpt2/resolve/main/config.json`) from t…
-
### What happened?
I successfully imported and compiled GPT2 TF with IREE but when running it through the Python bindings, I get a segmentation fault:
```
collected 4 items / 3 deselected / 1 s…
-
Explanation:
Fails while running `train_gpt2.py` after successfully downloading pretrained weights, with error:
Error log
```Bash
python3 train_gpt2.py
using device: mps
loading weights from …
-
To validate the high-level IR execution framework, code for models other than gpt2, such as gpt2-medium, gpt2-large, and gpt2-xl, is also needed.
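As a starting point, the four released GPT-2 sizes differ only in depth, head count, and embedding width, so one table of hyperparameters covers them. The sketch below lists those values and derives the parameter count for each, assuming the standard GPT-2 layout (tied input/output embeddings, 4x MLP expansion, biases on every linear layer). The names `GPT2Config`, `GPT2_CONFIGS`, and `param_count` are illustrative and not taken from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPT2Config:
    n_layer: int
    n_head: int
    n_embd: int
    vocab_size: int = 50257
    block_size: int = 1024


# Hyperparameters of the released GPT-2 checkpoints.
GPT2_CONFIGS = {
    "gpt2": GPT2Config(n_layer=12, n_head=12, n_embd=768),
    "gpt2-medium": GPT2Config(n_layer=24, n_head=16, n_embd=1024),
    "gpt2-large": GPT2Config(n_layer=36, n_head=20, n_embd=1280),
    "gpt2-xl": GPT2Config(n_layer=48, n_head=25, n_embd=1600),
}


def param_count(cfg):
    """Count parameters, with the output head tied to the token embedding."""
    c = cfg.n_embd
    # Token and position embedding tables.
    embeddings = cfg.vocab_size * c + cfg.block_size * c
    # Per block: two LayerNorms (4c), QKV (3c^2 + 3c), attention projection
    # (c^2 + c), MLP up (4c^2 + 4c), MLP down (4c^2 + c).
    per_block = 12 * c * c + 13 * c
    final_layernorm = 2 * c
    return embeddings + cfg.n_layer * per_block + final_layernorm


if __name__ == "__main__":
    for name, cfg in GPT2_CONFIGS.items():
        print(f"{name}: {param_count(cfg):,} parameters")
```

Parameterizing model construction on such a table means the larger variants reuse the same code path as gpt2, with only the shapes changing.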
-
Just doing a bit of debugging.
"val loss" outputs nan, so I figured I'd start there...
```
val loss 1 nan
val loss 2 nan
```
![2024-09-23 at 20 40 10 png](https://github.com/user-attachments/a…
-
Hi, I am trying out this great framework with a self-trained GPT-2.
I wanted to use a custom-trained model and the base model as the tokenizer.
No matter if I use this approach or solely the base mo…