-
Can we train gpt2-xl with nanoGPT? If so, where are its datasets?
-
Hi,
Thank you for releasing the Arena. Which model is `gpt2-chatbot`?
Thanks!
-
Thank you for this excellent implementation. I'd like to suggest an optimization that could significantly speed up inference and enable streaming output.
Currently, there are two GPT2 graphs:
1.…
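The truncated list presumably contrasts a prompt-processing graph with a per-token decoding graph; the usual streaming optimization is to reuse a cache so each step feeds only the newest token. A toy sketch of that control flow (all names are illustrative, and the "cache" here is just token history, not real per-layer key/value tensors):

```python
# Toy prefill/decode split for streaming generation.

def prefill(prompt_tokens):
    # Run the full prompt once and build the cache.
    return list(prompt_tokens)

def decode_step(prev_token, cache):
    # One incremental step: only the newest token is processed;
    # the cache stands in for everything already computed.
    next_token = (prev_token + 1) % 10  # dummy "model": successor mod 10
    cache.append(next_token)
    return next_token, cache

def stream_generate(prompt_tokens, n_steps):
    cache = prefill(prompt_tokens)
    token = prompt_tokens[-1]
    streamed = []
    for _ in range(n_steps):
        token, cache = decode_step(token, cache)
        streamed.append(token)  # each token can be emitted immediately
    return streamed, cache

tokens, cache = stream_generate([1, 2, 3], 4)
print(tokens)  # [4, 5, 6, 7]
```

The point of the split is that decode steps do constant work per token, so output can be streamed as it is produced instead of after the whole sequence finishes.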
-
I train the model using 2 nodes, and copy machine1's model files to machine2's directory.
Then I run:
python deepspeed_to_megatron.py --input_folder $checkpoint --output_folder output --tar…
-
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

@torch.compile(backend="turbine_cpu")
def test_gpt2_demo():
    tokenizer = AutoTokenizer.from_pretrained("gp…
```
-
1. There is no MoE inference example in Examples. The https://www.deepspeed.ai/tutorials/mixture-of-experts-inference/ tutorial does link to generate_text.sh, but that is a normal GPT2 mod…
-
NameError: name 'ChatData' is not defined
Code below:
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# from ChatData import ChatData
from torch.optim import Adam
from torch.utils.da…
```
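The NameError occurs because the `from ChatData import ChatData` line is commented out, so the class is never defined. A minimal duck-typed stand-in (its interface is guessed, not the original class; a real `torch.utils.data.Dataset` only needs `__len__` and `__getitem__`) might look like:

```python
class ChatData:
    """Placeholder for the missing ChatData class (interface guessed).
    Any object with __len__ and __getitem__ works with a DataLoader."""

    def __init__(self, texts, tokenizer, max_length=64):
        # tokenizer is any callable returning input_ids / attention_mask
        self.examples = [tokenizer(t, max_length=max_length) for t in texts]

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        enc = self.examples[idx]
        return enc["input_ids"], enc["attention_mask"]

# Demo with a toy tokenizer standing in for GPT2Tokenizer:
def toy_tokenizer(text, max_length):
    ids = [ord(c) % 256 for c in text][:max_length]
    pad = max_length - len(ids)
    return {"input_ids": ids + [0] * pad,
            "attention_mask": [1] * len(ids) + [0] * pad}

data = ChatData(["hi", "hello"], toy_tokenizer, max_length=4)
```

In the original script, the fix is simply to restore the commented-out import (and ensure ChatData.py is on the path).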
-
I'm building a CI to test some models on certain types of devices. I want to get benchmark statistics such as which model cases failed, and which tests were skipped and why. These statistics will be used to gen…
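One hedged approach, assuming pytest is the test runner (the output schema here is invented for illustration): collect each test's outcome and skip/failure detail via pytest's standard hooks in `conftest.py`, then dump them for the report-generation step.

```python
# conftest.py sketch: gather pass/fail/skip statistics for a CI report.
import json

collected = []

def pytest_runtest_logreport(report):
    # The "call" phase carries pass/fail; skips surface in the "setup" phase.
    if report.when == "call" or (report.when == "setup" and report.skipped):
        collected.append({
            "test": report.nodeid,
            "outcome": report.outcome,  # "passed" / "failed" / "skipped"
            "detail": getattr(report, "longreprtext", "")[:200],
        })

def pytest_sessionfinish(session, exitstatus):
    # Persist the stats so a later CI job can turn them into a report.
    with open("benchmark_stats.json", "w") as f:
        json.dump(collected, f, indent=2)
```

Since the hooks are plain functions, they are easy to exercise with a fake report object before wiring them into CI.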
-
In @chenmoneygithub's SciPy talk, KerasNLP uses a custom preprocessor:
```
custom_preprocessor = keras_nlp.models.GPT2CausalLMPreprocessor.from_preset(
    "gpt2_base_en",
    sequence_length=8…
```
-
We have added a `generate()` method to `GPT2CausalLM`, and we need a way to benchmark this API, since performance is key for text generation.
More details will be added soon.
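A generic timing harness along these lines could be a starting point (nothing below is KerasNLP API; `generate_fn` stands in for whatever callable wraps `GPT2CausalLM.generate()`):

```python
import time
import statistics

def benchmark_generate(generate_fn, prompt, n_warmup=2, n_runs=5):
    """Time a text-generation callable over several runs."""
    for _ in range(n_warmup):
        generate_fn(prompt)  # warm-up runs (compilation, caches) excluded
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        generate_fn(prompt)
        timings.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(timings),
        "median_s": statistics.median(timings),
        "min_s": min(timings),
    }

# Usage with a dummy generator standing in for the real model:
stats = benchmark_generate(lambda p: p + " world", "hello")
```

For generation specifically, it would also make sense to normalize by tokens produced (tokens/second) rather than report wall time alone, since output length dominates cost.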