-
Hi, thanks again for the nice paper!
The best reference model is GPT-J-6B, and from the code I see you set it to float16 precision (both `revision` and `torch_dtype`). If I initialize the model with fu…
-
command:
```
cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1 \
--model=gptj-99 \
--implementation=reference \
--framework=pytorch \
--category=edge \
--sc…
-
### System Info
- CPU architecture: x86_64
- Host memory: 256GB
- GPU
+ Name: NVIDIA A30
+ Memory: 24GB
- Libraries
+ TensorRT-LLM: v0.11.0
+ TensorRT: 10.1.0
+ CUDA: 12.6
+ NVID…
-
Thank you for the great work and release! Will this work with GPT-J (modulo minor edits to the code)?
-
Can I train GPT-J from scratch?
-
Hi,
GPT/GPT-J/GPT-NeoX have similar NN architectures. In my view, their implementations in `src/fastertransformer/models` (`multi_gpu_gpt`, `gptj`, `gptneox`) are also very similar. I am wonde…
-
How can we use GPT-J for inference?
-
Any plans to add support for togethercomputer/GPT-JT (https://huggingface.co/spaces/togethercomputer/GPT-JT)?
It seems like the closest alternative to GPT-3. What do you think? I would love to help…
-
Someone fine-tuned GPT-J on the Alpaca instruction dataset using PEFT:
```python
peft_model_id = "crumb/Instruct-GPT-J"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM…