-
Need to step up to larger models with a permissive license. The 30B LLaMA works, but its license doesn't allow this use. 6B is too small and gives bad results. So the next best choice is gpt-neox-20b.
This works:
`CUDA_VISIBLE_DEVICES…
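For context, the reason multi-GPU flags like `CUDA_VISIBLE_DEVICES` come up at this scale is simple memory arithmetic. A minimal sketch of the fp16 weights-only estimate for a 20B-parameter model (the 24 GiB per-GPU capacity and 90% usable fraction are illustrative assumptions, not measurements):

```python
# Rough fp16 memory estimate for gpt-neox-20b inference.
# Weights only; activations and the KV cache add more on top.
n_params = 20_000_000_000          # ~20B parameters
bytes_per_param = 2                # fp16 / bf16
weights_gb = n_params * bytes_per_param / 1024**3

gpu_mem_gb = 24                    # hypothetical per-GPU capacity (assumption)
usable_frac = 0.9                  # leave headroom for activations (assumption)
min_gpus = int(-(-weights_gb // (gpu_mem_gb * usable_frac)))  # ceiling division

print(f"weights ≈ {weights_gb:.1f} GiB, need ≥ {min_gpus} × {gpu_mem_gb} GiB GPUs")
```

Under these assumptions the weights alone are ≈37 GiB, so a single 24 GiB card cannot hold the model and the process must shard across at least two devices.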
-
**Describe the bug**
It generates an error when running the generate program.
**To Reproduce**
Steps to reproduce the behavior:
1. run "./deepy.py generate.py ./configs/20B.yml -i prompt.txt -o sam…
-
I'm running this sample code ([transformers-neuronx Hugging Face generate API support](https://github.com/aws-neuron/transformers-neuronx#hugging-face-generate-api-support)) using GPT-NeoX on an inf2.24xlarge instance, but the `model.generate` method kills the…
-
Hello! Thanks for your great work, but I ran into some problems when trying to replicate the results.
Specifically, I cannot find `convert_raw_llama_weights_to_hf.py` as described in [README.md](https://gi…
-
Hi, I am able to reproduce building and running the model locally via TensorRT-LLM.
I built it using:
```
python3 build.py --model_dir /finetune-gpt-neox/models--meta-llama--Llama-2-7b-hf/snapsho…
-
Before loading omnitrace:
```
(gpt-neox-rocm5.6.0) langx@frontier07915:/lustre/orion/csc549/scratch/langx> python
Python 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0] on linux
Type "help", "c…
-
https://huggingface.co/rinna/japanese-gpt-neox-3.6b-instruction-sft
-
```bash
# dataset/download_books.sh
wget https://the-eye.eu/public/AI/pile_neox/data/BookCorpusDataset_text_document.bin
wget https://the-eye.eu/public/AI/pile_neox/data/BookCorpusDataset_text_docu…
-
### Feature request
I encountered a KeyError while loading the phi3-v vision model with Hugging Face Optimum. The error message states:
```
KeyError: 'phi3-v model type is not supported yet in Nor…
-
Hi,
Thank you very much for open-sourcing this work. Will the code for FIM dataset construction and training be made public? For example, the number of lines or length of code used for the prefix, suffix, and…
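The FIM (fill-in-the-middle) transform this question is asking about is usually a simple document rewrite. A minimal sketch in PSM (prefix-suffix-middle) order, with hypothetical sentinel tokens and random character-level cut points — the actual sentinels and split lengths used by this repo are exactly what the question is asking for, so everything below is an assumption:

```python
import random

# Hypothetical sentinel strings; real tokenizers use dedicated special
# tokens for these roles instead of plain text markers.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def fim_transform(doc: str, rng: random.Random) -> str:
    """Rewrite a document into PSM order: the model is conditioned on the
    prefix and suffix, then trained to generate the middle span."""
    # Pick two distinct random cut points, splitting the document into
    # prefix / middle / suffix.
    lo, hi = sorted(rng.sample(range(len(doc) + 1), 2))
    prefix, middle, suffix = doc[:lo], doc[lo:hi], doc[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"

rng = random.Random(0)
print(fim_transform("def add(a, b):\n    return a + b\n", rng))
```

Because the rewrite only reorders the three spans and inserts sentinels, the original document is always recoverable as prefix + middle + suffix.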