-
### Feature request
Add a MistralForQuestionAnswering class to the [modeling_mistral.py](https://github.com/huggingface/transformers/blob/main/src/transformers/models/mistral/modeling_mistral.py) so …
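Such a class would presumably mirror the existing `LlamaForQuestionAnswering`: a linear head mapping hidden states to per-token start/end logits, then best-span selection at inference. A minimal pure-Python sketch of that span-selection step (names and the `max_len` cap are illustrative, not the actual transformers API):

```python
# Sketch of the extractive-QA decoding such a head would perform: given
# per-token start/end logits, pick the highest-scoring valid span with
# start <= end. The real class would subclass MistralPreTrainedModel and
# produce these logits with a qa_outputs linear layer.

def best_span(start_logits, end_logits, max_len=30):
    """Return (start, end) maximizing start_logits[s] + end_logits[e], s <= e."""
    best, best_score = (0, 0), float("-inf")
    for s, s_logit in enumerate(start_logits):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score, best = score, (s, e)
    return best

start = [0.1, 2.0, 0.3, 0.2]
end   = [0.0, 0.5, 3.0, 0.1]
print(best_span(start, end))  # (1, 2)
```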
-
Add the option to load models in bfloat16 and float16. This is especially important for large models like GPT-J and GPT-NeoX.
Ideally, load from HuggingFace in this low precision, do weight processing on the CPU,…
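In transformers this is typically requested via the `torch_dtype` argument to `from_pretrained` (e.g. `torch_dtype=torch.bfloat16`). For context, here is a small pure-Python sketch of what bfloat16 actually is: float32 with the low 16 mantissa bits dropped, which is why it keeps float32's exponent range and avoids the overflows float16 hits on large values. (The round-to-zero truncation here is a simplification; real conversions usually round-to-nearest-even.)

```python
import struct

def to_bfloat16(x: float) -> float:
    """Truncate a float32 value to bfloat16 precision (keep top 16 bits)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

# 1e38 is representable in bfloat16 but would overflow float16 (max ~65504).
print(to_bfloat16(1e38))
# Only ~3 significant decimal digits survive the 8-bit mantissa:
print(to_bfloat16(3.1415926))  # 3.140625
```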
-
**Describe the bug**
When I run `python inference/bot.py --model togethercomputer/Pythia-Chat-Base-7B --retrieval`,
it reports a RuntimeError: The size of tensor a (2048) must match the size of tensor b…
-
When I use this model to ask some questions, after a moment it runs out of memory.
Traceback (most recent call last):
File "/export/openChatKit/openChatKit/inference/bot.py", l…
-
@eric-weiss-zyphra discovered that upstream Megatron-LM is still on the old dataloader scheme (as opposed to gpt-neox), leading to overflow errors like:
```
File "torch/utils/data/_utils/collate.p…
-
I have configured this setup on a 4-GPU machine using the Docker image. I receive the context prompt, but when I feed it an example, it falls apart. Can you please help me to…
-
### Your current environment
```text
Versions of relevant libraries:
[pip3] flashinfer==0.0.9+cu121torch2.3
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] sentence-transformers==3.0…
-
### System Info
```shell
optimum: 1.8.6
transformers: 4.29.2
ubuntu/CUDA 11.7/pytorch 2.1
```
### Who can help?
@fxmarty @younesbelkada
### Information
- [ ] The official example scripts
- [X]…
-
Or do we also need to integrate aspects such as Megatron-LM, or migrate fully to GPT-NeoX?
-
Hi, I am able to run the 2.8B version. Here are some sample inputs/outputs I am able to get:
```
Input: the quick brown fox
jumps over the lazy dog"I'm not sure if this is the best way to do thi…