-
Stating why one should choose this framework over others (GPT-NeoX with DeepSpeed/Megatron, Accelerate+HF, etc.) would make the decision easier for new users. (The timing vs ease…
-
I have implemented an adapter for GPTNeoX following the instructions in the documentation. It passed all tests, but during training of the language adapter the prediction head was trained as well. Do you …
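A minimal sketch of one way to keep the head frozen, using plain PyTorch rather than any documented library switch; the "adapter" name filter and the `EleutherAI/gpt-neox-20b` checkpoint are assumptions for illustration, while `embed_out` is GPT-NeoX's output head in Transformers:
```python
# Hedged sketch, not the library's prescribed fix: freeze everything except
# parameters whose names look like adapter weights, then make sure the
# prediction head (embed_out in GPT-NeoX) stays frozen as well.
from transformers import GPTNeoXForCausalLM

model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

for name, param in model.named_parameters():
    # Assumption: adapter parameters contain "adapter" in their names.
    param.requires_grad = "adapter" in name.lower()

# embed_out is the output (prediction) head of GPTNeoXForCausalLM.
for param in model.embed_out.parameters():
    param.requires_grad = False
```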
-
Hello!
I am using GPTNeoX for decoding, and I need to pass some parameters during forward propagation, such as the length penalty.
I also use Transformers to call model.generate() for gener…
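For reference, decoding-time parameters such as the length penalty can be passed directly to `generate()`; a short sketch (the model name is chosen for illustration), noting that `length_penalty` only takes effect with beam search:
```python
from transformers import AutoTokenizer, GPTNeoXForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")

inputs = tokenizer("The quick brown fox", return_tensors="pt")
# length_penalty is applied by the beam scorer, so num_beams must be > 1.
outputs = model.generate(
    **inputs,
    max_new_tokens=32,
    num_beams=4,
    length_penalty=1.2,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```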
-
Pythia is a suite of 16 LLMs, all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. The suite was developed with the intention of facilitating research in ma…
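For context, the checkpoints load through the standard Transformers API; a minimal sketch assuming the `EleutherAI/pythia-70m` hub name for the smallest model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Other sizes follow the same naming scheme on the Hub
# (e.g. EleutherAI/pythia-160m, ..., EleutherAI/pythia-12b).
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")
```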
-
### Branch/Tag/Commit
main
### Docker Image Version
/docker/Dockerfile.torch
### GPU name
A100
### CUDA Driver
470.129.06
### Reproduced Steps
```shell
./bin/gptneox_exampl…
-
Following up on the discussion from #1110
`generate()` with `GPTNeoXForCausalLM` and pythia-70m checkpoints is not working
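A minimal reproduction sketch for this report, assuming the `EleutherAI/pythia-70m` checkpoint and greedy decoding:
```python
from transformers import AutoTokenizer, GPTNeoXForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-70m")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
# Greedy decoding; the report is that this call fails with pythia-70m weights.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```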
-
### Model description
Hugging Face has the GPT-NeoX model by EleutherAI. It's a 20-billion-parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available…
-
In the example for adding to gptneox_mem_req, I see that n_layers comes from num_hidden_layers in the config.json file, but where do the 512, 512, and 1024 values come from? Maybe a comment in the docu…
-
During recent experiments, if `n_threads` is left as `None` (the default value), initialization of `Llama` and `Gptneox` may hang on some platforms.
Currently, it's recommended to run:
…
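A hedged illustration of passing an explicit thread count instead of `None`; the constructor calls below are hypothetical and depend on the bindings in use:
```python
import os

# Derive an explicit thread count instead of relying on None/auto-detection.
n_threads = max(1, (os.cpu_count() or 1) - 1)

# Hypothetical constructor calls; the real argument names come from the
# bindings being used (Llama / Gptneox in this project).
# model = Llama(model_path="path/to/model.bin", n_threads=n_threads)
# model = Gptneox(model_path="path/to/model.bin", n_threads=n_threads)
```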
-
Thanks for the excellent work. Following the comment in #59, I am trying to train `dmoe_760m` using 16 GPUs (2 nodes) by changing the distributed arguments for a two-node setup, but it is very slow in t…