hltcoe / sandle

Run a large language modeling SANDbox in your Local Environment

7 stars, 1 fork

issues

WebSockets

#107 ccmaymay opened 1 year ago
0
NeMo

#106 ccmaymay closed 1 year ago
1
TorchInductor

#105 ccmaymay closed 1 year ago
1
HF inference endpoints

#104 ccmaymay closed 1 year ago
1
langchain

#103 ccmaymay opened 1 year ago
0
vLLM

#102 ccmaymay opened 1 year ago
0
Bright cluster

#101 ccmaymay closed 1 year ago
1
Run.ai

#100 ccmaymay closed 1 year ago
1
Vicuna

#99 ccmaymay opened 1 year ago
0
fast transformers

#98 ccmaymay closed 1 year ago
1
Organize & share software stack notes

#97 ccmaymay opened 1 year ago
0
FastChat

#96 ccmaymay opened 1 year ago
0
Petals chat UI

#95 ccmaymay opened 1 year ago
0
HF chat UI

#94 ccmaymay opened 1 year ago
0
Comparison to Petals

#93 danyaljj opened 1 year ago
0
Adding, removing backends/backend nodes at runtime

#92 ccmaymay opened 1 year ago
0
Prompt tuning endpoint

#91 ccmaymay opened 1 year ago
0
Embeddings endpoint

#90 ccmaymay opened 1 year ago
0
XGLM

#89 ccmaymay opened 1 year ago
0
DeepSpeed

#88 ccmaymay opened 1 year ago
3
LLaMA backend timeout

#87 ccmaymay closed 1 year ago
0
LLaMA backend OOM after idling for a while, then trying to allocate >1 EB

#86 ccmaymay closed 1 year ago
2
Out of memory in some settings even when there should be plenty

#85 ccmaymay opened 1 year ago
0
Click

#84 ccmaymay closed 1 year ago
1
Torch Serve

#83 ccmaymay closed 1 year ago
2
HF Text Generation Inference

#82 ccmaymay opened 1 year ago
0
Ray Serve

#81 ccmaymay closed 1 year ago
1
Help Aleem get started

#80 ccmaymay opened 1 year ago
0
Triton Inference Server / FasterTransformer

#79 ccmaymay closed 1 year ago
1
Ease configuration

#78 ccmaymay closed 1 year ago
1
Add LLaMA support

#77 ccmaymay closed 1 year ago
1
CUDA error: peer mapping resources exhausted

#76 ccmaymay opened 1 year ago
0
Stop generation when stop sequence is generated.

#75 ccmaymay closed 2 years ago
0
Fix tokenized contraction input-output mismatch.

#74 ccmaymay closed 2 years ago
0
Stop generation at stop sequence instead of truncating after the fact

#73 ccmaymay closed 2 years ago
0
Debug sporadic CI failures

#72 ccmaymay closed 2 years ago
1
Remove GitHub banner.

#71 ccmaymay closed 2 years ago
0
Optionally use bnb-int8 algorithm.

#70 ccmaymay closed 2 years ago
0
Add GPT-NeoX

#69 ccmaymay opened 2 years ago
0
nginx returns HTML on gateway timeout

#68 ccmaymay closed 2 years ago
0
Add GPT-J

#67 ccmaymay closed 2 years ago
0
FastAPI

#66 ccmaymay opened 2 years ago
0
Energon AI OPT server

#65 ccmaymay closed 1 year ago
1
Alpa OPT service

#64 ccmaymay closed 1 year ago
1
Defer to HF Hub for model list.

#63 ccmaymay closed 2 years ago
0
Defer to HF Hub for checking if model is supported

#62 ccmaymay closed 2 years ago
0
Handle prompt-too-long error

#61 ccmaymay opened 2 years ago
0
Explain authorized-users.txt in README.

#60 ccmaymay closed 2 years ago
0
Input prompt does not match output prompt when input contractions are already tokenized (e.g., `mother 's`)

#59 ccmaymay closed 2 years ago
2
Investigate poor performance on large prompts

#58 ccmaymay opened 2 years ago
0