google/jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Apache License 2.0
19 stars
12 forks
issues
Add server tests
#142
bvrockwell
opened
3 days ago
0
Update benchmark command in README.md
#141
bhavya01
closed
4 days ago
0
add enable jax profiler to run_server
#140
bvrockwell
closed
6 days ago
0
Update README.md to state the limitation of accessing GCS when conver…
#139
wang2yn84
closed
1 week ago
0
Minor fixes to README
#138
wang2yn84
closed
1 week ago
0
Empty response returned for prompt responses when using run_server_with_ray.py and batch_size > 1
#137
richardsliu
opened
1 week ago
1
Add layer id in scope for each TransformerBlock layer
#136
FanhaiLu1
closed
1 week ago
0
Checkpoint conversion script breaks for meta-llama/llama-2-7b on HF
#135
vivianrwu
opened
1 week ago
0
prototyping better UX
#134
qihqi
opened
2 weeks ago
0
Add left aligned cache support.
#133
wang2yn84
closed
1 week ago
0
fix mixtral quantization scaler axis when dimension > 2
#132
sixiang-google
closed
2 weeks ago
0
Add test for Mixtral model.
#131
wang2yn84
closed
2 weeks ago
0
make sure GPU works
#130
qihqi
closed
2 weeks ago
0
Update README.md
#129
bhavya01
closed
2 weeks ago
0
Update README.md
#128
qihqi
closed
3 weeks ago
0
Update submodules, prepare for leasing v0.2.4
#127
qihqi
closed
3 weeks ago
1
Add lock in prefill and generate to prevent starvation
#126
FanhaiLu1
closed
3 weeks ago
1
Update summary.md
#125
qihqi
closed
2 weeks ago
1
Remove JSON config mangling for Gemma ckpt
#124
lsy323
closed
3 weeks ago
1
Add different token sampling algorithms to decoder.
#123
bvrockwell
closed
3 weeks ago
1
add script to install for GPU
#122
qihqi
closed
3 weeks ago
2
Fix convert_checkpoint.py for hf and gemma
#121
qihqi
closed
3 weeks ago
0
Mixtral enablement.
#120
wang2yn84
closed
3 weeks ago
1
Add guide on adding HF ckpt conversion support
#119
lsy323
closed
1 month ago
0
Support HF LLaMA ckpt conversion
#118
lsy323
closed
1 month ago
0
Integrate disaggregated serving with JetStream
#117
FanhaiLu1
closed
1 month ago
0
Fix conversion bug
#116
yeandy
closed
1 month ago
0
Bug in model conversion script
#115
yeandy
closed
1 month ago
2
Add for readme interleave multiple host with ray
#114
FanhaiLu1
closed
1 month ago
1
Metrics bug: server_lib should be config_lib
#113
Bslabe123
closed
1 month ago
0
Enable jax profiler server in run with ray
#112
FanhaiLu1
closed
1 month ago
0
Jetstream: 8128c8a -> v0.2.2
#111
Bslabe123
closed
1 month ago
0
Release JetStream v0.2.2
#110
JoeZijunZhou
closed
1 month ago
0
Add run_server with ray for interleave serving
#109
FanhaiLu1
closed
1 month ago
0
Update Jetstream commit id
#108
FanhaiLu1
closed
1 month ago
0
Return Tuple(interleaveEngList, prefillEngineList, decodeEngineList) in create ray engine
#107
FanhaiLu1
opened
1 month ago
0
Ray Disaggregated Serving MVP
#106
FanhaiLu1
closed
1 month ago
2
Add activation quantization support to per-channel quantized linear layers
#105
lsy323
closed
3 weeks ago
0
Fix convert script cannot generate bf16 weights
#104
lsy323
closed
1 month ago
0
Update run_interactive.py with finer control of profiler.
#103
wang2yn84
closed
1 month ago
0
Update run_server.py. metrics_server_config is not supported in JetStream[8128c8a] yet
#102
wang2yn84
closed
1 month ago
2
Add support for Llama3-70b
#101
bhavya01
closed
3 weeks ago
3
Fix ray conflict changes
#100
FanhaiLu1
closed
1 month ago
2
Pass metrics client config through to Jetstream
#99
Bslabe123
closed
1 month ago
1
Fix gemma model, enable_weight_quantization is available through quant_config.
#98
wang2yn84
closed
1 month ago
1
Update README.md, the quantize flag is no longer available, quantize_type assumes the role of the original flag.
#97
wang2yn84
closed
1 month ago
1
Fix flax and ray dependencies
#96
FanhaiLu1
closed
1 month ago
0
Fixes tests. Can now run on CPU by default.
#95
wang2yn84
closed
1 month ago
4
Add regression test to detect service broken and performance degradation
#94
FanhaiLu1
opened
1 month ago
0
Integrates ragged attention to JetStream Pytorch
#93
wang2yn84
closed
1 month ago
0