google/jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Apache License 2.0 · 21 stars · 12 forks
Issues

| # | Title | Author | State | Last updated | Comments |
|------|-------|--------|-------|--------------|----------|
| #102 | Update run_server.py. metrics_server_config is not supported in JetStream[8128c8a] yet | wang2yn84 | closed | 1 month ago | 2 |
| #101 | Add support for Llama3-70b | bhavya01 | closed | 1 month ago | 3 |
| #100 | Fix ray conflict changes | FanhaiLu1 | closed | 1 month ago | 2 |
| #99 | Pass metrics client config through to Jetstream | Bslabe123 | closed | 1 month ago | 1 |
| #98 | Fix gemma model, enable_weight_quantization is available through quant_config. | wang2yn84 | closed | 1 month ago | 1 |
| #97 | Update README.md, the quantize flag is no longer available, quantize_type assumes the role of the original flag. | wang2yn84 | closed | 1 month ago | 1 |
| #96 | Fix flax and ray dependencies | FanhaiLu1 | closed | 1 month ago | 0 |
| #95 | Fixes tests. Can now run on CPU by default. | wang2yn84 | closed | 1 month ago | 4 |
| #94 | Add regression test to detect service broken and performance degradation | FanhaiLu1 | open | 1 month ago | 0 |
| #93 | Integrates ragged attention to JetStream Pytorch | wang2yn84 | closed | 1 month ago | 0 |
| #92 | Move flags in scripts to a common function | lsy323 | closed | 2 months ago | 0 |
| #91 | Update README.md | qihqi | closed | 2 months ago | 0 |
| #90 | Leverage tokens_utils to process result tokens | FanhaiLu1 | closed | 2 months ago | 0 |
| #89 | Move deps to git submodule | qihqi | closed | 2 months ago | 0 |
| #88 | Update version of jetstream; misc fixes | qihqi | closed | 2 months ago | 0 |
| #87 | Update README.md | JackCaoG | closed | 2 months ago | 3 |
| #86 | Fix sharding config file name bug | FanhaiLu1 | closed | 2 months ago | 0 |
| #85 | Add Gemma 2b benchmark; fix a typo. | qihqi | closed | 2 months ago | 0 |
| #84 | Enable Blockwise Int4 quantized linear layer | lsy323 | closed | 1 month ago | 1 |
| #83 | Clean up flags | qihqi | closed | 2 months ago | 0 |
| #82 | How to run benchmark on CloudTPU v4-8 | JackCaoG | closed | 2 months ago | 7 |
| #81 | Add Gemma 2b benchmark; fix a typo. | qihqi | closed | 2 months ago | 0 |
| #80 | Add shard on batch mode. Als update version of torchxla2 | qihqi | closed | 2 months ago | 0 |
| #79 | Add llama-3 instructions to readme | bhavya01 | closed | 2 months ago | 0 |
| #78 | Fix attention kernel of GQA use case | lsy323 | closed | 2 months ago | 0 |
| #77 | Enable quantization for Gemma 7b | qihqi | closed | 2 months ago | 0 |
| #76 | Add benchmark results | qihqi | closed | 2 months ago | 0 |
| #75 | Enable Gemma 2B | qihqi | closed | 2 months ago | 0 |
| #74 | Add gemma and update recent changes to multiple host | FanhaiLu1 | closed | 2 months ago | 0 |
| #73 | Fix sharding config for quant | lsy323 | closed | 2 months ago | 0 |
| #72 | Use GemmaAttention for Gemma | qihqi | closed | 2 months ago | 0 |
| #71 | Support converting hf gemma weights | lsy323 | closed | 2 months ago | 6 |
| #70 | Gemma sharding and test | FanhaiLu1 | closed | 2 months ago | 0 |
| #69 | Add gemma support | qihqi | closed | 2 months ago | 0 |
| #68 | Add prefill only benchmark for different token length | FanhaiLu1 | closed | 2 months ago | 0 |
| #67 | Pick a slot from 0 to batch_size-1 during run_interactive.py | bhavya01 | closed | 2 months ago | 1 |
| #66 | Add vscode to gitignore | FanhaiLu1 | closed | 2 months ago | 0 |
| #65 | Refactor so that environment and engine | qihqi | closed | 2 months ago | 1 |
| #64 | Support llama3 | bhavya01 | closed | 2 months ago | 6 |
| #63 | Add ray multiple host support | FanhaiLu1 | closed | 2 months ago | 4 |
| #62 | Ignore duplicated lines check now | FanhaiLu1 | closed | 2 months ago | 0 |
| #61 | Output all token text | FanhaiLu1 | closed | 2 months ago | 0 |
| #60 | Comment broken function to unblock interactive run after JetStream api change | FanhaiLu1 | closed | 2 months ago | 0 |
| #59 | Use property instead of function | FanhaiLu1 | closed | 2 months ago | 0 |
| #58 | Enable pylint check | FanhaiLu1 | closed | 2 months ago | 0 |
| #57 | Fix run_server lint check | FanhaiLu1 | closed | 2 months ago | 0 |
| #56 | Fix run_interactive lint check | FanhaiLu1 | closed | 2 months ago | 0 |
| #55 | Fix convert checkpoint lint check | FanhaiLu1 | closed | 2 months ago | 0 |
| #54 | Add pylintrc and init in root dir | FanhaiLu1 | closed | 2 months ago | 0 |
| #53 | Fix quantizaiton, jax lint and llama e2e | FanhaiLu1 | closed | 2 months ago | 0 |