onnx / turnkeyml

The no-code AI toolchain
Apache License 2.0

Add LLaMA2 models #130

Closed jeremyfowers closed 8 months ago

jeremyfowers commented 8 months ago

Adds LLaMA 2 models (the llama2_*.py scripts) to the transformers corpus.

Each model runs one token in prefill mode and one token in token-generation (KV cache) mode.
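For readers unfamiliar with the two modes, here is a toy sketch of the idea. It is not the LLaMA 2 architecture and not the code in this PR: `toy_decoder_step`, the stand-in projections, and the uniform averaging are all hypothetical, chosen only to show one call without a cache (prefill) followed by one call that reuses the cache (token generation).

```python
from typing import List, Optional, Tuple

# One (key, value) pair per token processed so far.
Cache = List[Tuple[float, float]]


def toy_decoder_step(token: float, cache: Optional[Cache] = None) -> Tuple[float, Cache]:
    """Process a single token and return (output, updated cache).

    Hypothetical stand-in for a decoder layer: a real model would use
    learned projections and softmax attention instead of these constants.
    """
    past = list(cache) if cache else []
    key, value = token * 0.5, token * 2.0  # stand-in key/value projections
    past.append((key, value))
    # Uniform "attention": average every value seen so far, cached or new.
    output = sum(v for _, v in past) / len(past)
    return output, past


# Prefill mode: one token, no cache supplied.
prefill_out, kv_cache = toy_decoder_step(1.0)

# Token-generation mode: one new token, reusing the cache from prefill.
decode_out, kv_cache = toy_decoder_step(3.0, kv_cache)

print(prefill_out, decode_out, len(kv_cache))
```

The second call sees both the cached entry and the new one, which is why the two modes exercise different input signatures and are worth benchmarking separately.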

Note: to use the LLaMA 2 weights, each user must request permission from Meta and then supply a path to those weights with --script-args="--pretrained --model_path PATH/TO/MODEL". All other users can run llama2_*.py without any script args to get random weights for analysis and benchmarking purposes.
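As a rough illustration of how a script could interpret those two flags, here is a minimal argparse sketch. Only the flag names --pretrained and --model_path come from this PR; `parse_script_args`, `describe_weights`, and the rule that --pretrained requires --model_path are assumptions made for the example, and the actual llama2_*.py scripts may handle their arguments differently.

```python
import argparse
from typing import List, Optional


def parse_script_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the flags passed through --script-args (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="LLaMA 2 script arguments (sketch)")
    parser.add_argument(
        "--pretrained",
        action="store_true",
        help="load real weights instead of random ones",
    )
    parser.add_argument(
        "--model_path",
        type=str,
        default=None,
        help="path to weights obtained from Meta",
    )
    args = parser.parse_args(argv)
    # Assumption for this sketch: pretrained weights need an explicit path.
    if args.pretrained and args.model_path is None:
        parser.error("--pretrained requires --model_path")
    return args


def describe_weights(args: argparse.Namespace) -> str:
    """Report which kind of weights the script would use."""
    if args.pretrained:
        return f"pretrained weights from {args.model_path}"
    return "random weights"


# No script args: random weights for analysis and benchmarking.
print(describe_weights(parse_script_args([])))

# With script args: pretrained weights from the supplied path.
print(describe_weights(parse_script_args(["--pretrained", "--model_path", "PATH/TO/MODEL"])))
```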

This PR also includes a fix to common.build.get_shapes_and_dtypes() to support double-nested tuples/lists.
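To show why double nesting matters, here is a sketch of one way to collect shapes and dtypes from nested tuples/lists, such as a KV cache passed as a tuple of per-layer (key, value) tuples. This is not the implementation of common.build.get_shapes_and_dtypes(): `collect_shapes_and_dtypes`, `FakeTensor`, and the `name[i][j]` key format are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class FakeTensor:
    """Hypothetical stand-in for any object exposing .shape and .dtype."""
    shape: Tuple[int, ...]
    dtype: str


def collect_shapes_and_dtypes(
    inputs: Dict[str, Any],
) -> Tuple[Dict[str, Tuple[int, ...]], Dict[str, str]]:
    """Recursively walk nested tuples/lists and record each leaf's shape and dtype."""
    shapes: Dict[str, Tuple[int, ...]] = {}
    dtypes: Dict[str, str] = {}

    def visit(name: str, value: Any) -> None:
        if isinstance(value, (list, tuple)):
            # Recurse so any nesting depth is handled, including double nesting.
            for i, item in enumerate(value):
                visit(f"{name}[{i}]", item)
        elif hasattr(value, "shape") and hasattr(value, "dtype"):
            shapes[name] = tuple(value.shape)
            dtypes[name] = str(value.dtype)
        else:
            # Plain Python scalars: empty shape, type name as dtype.
            shapes[name] = ()
            dtypes[name] = type(value).__name__

    for name, value in inputs.items():
        visit(name, value)
    return shapes, dtypes


# A double-nested input: one layer holding a (key, value) pair.
example_inputs = {
    "input_ids": FakeTensor((1, 1), "int64"),
    "past_key_values": (
        (FakeTensor((1, 32, 1, 128), "float16"), FakeTensor((1, 32, 1, 128), "float16")),
    ),
}
shapes, dtypes = collect_shapes_and_dtypes(example_inputs)
print(shapes)
print(dtypes)
```

Recursion avoids hard-coding a maximum nesting depth, so the same walk covers flat inputs, single-nested lists, and the double-nested case.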