chunit-quic opened this pull request 2 days ago (status: Open).
Note: Links to docs will display an error until the docs builds have been completed.
There is 1 currently active SEV. If your PR is affected, please view it below:
As of commit fcc10de4065dd2cd6c0d498d172597cd988fd677 with merge base 2d51e63d90746381fa5007246071fcac36aa8982:
* [Check Labels / Check labels](https://hud.pytorch.org/pr/pytorch/executorch/6983#33362267187) ([gh](https://github.com/pytorch/executorch/actions/runs/11966517275/job/33362267187))
* [Lint / lintrunner / linux-job](https://hud.pytorch.org/pr/pytorch/executorch/6983#33362267862) ([gh](https://github.com/pytorch/executorch/actions/runs/11966517466/job/33362267862)) `>>> Lint for install_requirements.py:`
* [pull / test-llava-runner-linux / linux-job](https://hud.pytorch.org/pr/pytorch/executorch/6983#33362270757) ([gh](https://github.com/pytorch/executorch/actions/runs/11966517484/job/33362270757)) `test_prefill_logits`
This comment was automatically generated by Dr. CI and updates every 15 minutes.
If your changes are user facing and intended to be a part of release notes, please use a label starting with `release notes:`.
If not, please add the `topic: not user facing` label.
To add a label, you can comment to pytorchbot, for example:
`@pytorchbot label "topic: not user facing"`
For more information, see https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.
Hey do you mind sharing the command for AoT and runtime so I can try on my end?
Sure! You can switch between the different modes (kv or bert) by setting the `model_mode` argument:

python examples/qualcomm/oss_scripts/llama3_2/llama.py -a ${ARCHIVE}/ -b build-android -H ${HOST} -s ${DEVICE} -m ${SOC} --checkpoint Llama3.2-1B-Instruct/consolidated.00.pth --params Llama3.2-1B-Instruct/params.json --tokenizer_model Llama3.2-1B-Instruct/tokenizer.model --prompt "<|start_header_id|>" --ptq 16a4w --temperature 0 --model_size 1B --seq_len 16 --model_mode bert
Ah I see - do you mind renaming bert mode to batch_prefill? The context is that bert isn't a common name.
No problem, let me change it.
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
There are some errors here:
executorch/examples/qualcomm/oss_scripts/llama3_2/runner/runner.cpp:50:7: error: field 'eval_mode_' will be initialized after field 'stats_' [-Werror,-Wreorder-ctor]
50 | eval_mode_(eval_mode),
| ^~~~~~~~~~~~~~~~~~~~~
| stats_({})
51 | stats_({}) {
| ~~~~~~~~~~
| eval_mode_(eval_mode)
Thanks for pointing it out. Fixed.
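For context, `-Wreorder-ctor` fires because C++ initializes non-static members in declaration order, not in the order they appear in the member-initializer list. Below is a minimal sketch of what the fix likely looks like; the surrounding Runner class is simplified and the `Stats`/`EvalMode` types are stand-ins, with only the field names `stats_` and `eval_mode_` taken from the error message.

```cpp
#include <cstdint>

// Stand-in types for illustration; the real ones live in the runner sources.
struct Stats {};
enum class EvalMode : std::uint8_t { kKVCached, kBatchPrefill };

class Runner {
 public:
  explicit Runner(EvalMode eval_mode)
      // stats_ is declared before eval_mode_, so it is initialized first no
      // matter how the list is written; listing it first keeps the initializer
      // order consistent with the declaration order and silences -Wreorder-ctor.
      : stats_({}),
        eval_mode_(eval_mode) {}

 private:
  Stats stats_;         // declared (and therefore initialized) first
  EvalMode eval_mode_;  // declared second
};
```

Alternatively, swapping the declaration order of the two fields would have the same effect.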
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.