Open | lambda7xx opened this issue 1 year ago
What's the outcome of incr_decoding? Were you seeing the same error message?
The same error:
No small speculative model registered yet, using incremental decoding.
*** stack smashing detected ***: terminated
no_offload.sh: line 68: 29973 Aborted (core dumped) ../build/inference/incr_decoding/incr_decoding -ll:gpu 4 -ll:fsize 14000 -ll:zsize 30000 -llm-model opt -llm-weight ../inference/weights/opt_6B_weights_half/ -llm-config ../inference/models/configs/opt_6B.json -tokenizer ../inference/tokenizer/ -prompt ../inference/prompt/test.json -output-file ../inference/output/incr_decoding_opt_6B_half-batchsize-$batch_size.txt
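For what it's worth, "stack smashing detected" is glibc's stack protector aborting the process after it notices a buffer overrun in a stack frame, which is why the script then reports Aborted (core dumped). A rough sketch of how a backtrace could be captured, assuming core dumps are enabled and gdb is available on the machine (the core file name depends on the system's core_pattern setting):

ulimit -c unlimited        # allow a core file to be written
# re-run the incr_decoding command above, then load the core file into gdb
gdb ../build/inference/incr_decoding/incr_decoding core
# inside gdb, run: bt   (prints the backtrace at the point of the abort)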
1c1
< token IDs: 0,25538,2211,25562,363,7952,292,9045,29891,29889,13,29896,29889,382,271,9045,29891,9687,29879,29889,13,29906,29889,1222,6269,895,25704,29889,13,29941,29889,3617,3307,8709,29889,13,29946,29889,4942,682,20947,310,4094,29889,13,29945,29889,399,1161,596,6567,4049,29889,13,29953,29889,399,1161,596,18655,1849,29889,13,29955,29889,399,1161,596,285,21211,29889,13,29947,29889,399,1161,596,27654,29889,13,29929,29889,399,1161,596,29808,29889,13,29896,29900,29889,399,1161,596,9427,29889,13,29896,29896,29889,399,1161,596,18423,29889,13,29896,29906,29889,399,1161,596,274,406,284,29889,13,29896,29941,29889,399,1161,596,27274,29889,13,29896,29946
---
> token IDs: 0,25538,2211,25562,363,7952,292,9045,29891,29889,13,29896,29889,382,271,9045,29891,9687,29879,29889,13,29906,29889,1222,6269,895,25704,29889,13,29941,29889,3617,3307,8709,29889,13,29946,29889,4942,682,20947,310,4094,29889,13,29945,29889,399,1161,596,6567,4049,29889,13,29953,29889,399,1161,596,18655,1849,29889,13,29955,29889,399,1161,596,285,21211,29889,13,29947,29889,399,1161,596,27654,29889,13,29929,29889,399,1161,596,29808,29889,13,29896,29900,29889,399,1161,596,9427,29889,13,29896,29896,29889,399,1161,596,18423,29889,13,29896,29906,29889,399,1161,596,27274,29889,13,29896,29941,29889,399,1161,596,3623,625,29889,13,29896,29946,29889
14,16c14,16
< 12. Wash your cereal.
< 13. Wash your milk.
< 14
\ No newline at end of file
---
> 12. Wash your milk.
> 13. Wash your juice.
> 14.
\ No newline at end of file
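The 1c1 and 14,16c14,16 markers above are standard diff change hunks, so this comparison was presumably produced by running diff between the newly generated incr_decoding output and a previously saved reference. A minimal sketch, where the reference file name is hypothetical and not taken from the original report:

diff ../inference/output/incr_decoding_opt_6B_half-batchsize-2.txt expected_output.txt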
I changed the batch size to 2.
Then I used the command below to run OPT-6.7B:
../build/inference/spec_infer/spec_infer -ll:gpu 4 -ll:fsize 14000 -ll:zsize 30000 -llm-model opt -llm-weight ../inference/weights/opt_6B_weights_half/ -llm-config ../inference/models/configs/opt_6B.json -ssm-model opt -ssm-weight ../inference/weights/opt_125M_weights_half/ -ssm-config ../inference/models/configs/opt_125M.json -tokenizer ../inference/tokenizer/ -prompt ../inference/prompt/test.json -output-file ../inference/output/spec_inference_opt_6B_half-batchsize-$batch_size.txt
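Note: the $batch_size in the -output-file path suggests this command is run from a shell script that sets the variable. A minimal sketch of such a wrapper is below; this is an assumption about the surrounding script, and how the batch size of 2 is actually passed to spec_infer is not shown in this report.

#!/bin/bash
# hypothetical wrapper script; batch_size is only used here to name the output file
batch_size=2
../build/inference/spec_infer/spec_infer -ll:gpu 4 -ll:fsize 14000 -ll:zsize 30000 -llm-model opt -llm-weight ../inference/weights/opt_6B_weights_half/ -llm-config ../inference/models/configs/opt_6B.json -ssm-model opt -ssm-weight ../inference/weights/opt_125M_weights_half/ -ssm-config ../inference/models/configs/opt_125M.json -tokenizer ../inference/tokenizer/ -prompt ../inference/prompt/test.json -output-file ../inference/output/spec_inference_opt_6B_half-batchsize-$batch_size.txt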
Below is my error log: