Closed reyna-abhyankar closed 6 months ago
What's the command line you used?
./FlexFlow/build/inference/spec_infer/spec_infer -ll:gpu 1 -ll:fsize 32000 -ll:zsize 14000 -llm-model facebook/opt-6.7b -ssm-model facebook/opt-125m -prompt ./data/chatbot_short.json -output-file test_output
@reyna-abhyankar, you need to add the flag -ll:cpu 4. We should update the docs. Let me know if you are still encountering the issue with the CPU flag.
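For reference, here is a sketch of the command from the report with the suggested flag added. Placing -ll:cpu 4 next to the other -ll: flags is an assumption; all other arguments are copied unchanged from the original command. The snippet only prints the command so it can be checked before running.

```shell
# Dry run: build the command from the report plus the suggested -ll:cpu 4 flag.
# The position of -ll:cpu 4 is an assumption; the remaining arguments are unchanged.
SPEC_INFER=./FlexFlow/build/inference/spec_infer/spec_infer
CMD="$SPEC_INFER -ll:cpu 4 -ll:gpu 1 -ll:fsize 32000 -ll:zsize 14000 -llm-model facebook/opt-6.7b -ssm-model facebook/opt-125m -prompt ./data/chatbot_short.json -output-file test_output"

# Print the command instead of executing it.
echo "$CMD"
```

To actually launch it, paste the printed line into the shell from the same directory used in the original report.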
Feel free to reopen if the issue is not fixed. Closing for now after our discussion on Slack.
Specifically, https://github.com/flexflow/FlexFlow/blob/0d75c1042bf87e45684bcb3679cfc9f39a87e589/src/runtime/request_manager.cc#L314
This may be a background server issue, since the serve_xxxx() functions aren't called. This is for running the spec_infer.cc and incr_decoding.cc examples.