Closed depenglee1707 closed 6 months ago
It's a large commit. It includes:
- `deployment` exclusive for `generator` and `normal rest`
- `default` (transformer auto class) and `llamacpp` support streaming
- `streaming` and `predict` (non-stream) share one copy of the code, avoiding duplication
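
For reviewers, here is a rough sketch of what sharing one code path between streaming and non-stream predict can look like. It is not the code in this PR: `stream_tokens`, `predict`, and `fake_backend` are hypothetical names chosen for illustration, and the backend is a stand-in for whatever the `default` or `llamacpp` generator actually yields.

```python
from typing import Callable, Iterator

# Hypothetical type: a backend takes a prompt and yields text chunks.
Backend = Callable[[str], Iterator[str]]


def stream_tokens(prompt: str, backend: Backend) -> Iterator[str]:
    """Single shared generation path: yield chunks as the backend produces them."""
    for chunk in backend(prompt):
        yield chunk


def predict(prompt: str, backend: Backend) -> str:
    """Non-stream predict: drain the same streaming path and join the chunks."""
    return "".join(stream_tokens(prompt, backend))


def fake_backend(prompt: str) -> Iterator[str]:
    """Stand-in backend (assumption): upper-cases each word of the prompt."""
    for word in prompt.split():
        yield word.upper() + " "


if __name__ == "__main__":
    # Streaming caller consumes chunks one by one.
    for piece in stream_tokens("hello world", fake_backend):
        print(piece, end="")
    print()
    # Non-stream caller gets the full text from the same code path.
    print(predict("hello world", fake_backend))
```

With this shape, any fix to generation only has to be made once in the streaming path, and non-stream predict picks it up automatically.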