Open OswaldoBornemann opened 1 week ago
So I want to ask: is only the AR LLM streaming, while the NAR Flow Matching part remains non-streaming?
Flow matching is non-streaming. We use chunked inference to simulate streaming, but the underlying inference is still non-streaming.
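To illustrate what "chunk inference to simulate streaming" can look like, here is a minimal sketch. It is not this project's code: `fake_flow_matching_decode` and `chunked_decode` are hypothetical names, the decoder is a toy stand-in, and the sketch assumes one output frame per input token and that each call re-decodes the whole prefix received so far.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def fake_flow_matching_decode(tokens: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for a non-streaming NAR decoder.

    It needs its whole input at once and returns one frame per token
    (an assumption made only for this sketch).
    """
    return [float(t) for t in tokens]


def chunked_decode(
    token_stream: Iterable[int],
    decode: Callable[[Sequence[int]], List[float]],
    chunk_size: int = 4,
) -> Iterator[List[float]]:
    """Simulate streaming on top of a non-streaming decoder.

    Tokens are buffered as they arrive from the AR model. Whenever
    chunk_size new tokens are available, the decoder is run in ordinary
    non-streaming mode over everything buffered so far, and only the
    frames that have not been emitted yet are yielded.
    """
    buffer: List[int] = []
    emitted = 0  # number of frames already handed to the caller
    for token in token_stream:
        buffer.append(token)
        if len(buffer) - emitted >= chunk_size:
            frames = decode(buffer)  # full pass, not incremental
            yield frames[emitted:]
            emitted = len(frames)
    if len(buffer) > emitted:
        # Flush the final partial chunk.
        frames = decode(buffer)
        yield frames[emitted:]


chunks = list(chunked_decode(range(10), fake_flow_matching_decode, chunk_size=4))
```

The caller receives output in pieces, so it looks like streaming, but every piece comes from a regular non-streaming forward pass.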