marella / ctransformers

Python bindings for Transformer models implemented in C/C++ using the GGML library.
MIT License

May not even be a Transformers issue.. WizardLM-Uncensored-Falcon-40 #86

Open linuxmagic-mp opened 11 months ago

linuxmagic-mp commented 11 months ago

I could use some feedback on debugging with ctransformers. I have a strange case where things generally work, but occasionally I get no output. I'm using /models/WizardLM-Uncensored-Falcon-40b/ggml-model-falcon-40b-wizardlm-qt_k5.bin (GGML).

tokens = llm.tokenize('I want to give you a female name.  What is your favourite female names, give me your top five.  And a preference on what you preferred to be called.')

for token in llm.generate(tokens):
    print(llm.detokenize(token))

This always works.

print(llm('I want to give you a female name.  What is your favourite female names, give me your top five.  And a preference on what you preferred to be called.'))

Sometimes this produces NO output at all.

I'm scratching my head over how to debug this.
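One way to see what is happening per step is to trace the raw token IDs as they are generated. This is only a sketch: the `trace_generation` helper and the `eos_id` parameter are hypothetical, while `llm.tokenize`, `llm.generate`, and `llm.detokenize` are the ctransformers calls already used above. A trace that stops immediately would suggest the model emitted an end-of-sequence token right away.

```python
def trace_generation(token_ids, detokenize, eos_id=None):
    """Print each generated token id and its decoded text; return the count.
    A count of 0 means generation stopped before producing anything."""
    n = 0
    for tid in token_ids:
        print(f"token {tid!r} -> {detokenize(tid)!r}")
        n += 1
        # optionally stop at a known end-of-sequence id (hypothetical parameter)
        if eos_id is not None and tid == eos_id:
            break
    return n

# Hypothetical usage with the model from this thread:
# tokens = llm.tokenize(prompt)
# count = trace_generation(llm.generate(tokens), llm.detokenize)
# if count == 0:
#     print("no tokens were generated at all")
```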

marella commented 11 months ago

llm(...) doesn't return until the entire text is generated whereas llm.generate(...) sends tokens one-by-one as they get generated. Is it exiting without error and without printing anything? Try using stream=True:

for text in llm(prompt, stream=True):
    print(text)
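Building on that suggestion, a small wrapper can both stream partial output as it arrives and make an empty generation easy to detect. The `consume_stream` helper below is an illustrative assumption, not part of ctransformers; the commented usage line assumes `llm(prompt, stream=True)` as shown above.

```python
def consume_stream(chunks):
    """Print streamed text pieces as they arrive and return the full text,
    so an empty generation is easy to detect afterwards."""
    pieces = []
    for text in chunks:
        print(text, end="", flush=True)  # flush so partial output shows immediately
        pieces.append(text)
    print()
    return "".join(pieces)

# Hypothetical usage with a loaded ctransformers model:
# result = consume_stream(llm(prompt, stream=True))
# if not result:
#     print("model returned no text for this prompt")
```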