Open hubertwang opened 1 week ago
Hi @hubertwang, can you please share your prompt and your max_length value?
Hi @natke,
Yes, I tried two relatively extreme conditions.
First, I passed in a privacy policy extracted from the App Store and asked the model to analyze it. With max_length set to 200 (the default from the example), the output size is around 800~900 and an exception is thrown.
After I got this exception, I tried another prompt that should produce a short answer:
"How are you?" with max_length set to 10
I expected something like "I am good" or "good", but it threw an exception: output (11, 12) > (10)
Then I tried to constrain the answer further:
"How are you, answer good or no good" It still threw the same exception: output (11, 12) > (10)
Note: The question is wrapped in the fine-tuned prompt format mentioned in the paper, with the ## prefix.
To clarify: the max_length includes the prompt length + the answer. Try setting it to 200 and run your prompts again
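Since max_length covers the prompt tokens plus the answer tokens, one way to choose it is to add the desired answer budget to the prompt's token count. The following is a minimal sketch in plain C++17 with no onnxruntime-genai calls. The helper name MaxLengthFor, its parameters, and the context_limit value are hypothetical and only illustrate the arithmetic. The prompt token count is assumed to come from the tokenizer output.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical helper, not part of onnxruntime-genai.
// max_length counts prompt tokens AND generated tokens, so the value to
// pass is prompt_tokens + reply_tokens. Returns std::nullopt when the
// request is empty or the total would exceed the assumed context limit.
std::optional<std::size_t> MaxLengthFor(std::size_t prompt_tokens,
                                        std::size_t reply_tokens,
                                        std::size_t context_limit) {
    if (reply_tokens == 0) {
        return std::nullopt;  // nothing could be generated
    }
    if (prompt_tokens > context_limit ||
        reply_tokens > context_limit - prompt_tokens) {
        return std::nullopt;  // prompt + reply does not fit
    }
    return prompt_tokens + reply_tokens;
}
```

For the "How are you?" case above, the error reports a prompt of 11 tokens, so allowing 10 answer tokens would mean a max_length of 21 rather than 10.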
Hi @natke,
Thanks for your reply. We'll keep that in mind and adjust the parameter.
Is it possible to catch this exception? It looks like the app just crashes for now, with no chance to catch it. It's hard to estimate how long the output for a given prompt will be.
BTW, we also observed excessive memory usage when the prompt is longer. Longer prompts seem to consume more memory.
I need to use an iPhone 15 Pro Max to run certain prompts, which is the iPhone with the most memory for now.
Is this expected behavior? Is it possible to control memory usage through a search option?
Thank you.
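For reference, the standard C++ pattern for turning a thrown std::runtime_error into an error value is sketched below. It is self-contained and does not call onnxruntime-genai. RunGuarded is a hypothetical helper name. Why try-catch does not take effect in the app is not established in this thread, so this only shows the pattern in isolation.

```cpp
#include <exception>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper: runs `step` and returns the error text if it
// throws, or std::nullopt if it completes normally.
std::optional<std::string> RunGuarded(const std::function<void()>& step) {
    try {
        step();
        return std::nullopt;
    } catch (const std::exception& e) {
        // std::runtime_error derives from std::exception.
        return std::string(e.what());
    } catch (...) {
        // Anything that is not a std::exception.
        return std::string("unknown exception");
    }
}
```

A caller could wrap the generation loop in a lambda passed to RunGuarded and return nil with a logged message when an error string comes back.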
Can you please add details of the exception you are seeing?
Hi @natke, yes, I added my sample code and a screenshot taken when the exception was caught at a breakpoint.
I used try-catch and @try-@catch, but both failed to catch the exception. I can, however, set a breakpoint that stops when the exception is thrown. Weird...
- (nullable NSString *)generate:(nonnull NSString*)input_user_question maxLength:(nonnull NSNumber*)max_length
{
    __weak __typeof__(self) weakSelf = self;
    NSMutableString *result = [NSMutableString string];
    @try {
        NSString* llmPath = [[NSBundle mainBundle] resourcePath];
        const char* modelPath = [llmPath cStringUsingEncoding:NSUTF8StringEncoding];
        auto model = OgaModel::Create(modelPath);
        auto tokenizer = OgaTokenizer::Create(*model);
        NSString* promptString = [NSString stringWithFormat:@"<|user|>\n%@<|end|>\n<|assistant|>", input_user_question];
        const char* prompt = [promptString UTF8String];
        auto sequences = OgaSequences::Create();
        tokenizer->Encode(prompt, *sequences);
        auto params = OgaGeneratorParams::Create(*model);
        params->SetSearchOption("max_length", max_length.intValue);
        params->SetInputSequences(*sequences);
        // Streaming Output to generate token by token
        auto tokenizer_stream = OgaTokenizerStream::Create(*tokenizer);
        auto generator = OgaGenerator::Create(*model, *params);
        while (!generator->IsDone()) {
            generator->ComputeLogits();
            generator->GenerateNextToken();
            const int32_t* seq = generator->GetSequenceData(0);
            size_t seq_len = generator->GetSequenceCount(0);
            const char* decode_tokens = tokenizer_stream->Decode(seq[seq_len - 1]);
            //NSLog(@"Decoded tokens: %s", decode_tokens);
            // Add decoded token to SharedTokenUpdater
            NSString* decodedTokenString = [NSString stringWithUTF8String:decode_tokens];
            if (hasListeners) { // Only send events if anyone is listening
                [weakSelf sendEventWithName:RCTOnnxEventGenTextTokenUpdate body:decodedTokenString];
            }
            //NSLog(@"[Phi-3] %@", decodedTokenString);
            [result appendString:decodedTokenString];
        }
    } @catch (id exception) {
        NSLog(@"Exception: %@", exception);
    }
    //NSLog(@"[Phi-3] Result: %@", result);
    return result;
}
Exception:
libc++abi: terminating due to uncaught exception of type std::runtime_error: input sequence_length (11) is >= max_length (10)
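The message above fires when the encoded prompt alone is already at or over max_length. One way to avoid reaching that throw is to test the same condition before creating the generator. The following is a minimal sketch in plain C++. PromptCheck and CheckPromptFits are hypothetical names, and the prompt token count is assumed to be read from the encoded sequences.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type, not part of onnxruntime-genai.
struct PromptCheck {
    bool ok;
    std::string message;  // empty when ok is true
};

// Mirrors the condition in the error text: generation is rejected when
// the prompt's token count is >= max_length.
PromptCheck CheckPromptFits(std::size_t prompt_tokens, std::size_t max_length) {
    if (prompt_tokens >= max_length) {
        return {false, "input sequence_length (" + std::to_string(prompt_tokens) +
                           ") is >= max_length (" + std::to_string(max_length) + ")"};
    }
    return {true, ""};
}
```

With the values from this report, CheckPromptFits(11, 10) fails while CheckPromptFits(11, 200) passes, so the caller can return an error to the UI instead of starting generation.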
Hi everyone,
I recently tried the Phi-3 example (onnxruntime-inference-example/mobile/examples/phi-3) on an iPhone. Sometimes the output of Phi-3 is longer than my max_length. My app crashes because I am not able to catch the exception.
I tried both Obj-C and C++ style try-catch, but neither caught this exception. Has anyone had the same issue?
Thanks!