Closed: pdhirajkumarprasad closed this issue 2 weeks ago
I think this is an issue with weight parameter handling. Here is the relevant frame from the full crash dump:
#20 0x00007e0514d070b9 mlir::iree_compiler::IREE::VM::ZIPArchiveWriter::flush(mlir::iree_compiler::FlatbufferBuilder&)
/home/nmeshram/iree/compiler/src/iree/compiler/Dialect/VM/Target/Bytecode/ArchiveWriter.cpp:661:20
I have seen this when weights were not correctly elided, but in this model's input IR I don't see such an issue. The provided model has these weights:
util.global private @__auto.token_embd.weight = #stream.parameter.named<"model"::"token_embd.weight"> : tensor<128256x4096xf16>
util.global private @__auto.blk.0.attn_norm.weight = #stream.parameter.named<"model"::"blk.0.attn_norm.weight"> : tensor<4096xf32>
util.global private @__auto.blk.0.attn_q.weight = #stream.parameter.named<"model"::"blk.0.attn_q.weight"> : tensor<4096x4096xf16>
...
which I assume will be provided at runtime, yet the crash happens at compile time. @benvanik @MaheshRavishankar do you see anything wrong in the input MLIR?
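For reference, globals declared with #stream.parameter.named are left out of the compiled module and bound at runtime by pointing the runtime at a parameter archive. A rough sketch of that workflow (file names and the --function entry point are hypothetical, not taken from this issue):

```shell
# Compile with the weights externalized as named parameters;
# only parameter references end up in the .vmfb.
iree-compile --iree-hal-target-backends=rocm --iree-hip-target=gfx942 \
    --iree-input-demote-i64-to-i32 llama.8b.fp16.mlir -o llama.8b.vmfb

# At runtime, bind the "model" parameter scope to the actual weight file.
iree-run-module --module=llama.8b.vmfb \
    --parameters=model=llama.8b.gguf \
    --function=prefill
```

This is why the parameter-backed globals above are fine at compile time; the crash has to come from something else in the file.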
Edit: oh, I see this on line 7 of the provided MLIR:
util.global private @__auto.constant_8192_64_torch.complex64 = dense_resource<__auto.constant_8192_64_torch.complex64> : tensor<8192x64xcomplex<f32>>
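For context, a dense_resource reference is only valid if the same file also carries the matching blob in a trailing dialect_resources section; if that section is stripped, the reference dangles. A minimal well-formed sketch (hypothetical names and payload; the hex blob in the builtin encoding starts with an alignment word, shown here as 4 for f32, followed by the raw data):

```mlir
util.global private @__auto.example = dense_resource<example_blob> : tensor<2x2xf32>

{-#
  dialect_resources: {
    builtin: {
      example_blob: "0x0400000000000000000000000000000000000000"
    }
  }
#-}
```

The provided llama.8b.fp16.mlir has the dense_resource reference but no such section for it.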
Isn't this the same thing as being elided without the proper annotation?
Same as the earlier issues: someone manually deleted resources that are required for correct processing of the IR. Is there some workflow people are now using that involves deleting critical lines from MLIR files?
(We should have guards so we don't crash and instead emit an error, but emitting an error is the best we can do in these cases. So it's worth both having better errors and figuring out what workflow is causing this, as we've had multiple people hit it this week.)
@pdhirajkumarprasad this seems like a user error. Please close this once confirmed. I am removing it from the compilation-error tracking project.
This issue is no longer present in the nightly build.
What happened?
Compiling the following MLIR produces a segmentation fault:
https://raw.githubusercontent.com/nod-ai/llm-dev/main/models/llama.8b/llama.8b.fp16.mlir
Steps to reproduce your issue
wget https://raw.githubusercontent.com/nod-ai/llm-dev/main/models/llama.8b/llama.8b.fp16.mlir
iree-compile --iree-hal-target-backends=rocm --iree-input-demote-i64-to-i32 --iree-hip-target=gfx942 llama.8b.fp16.mlir
What component(s) does this issue relate to?
Compiler
Version information
No response
Additional context
No response