llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[lldb-dap] incorrect offsets in assembly view #103021

Open vogelsgesang opened 1 month ago

vogelsgesang commented 1 month ago

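In the `disassemble` call, `instructionOffset` is treated as if it measured the offset in bytes rather than in instructions ([source code](https://github.com/llvm/llvm-project/blob/2913e71865dfc063a47ddfaf1e2ce07763f69614/lldb/tools/lldb-dap/lldb-dap.cpp#L3913)). This leads to a corrupted disassembly view in VS Code when scrolling.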

llvmbot commented 1 month ago

@llvm/issue-subscribers-lldb

Author: Adrian Vogelsgesang (vogelsgesang)

In the `disassemble` call, `instructionOffset` is treated as if it measured the offset in bytes rather than in instructions ([source code](https://github.com/llvm/llvm-project/blob/2913e71865dfc063a47ddfaf1e2ce07763f69614/lldb/tools/lldb-dap/lldb-dap.cpp#L3913)). This leads to a corrupted disassembly view in VS Code when scrolling.

santhoshe447 commented 3 weeks ago

Hi @vogelsgesang

Could you please provide more details so we can understand the issue? We noticed a crash in lldb-dap when scrolling in the disassembly view window. It was caused by incorrect handling of "instructionOffset" and "Offset" for the Hexagon architecture (which uses instruction packetization). We have addressed this by handling all possible causes related to "instructionOffset" and "instructionCount".

"instructionOffset" is applied before disassembling, because it tells us where to begin the disassembly. If "instructionOffset" is positive, disassembly starts after memoryReference. If "instructionOffset" is negative, disassembly starts before memoryReference.
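The sign convention above can be sketched as follows. This is a minimal illustration, not lldb-dap code: `selectWindow`, its parameters, and the `-1` marker are hypothetical, and it assumes the instructions around memoryReference have already been decoded into an indexable list. Padding out-of-range slots rather than returning fewer entries reflects my reading of the DAP requirement to return exactly `instructionCount` instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of lldb-dap): chooses which already-decoded
// instructions to return for a disassemble request.
//   decodedCount      - number of decoded instructions available
//   anchor            - index of the instruction located at memoryReference
//   instructionOffset - signed offset, counted in instructions
//   instructionCount  - number of entries the client asked for
// Returns one index per requested slot; -1 marks a slot outside the decoded
// range (assumption: such slots are reported as invalid instructions).
std::vector<std::int64_t> selectWindow(std::size_t decodedCount,
                                       std::size_t anchor,
                                       std::int64_t instructionOffset,
                                       std::size_t instructionCount) {
  assert(anchor < decodedCount);
  std::vector<std::int64_t> window;
  window.reserve(instructionCount);
  // Negative offsets move the start before the anchor, positive ones after it.
  const std::int64_t first =
      static_cast<std::int64_t>(anchor) + instructionOffset;
  for (std::size_t i = 0; i < instructionCount; ++i) {
    const std::int64_t idx = first + static_cast<std::int64_t>(i);
    const bool inRange =
        idx >= 0 && idx < static_cast<std::int64_t>(decodedCount);
    window.push_back(inRange ? idx : -1);
  }
  return window;
}
```

For example, with 10 decoded instructions, an anchor at index 5, an offset of -2 and a count of 4, the sketch selects indices 3, 4, 5 and 6.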

If you know the significance of the "Offset" value in the disassemble request command, could you explain how it differs from "instructionOffset"?

Thanks,

vogelsgesang commented 3 weeks ago

> If you know the significance of the "Offset" value in the disassemble request command, could you explain how it differs from "instructionOffset"?

The `offset` is measured in bytes, while the `instructionOffset` is measured in instructions. This is particularly important for variable-length instruction encodings (such as x86), where there is no direct way to map from "instruction count" to "byte count".

VS Code uses `instructionOffset` for lazy-loading / scrolling in the UI. I don't know what `offset` is actually useful for, or whether anything uses it.

I just uploaded #105446, which shows my current progress on fixing this issue. The pull request is not finished yet, though. In particular, the test cases are still broken.
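A small sketch of why the two units cannot be interchanged. The helper name and the instruction lengths are made up for illustration and are not taken from lldb-dap. The byte distance covered by `n` instructions depends on the encoded length of each one, so it can only be found by decoding and summing:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of bytes occupied by the first `n` instructions,
// given each instruction's encoded length in address order.
std::uint64_t bytesForInstructions(const std::vector<std::uint8_t>& lengths,
                                   std::size_t n) {
  assert(n <= lengths.size());
  std::uint64_t bytes = 0;
  for (std::size_t i = 0; i < n; ++i) {
    bytes += lengths[i];  // each instruction may have a different size
  }
  return bytes;
}
```

With illustrative lengths of 1, 3, 5, 2 and 7 bytes, skipping 3 instructions means skipping 9 bytes. Treating an `instructionOffset` of 3 as 3 bytes would instead land in the middle of the second instruction, which is the kind of misaligned decoding that corrupts the view. Negative offsets are harder still, since a variable-length encoding generally cannot be decoded backwards unambiguously.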