Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Incorrect remapping of DIArgList for LocalAsMetadata #51519

Open Quuxplusone opened 2 years ago

Quuxplusone commented 2 years ago
Bugzilla Link PR52552
Status CONFIRMED
Importance P normal
Reported by Mark Mendell (mark.p.mendell@intel.com)
Reported on 2021-11-18 17:59:40 -0800
Last modified on 2021-11-22 09:15:45 -0800
Version trunk
Hardware PC All
CC jeremy.morse.llvm@gmail.com, llvm-bugs@lists.llvm.org, mark.p.mendell@intel.com, paul_robinson@playstation.sony.com, stephen.tozer@sony.com
Fixed by commit(s)
Attachments foo.ll (1506 bytes, text/plain)
foo2.ll (221 bytes, text/plain)
patch (1486 bytes, text/plain)
Blocks
Blocked by
See also
Created attachment 25459
foo.ll - sample file containing dbg.value with DIArgList

Mapper::mapValue doesn't handle LocalAsMetadata in a DIArgList.  It replaces it
with undef, which is not properly written/read by the BitCode handlers.
This can be shown using llvm-link which causes remapping to be done.
foo.ll contains:
  call void @llvm.dbg.value(metadata !DIArgList(i32 0, i32 %A), metadata !5, metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus, DW_OP_stack_value)), !dbg !10

If it is linked with foo2.ll (nothing interesting in it), the result is
call void @llvm.dbg.value(metadata !DIArgList(i32 0, i32 undef), metadata !5,
metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
DW_OP_stack_value)), !dbg !10
If you then try running llvm-dis on the resulting BC file, it will crash
llvm-link -o result.bc foo.ll foo2.ll
llvm-dis result.bc # boom

The contents of foo.ll debug information is mostly garbage, just left over from
a cut down failing test case.

Tested fix:
ValueMapper.cpp ====
***************
*** 400,412 ****
          // be mapped via mapValue (apart from constants when we have no
          // module level changes, which have an identity mapping).
          if ((Flags & RF_NoModuleLevelChanges) && isa<ConstantAsMetadata>(VAM)) {
            MappedArgs.push_back(VAM);
          } else if (Value *LV = mapValue(VAM->getValue())) {
            MappedArgs.push_back(
!               LV == VAM->getValue() ? VAM : ValueAsMetadata::get(LV));
          } else {
            // If we cannot map the value, set the argument as undef.
            MappedArgs.push_back(ValueAsMetadata::get(
                UndefValue::get(VAM->getValue()->getType())));
          }
        }
--- 400,414 ----
          // be mapped via mapValue (apart from constants when we have no
          // module level changes, which have an identity mapping).
          if ((Flags & RF_NoModuleLevelChanges) && isa<ConstantAsMetadata>(VAM)) {
            MappedArgs.push_back(VAM);
          } else if (Value *LV = mapValue(VAM->getValue())) {
            MappedArgs.push_back(
!               LV == VAM->getValue() ? VAM : ValueAsMetadata::get(LV));
!         } else if (isa<LocalAsMetadata>(VAM)) {
!           MappedArgs.push_back(VAM);
          } else {
            // If we cannot map the value, set the argument as undef.
            MappedArgs.push_back(ValueAsMetadata::get(
                UndefValue::get(VAM->getValue()->getType())));
          }
        }
Quuxplusone commented 2 years ago

Attached foo.ll (1506 bytes, text/plain): foo.ll - sample file containing dbg.value with DIArgList

Quuxplusone commented 2 years ago

Attached foo2.ll (221 bytes, text/plain): foo2.ll - second file with another function in it for llvm-link

Quuxplusone commented 2 years ago

Attached patch (1486 bytes, text/plain): Proposed fix

Quuxplusone commented 2 years ago

Mapper::mapValue does handle LocalAsMetadata in the second clause if the if/else block:

  } else if (Value *LV = mapValue(VAM->getValue())) {
    MappedArgs.push_back(
        LV == VAM->getValue() ? VAM : ValueAsMetadata::get(LV));

This clause applies to both ConstantAsMetadata (if the RF_NoModuleLevelChanges flag isn't set) and LocalAsMetadata; it calls Mapper::mapValue on the value that the ValueAsMetadata points to, and then creates a new ValueAsMetadata that wraps that value. This is a necessary step because the original LocalAsMetadata may not be valid in the new scope, but may have an equivalent value that we can map to. The cause of the value being mapped to undef in this case would be that mapValue(i32 %A) is returning nullptr, which is a slightly different issue.

I am also surprised that llvm-dis is blowing up after encountering an undef value in a DIArgList - undef values in a debug intrinsic are commonly used to declare a missing value for a variable, and should be valid LLVM - this also seems like an error to me, either with llvm-dis or with the bitcode representation for a DIArgList.

Quuxplusone commented 2 years ago

Further investigation notes: It looks as though the difference between DIArgList and LocalAsMetadata is that DIArgList does not respect RF_IgnoreMissingLocals. Essentially, for LocalAsMetadata, it also fails to correctly map the value, but the RF_IgnoreMissingLocals flag causes it to leave the pointer unchanged instead of setting it to undef. I'm not sure whether the lack of a saved value mapping for the instructions is an issue, but at the very least DIArgList will have parity with LocalAsMetadata with the appropriate change.

The final fix is almost exactly the fix that was posted earlier, with the exception that the behaviour must only be used when RF_IgnoreMissingLocals is set.

Quuxplusone commented 2 years ago

Oh, one more thing I forgot - I can't reproduce the llvm-dis crash, do you have any more details for that one?

Quuxplusone commented 2 years ago

My version of LLVM is 2 levels downstream from ToT LLVM, so I wouldn't worry about it too much.

What does the final fix look like?

Quuxplusone commented 2 years ago
(In reply to Mark Mendell from comment #6)
> What does the final fix look like?

I've opened up a review for it here: https://reviews.llvm.org/D114355
Quuxplusone commented 2 years ago

Just FYI: The llvm-dis crash is caused by use being out of date wrt ToT and not having the fixes to properly write out the BC for the 'i32 0' argument, which causes a null pointer to be created in the internal IR for that operand. I am back-porting the fixes