Indirect call address is possibly being loaded incorrectly

In the attached code I attempt to do a callback from C to a Swift function (actually a Swift lambda but I suspect the result would be the same. I'll check for completeness).

The _analogReadAsync function takes a pin and a callback as parameters. When compiling C code that passes a function to this (see the "test" functions), it works. Swift code calling into this fails to execute the callback correctly, with random results.

Looking at the LLVM IR produced by swiftc, it looks OK to my amateur eyes:

define void @_TF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_(i8, i8*) #2 {
entry:
  %2 = call zeroext i1 @_analogReadAsync(i8 zeroext %0, void (i16)* @_TToFF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_U_FS1_T_)
  ret void
}

declare zeroext i1 @_analogReadAsync(i8 zeroext, void (i16)*) #1

to set the callback to this function...

define linkonce_odr hidden void @_TToFF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_U_FS1_T_(i16 zeroext) #2 {
entry:
  call void @_TFF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_U_FS1_T_(i16 %0) #3
  ret void
}

This is the assembly emitted...

000002fc <_TF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_>:
     2fc:   cf 93           push    r28
     2fe:   df 93           push    r29
     300:   cd b7           in  r28, 0x3d   ; 61
     302:   de b7           in  r29, 0x3e   ; 62
     304:   6a e2           ldi r22, 0x2A   ; 42
     306:   73 e0           ldi r23, 0x03   ; 3
     308:   0e 94 8f 05     call    0xb1e   ; 0xb1e <_analogReadAsync>
     30c:   df 91           pop r29
     30e:   cf 91           pop r28
     310:   08 95           ret

0000032a <_TToFF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_U_FS1_T_>:
     32a:   cf 93           push    r28
     32c:   df 93           push    r29
     32e:   cd b7           in  r28, 0x3d   ; 61
     330:   de b7           in  r29, 0x3e   ; 62
     332:   0e 94 89 01     call    0x312   ; 0x312 <_TFF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_U_FS1_T_>
     336:   df 91           pop r29
     338:   cf 91           pop r28
     33a:   08 95           ret

At first I couldn't see the problem, r22 and r23 are set to 0x032a, which is the address of the callback function. But when this is subsequently passed around and used for an ICALL later, it all goes wrong. Because ICALL expects a program counter value, not an absolute address. And the program counter refers to 16 bit wide addresses. So we should be loading 0x195 into the program counter, 0x01 into r23 and 0x95 into r22.

Compare this to what avr-gcc does...

This test code:

void _analogReadAsyncTestCallback(unsigned short value) {
    _digitalWrite(13, true);
}

void _analogReadAsyncTest(unsigned char pin) {
    _analogReadAsync(pin,_analogReadAsyncTestCallback);
}

Creates this assembly:

00000b8c <_analogReadAsyncTest>:
     b8c:   61 e2           ldi r22, 0x21   ; 33
     b8e:   74 e0           ldi r23, 0x04   ; 4
     b90:   0c 94 8f 05     jmp 0xb1e   ; 0xb1e <_analogReadAsync>

00000842 <_analogReadAsyncTestCallback>:
     842:   61 e0           ldi r22, 0x01   ; 1
     844:   8d e0           ldi r24, 0x0D   ; 13
     846:   0c 94 9c 02     jmp 0x538   ; 0x538 <_digitalWrite>

So here you can see, to get to address 0x842, the compiler creates code to load r22/r23 with the program counter value 0x0421, exactly half the address.

avrio.c.txt main.elf.s.txt pins.ll.txt pins.swift.txt

Operational note: I'd highly encourage you to read up on GitHub's Markdown syntax. Pasting giant blocks of code without proper formatting makes it extremely difficult to read and follow along. I've been editing your previous posts, but I'd rather not need to.

OK, good point, sorry shepmaster... I'll try to be more careful. Thank you for tidying up after me!

More analysis...

Looking at this fairly simple reductive case, and using clang/llvm to study the back end behaviour instead of swiftc/llvm (where I originally discovered the bug), created a reduced test case (extract here from analog.c):

void _analogReadAsyncTestCallback(unsigned short value) {
    _digitalWrite(13, true);
}

void _analogReadAsyncTest(unsigned char pin) {
    _analogReadAsync(pin,_analogReadAsyncTestCallback);
}

// returns false if a conversion is already in progress or the pin number is invalid
__attribute__((noinline)) _Bool _analogReadAsync(unsigned char pin, adcCallback callback) {
    if (pin>7) return false;

    if (!_adcStarted()) {
        _startupADC();
    }

    // wait for any pending conversion and/or callback to complete
    if ((ADCSRA&(1<<ADSC))||currentCallback) {
        return false;
    } else {
        currentCallback = callback;
        ADMUX = pin & 7 | 0x40;
        ADCSRA |= (1<<ADSC)|(1<<ADIE);
        return true; // we started the conversion cleanly
    }
}

Looking at the assembly for _analogReadAsyncTest, I can see a difference in how the fixups are generated.

analog.s is assembly created by avr-gcc -S analog.o.s is assembly created by avr-gcc -c followed by disassembly with avr-objdump analog.clang.s is assembly created by clang -S (clang compiled with the avr backend)

analog.c.txt analog.clang.s.txt analog.o.s.txt analog.s.txt

analog.clang.s (assembly direct from clang to .s)
***

    .globl  _analogReadAsyncTest
    .p2align    1
    .type   _analogReadAsyncTest,@function
_analogReadAsyncTest:                   ; @_analogReadAsyncTest
; BB#0:                                 ; %entry
    push    r28
    push    r29
    in  r28, 61
    in  r29, 62
    ldi r22, lo8(_analogReadAsyncTestCallback)
    ldi r23, hi8(_analogReadAsyncTestCallback)
    call    _analogReadAsync
    pop r29
    pop r28
    ret

analog.s (assembly direct from gcc to .s)
***
.global _analogReadAsyncTest
    .type   _analogReadAsyncTest, @function
_analogReadAsyncTest:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
    ldi r22,lo8(gs(_analogReadAsyncTestCallback))
    ldi r23,hi8(gs(_analogReadAsyncTestCallback))
    jmp _analogReadAsync
    .size   _analogReadAsyncTest, .-_analogReadAsyncTest
    .local  currentCallback
    .comm   currentCallback,2,1
    .ident  "GCC: (GNU) 6.3.0"

analog.o.s
***
0000034a <_analogReadAsyncTest>:
 34a:   60 e0           ldi r22, 0x00   ; 0
            34a: R_AVR_LO8_LDI_GS   .text
 34c:   70 e0           ldi r23, 0x00   ; 0
            34c: R_AVR_HI8_LDI_GS   .text
 34e:   0c 94 00 00     jmp 0   ; 0x0 <_analogReadAsyncTestCallback>
            34e: R_AVR_CALL .text+0x2dc

The fixups generated by avr-gcc look like R_AVR_LO8_LDI_GS / R_AVR_HI8_LDI_GS or (equivalently?) lo8(gs(_analogReadAsyncTestCallback)) / hi8(gs(_analogReadAsyncTestCallback)).

Whereas the fixups generated by clang/llvm look like lo8(_analogReadAsyncTestCallback) / hi8(_analogReadAsyncTestCallback).

FYI, the relocs are in include/llvm/Support/ELFRelocs/AVR.def


#ifndef ELF_RELOC
#error "ELF_RELOC must be defined"
#endif

ELF_RELOC(R_AVR_NONE,                  0)
ELF_RELOC(R_AVR_32,                    1)
ELF_RELOC(R_AVR_7_PCREL,               2)
ELF_RELOC(R_AVR_13_PCREL,              3)
ELF_RELOC(R_AVR_16,                    4)
ELF_RELOC(R_AVR_16_PM,                 5)
ELF_RELOC(R_AVR_LO8_LDI,               6)
ELF_RELOC(R_AVR_HI8_LDI,               7)
ELF_RELOC(R_AVR_HH8_LDI,               8)
ELF_RELOC(R_AVR_LO8_LDI_NEG,           9)
ELF_RELOC(R_AVR_HI8_LDI_NEG,          10)
ELF_RELOC(R_AVR_HH8_LDI_NEG,          11)
ELF_RELOC(R_AVR_LO8_LDI_PM,           12)
ELF_RELOC(R_AVR_HI8_LDI_PM,           13)
ELF_RELOC(R_AVR_HH8_LDI_PM,           14)
ELF_RELOC(R_AVR_LO8_LDI_PM_NEG,       15)
ELF_RELOC(R_AVR_HI8_LDI_PM_NEG,       16)
ELF_RELOC(R_AVR_HH8_LDI_PM_NEG,       17)
ELF_RELOC(R_AVR_CALL,                 18)
ELF_RELOC(R_AVR_LDI,                  19)
ELF_RELOC(R_AVR_6,                    20)
ELF_RELOC(R_AVR_6_ADIW,               21)
ELF_RELOC(R_AVR_MS8_LDI,              22)
ELF_RELOC(R_AVR_MS8_LDI_NEG,          23)
ELF_RELOC(R_AVR_LO8_LDI_GS,           24)
ELF_RELOC(R_AVR_HI8_LDI_GS,           25)
ELF_RELOC(R_AVR_8,                    26)
ELF_RELOC(R_AVR_8_LO8,                27)
ELF_RELOC(R_AVR_8_HI8,                28)
ELF_RELOC(R_AVR_8_HLO8,               29)
ELF_RELOC(R_AVR_SYM_DIFF,             30)
ELF_RELOC(R_AVR_16_LDST,              31)
ELF_RELOC(R_AVR_LDS_STS_16,           33)
ELF_RELOC(R_AVR_PORT6,                34)
ELF_RELOC(R_AVR_PORT5,                35)

If we modify the back end to use these _gs fixups, a few changes will be needed.

For example in MCTargetDesc/AVRAsmBackend.cpp

// Prepare value for the target space for it
void AVRAsmBackend::adjustFixupValue(const MCFixup &Fixup, uint64_t &Value,
                                     MCContext *Ctx) const {
  // The size of the fixup in bits.
  uint64_t Size = AVRAsmBackend::getFixupKindInfo(Fixup.getKind()).TargetSize;

  unsigned Kind = Fixup.getKind();

  switch (Kind) {
  default:
    llvm_unreachable("unhandled fixup");
  case AVR::fixup_7_pcrel:
    adjust::fixup_7_pcrel(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_13_pcrel:
    adjust::fixup_13_pcrel(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_call:
    adjust::fixup_call(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_ldi:
    adjust::ldi::fixup(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_lo8_ldi:
  case AVR::fixup_lo8_ldi_pm:
    if (Kind == AVR::fixup_lo8_ldi_pm) adjust::pm(Value);

    adjust::ldi::lo8(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_hi8_ldi:
  case AVR::fixup_hi8_ldi_pm:
    if (Kind == AVR::fixup_hi8_ldi_pm) adjust::pm(Value);

    adjust::ldi::hi8(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_hh8_ldi:
  case AVR::fixup_hh8_ldi_pm:
    if (Kind == AVR::fixup_hh8_ldi_pm) adjust::pm(Value);

    adjust::ldi::hh8(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_ms8_ldi:
    adjust::ldi::ms8(Size, Fixup, Value, Ctx);
    break;

  case AVR::fixup_lo8_ldi_neg:
  case AVR::fixup_lo8_ldi_pm_neg:
    if (Kind == AVR::fixup_lo8_ldi_pm_neg) adjust::pm(Value);

    adjust::ldi::neg(Value);
    adjust::ldi::lo8(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_hi8_ldi_neg:
  case AVR::fixup_hi8_ldi_pm_neg:
    if (Kind == AVR::fixup_hi8_ldi_pm_neg) adjust::pm(Value);

    adjust::ldi::neg(Value);
    adjust::ldi::hi8(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_hh8_ldi_neg:
  case AVR::fixup_hh8_ldi_pm_neg:
    if (Kind == AVR::fixup_hh8_ldi_pm_neg) adjust::pm(Value);

    adjust::ldi::neg(Value);
    adjust::ldi::hh8(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_ms8_ldi_neg:
    adjust::ldi::neg(Value);
    adjust::ldi::ms8(Size, Fixup, Value, Ctx);
    break;
  case AVR::fixup_16:
    adjust::unsigned_width(16, Value, std::string("port number"), Fixup, Ctx);

    Value &= 0xffff;
    break;
  case AVR::fixup_6_adiw:
    adjust::fixup_6_adiw(Fixup, Value, Ctx);
    break;

  case AVR::fixup_port5:
    adjust::fixup_port5(Fixup, Value, Ctx);
    break;

  case AVR::fixup_port6:
    adjust::fixup_port6(Fixup, Value, Ctx);
    break;

  // Fixups which do not require adjustments.
  case FK_Data_2:
  case FK_Data_4:
  case FK_Data_8:
    break;

  case FK_GPRel_4:
    llvm_unreachable("don't know how to adjust this fixup");
    break;
  }
}

and in MCTargetDesc/AVRFixupKinds.h

enum Fixups {
  /// A 32-bit AVR fixup.
  fixup_32 = FirstTargetFixupKind,

  /// A 7-bit PC-relative fixup for the family of conditional
  /// branches which take 7-bit targets (BRNE,BRGT,etc).
  fixup_7_pcrel,
  /// A 12-bit PC-relative fixup for the family of branches
  /// which take 12-bit targets (RJMP,RCALL,etc).
  /// \note Although the fixup is labelled as 13 bits, it
  ///       is actually only encoded in 12. The reason for
  ///       The nonmenclature is that AVR branch targets are
  ///       rightshifted by 1, because instructions are always
  ///       aligned to 2 bytes, so the 0'th bit is always 0.
  ///       This way there is 13-bits of precision.
  fixup_13_pcrel,

  /// A 16-bit address.
  fixup_16,
  /// A 16-bit program memory address.
  fixup_16_pm,

  /// Replaces the 8-bit immediate with another value.
  fixup_ldi,

  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the lower 8 bits of a 16-bit value (bits 0-7).
  fixup_lo8_ldi,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a 16-bit value (bits 8-15).
  fixup_hi8_ldi,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a 24-bit value (bits 16-23).
  fixup_hh8_ldi,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a 32-bit value (bits 24-31).
  fixup_ms8_ldi,

  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the lower 8 bits of a negated 16-bit value (bits 0-7).
  fixup_lo8_ldi_neg,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a negated 16-bit value (bits 8-15).
  fixup_hi8_ldi_neg,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a negated negated 24-bit value (bits 16-23).
  fixup_hh8_ldi_neg,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a negated negated 32-bit value (bits 24-31).
  fixup_ms8_ldi_neg,

  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the lower 8 bits of a 16-bit program memory address value (bits 0-7).
  fixup_lo8_ldi_pm,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a 16-bit program memory address value (bits
  /// 8-15).
  fixup_hi8_ldi_pm,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a 24-bit program memory address value (bits
  /// 16-23).
  fixup_hh8_ldi_pm,

  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the lower 8 bits of a negated 16-bit program memory address value
  /// (bits 0-7).
  fixup_lo8_ldi_pm_neg,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a negated 16-bit program memory address value
  /// (bits 8-15).
  fixup_hi8_ldi_pm_neg,
  /// Replaces the immediate operand of a 16-bit `Rd, K` instruction
  /// with the upper 8 bits of a negated 24-bit program memory address value
  /// (bits 16-23).
  fixup_hh8_ldi_pm_neg,

  /// A 22-bit fixup for the target of a `CALL k` or `JMP k` instruction.
  fixup_call,

  fixup_6,
  /// A symbol+addr fixup for the `LDD <x>+<n>, <r>" family of instructions.
  fixup_6_adiw,

  fixup_lo8_ldi_gs,
  fixup_hi8_ldi_gs,

  fixup_8,
  fixup_8_lo8,
  fixup_8_hi8,
  fixup_8_hlo8,

  /// Fixup to calculate the difference between two symbols.
  /// Is the only stateful fixup. We do not support it yet.
  fixup_sym_diff,
  fixup_16_ldst,

  fixup_lds_sts_16,

  /// A 6-bit port address.
  fixup_port6,
  /// A 5-bit port address.
  fixup_port5,

  // Marker
  LastTargetFixupKind,
  NumTargetFixupKinds = LastTargetFixupKind - FirstTargetFixupKind
};

the comment should be filled out to help future programmers.

Finally in MCTargetDesc/AVRMCExpr.cpp, getFixupKind should be extended:

AVR::Fixups AVRMCExpr::getFixupKind() const {
  AVR::Fixups Kind = AVR::Fixups::LastTargetFixupKind;

  switch (getKind()) {
  case VK_AVR_LO8:
    Kind = isNegated() ? AVR::fixup_lo8_ldi_neg : AVR::fixup_lo8_ldi;
    break;
  case VK_AVR_HI8:
    Kind = isNegated() ? AVR::fixup_hi8_ldi_neg : AVR::fixup_hi8_ldi;
    break;
  case VK_AVR_HH8:
    Kind = isNegated() ? AVR::fixup_hh8_ldi_neg : AVR::fixup_hh8_ldi;
    break;
  case VK_AVR_HHI8:
    Kind = isNegated() ? AVR::fixup_ms8_ldi_neg : AVR::fixup_ms8_ldi;
    break;

  case VK_AVR_PM_LO8:
    Kind = isNegated() ? AVR::fixup_lo8_ldi_pm_neg : AVR::fixup_lo8_ldi_pm;
    break;
  case VK_AVR_PM_HI8:
    Kind = isNegated() ? AVR::fixup_hi8_ldi_pm_neg : AVR::fixup_hi8_ldi_pm;
    break;
  case VK_AVR_PM_HH8:
    Kind = isNegated() ? AVR::fixup_hh8_ldi_pm_neg : AVR::fixup_hh8_ldi_pm;
    break;

  case VK_AVR_None:
    llvm_unreachable("Uninitialized expression");
  }

  return Kind;
}

to cover the _GS fixups. Any other fixes needed can be picked up by a search for fixup_lo8_ldi.

This also points to the next step, the enum...

class AVRMCExpr : public MCTargetExpr {
public:
  /// Specifies the type of an expression.
  enum VariantKind {
    VK_AVR_None,

    VK_AVR_HI8,  ///< Corresponds to `hi8()`.
    VK_AVR_LO8,  ///< Corresponds to `lo8()`.
    VK_AVR_HH8,  ///< Corresponds to `hlo8() and hh8()`.
    VK_AVR_HHI8, ///< Corresponds to `hhi8()`.

    VK_AVR_PM_LO8, ///< Corresponds to `pm_lo8()`.
    VK_AVR_PM_HI8, ///< Corresponds to `pm_hi8()`.
    VK_AVR_PM_HH8  ///< Corresponds to `pm_hh8()`.
  };

from MCTargetDesc/AVRMCExpr.h should be extended with _GS types. And search for associated improvements using VK_AVR_LO8 to make sure that machine code writing, assembly writing and assembly reading all work.

The driving engine of this should probably(?) be an extra target flag in AVRMCInstLower::lowerSymbolOperand from AVRMCInstLower.cpp, where the machine code instance (MC) for the fixups are created:

MCOperand AVRMCInstLower::lowerSymbolOperand(const MachineOperand &MO,
                                             MCSymbol *Sym) const {
  unsigned char TF = MO.getTargetFlags();
  const MCExpr *Expr = MCSymbolRefExpr::create(Sym, Ctx);

  bool IsNegated = false;
  if (TF & AVRII::MO_NEG) { IsNegated = true; }

  if (!MO.isJTI() && MO.getOffset()) {
    Expr = MCBinaryExpr::createAdd(
        Expr, MCConstantExpr::create(MO.getOffset(), Ctx), Ctx);
  }

  if (TF & AVRII::MO_LO) {
    Expr = AVRMCExpr::create(AVRMCExpr::VK_AVR_LO8, Expr, IsNegated, Ctx);
  } else if (TF & AVRII::MO_HI) {
    Expr = AVRMCExpr::create(AVRMCExpr::VK_AVR_HI8, Expr, IsNegated, Ctx);
  } else if (TF != 0) {
    llvm_unreachable("Unknown target flag on symbol operand");
  }

  return MCOperand::createExpr(Expr);
}

The target flags are defined in AVRInstrInfo.h:

enum TOF {
  MO_NO_FLAG,

  /// On a symbol operand, this represents the lo part.
  MO_LO = (1 << 1),

  /// On a symbol operand, this represents the hi part.
  MO_HI = (1 << 2),

  /// On a symbol operand, this represents it has to be negated.
  MO_NEG = (1 << 3)
};

We could add an extra flag in there like:

  /// On a symbol operand, this represents a function, these need special fixups on AVR to get the address right (it's a PC value not a byte address).
  MO_GS = (1 << 4)

I am not clear where the target flags are set and how we would pick up that the address being interpreted is that of a function so the _GS flag would need to be applied. Any help gratefully received!!

The IR in the description is missing a forward declaration, replicating the entire file in one block here for easy copy-paste.

define linkonce_odr hidden void @_TToFF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_U_FS1_T_(i16 zeroext) #2 {
entry:
  call void @_TFF3AVR15analogReadAsyncFT3pinVs5UInt88callbackcVs6UInt16T__T_U_FS1_T_(i16 %0) #3
  ret void
}

EDIT Didn't realise there were file links in the description

I've looked into this a lot more I believe that we should be using the *_GS fixups, but only when the address is in program memory.

Currently in AVRMCInstLower.cpp:71, we check for GlobalAddress and emit an AVRMCExpr with one of the LDI fixups if needed. Given that we have access to the GlobalValue* (see MO.getGlobal()), we can lookup the type of the value (which will be a pointer type) and we can then look up the address space on there.

Here is the predicate we should be able to use:

int AddrSpace = ((llvm::PointerType*)MO.getGlobal()->getType())->getAddressSpace();

switch (AddrSpace) {
    case AVR::DataMemory: /* use the standard LDI_* fixup to access ram */
    case AVR::FlashMemory: /* use the *_GS variant to access program memory */
}

The reason we cannot is because LLVM defaults everything to address space zero (data memory, RAM), including functions and global variables.

We will need to do something similar to #47. In that change we made switch lookup table global variables default to address space one (program memory). We should do the same to functions, any anything else that it make sense on.

Once we have the correct address space information associated with all our pointer types, we can then check the address space inside lowerSymbolOperand and fix this properly.

Here's the C for the full testcase

#include <avr/io.h>

#define true 1
#define false 0

typedef void (*Callback)(unsigned short);

void _analogReadAsync(unsigned char pin, Callback);

void _digitalWrite(unsigned char pin, _Bool a);

void _analogReadAsyncTestCallback(unsigned short value) {
    _digitalWrite(13, true);
}

void _analogReadAsyncTest(unsigned char pin) {
    _analogReadAsync(pin,_analogReadAsyncTestCallback);
}

__attribute__((noinline)) unsigned char pinMask(unsigned char pin) {
    if (pin<8) {
        return ((unsigned char)1) << pin;
    } else if (pin<14) {
        return ((unsigned char)1) << (pin - (unsigned char)8);
    }

    return 0;
}

// reads as false if you pass an invalid pin number like 14
_Bool _digitalRead(unsigned char pin) {
    unsigned char mask = pinMask(pin);

    if (!mask) return false;

    if (pin<8) {
        return PIND & mask;
    } else {
        return PINB & mask;
    }

    return false;
}

/*
    These functions are stupidly verbose and spread out to try and force the compiler to make code that works.
    Currently there are bugs in the AVR back end that produce non functional code unless we force its hand.
*/

__attribute__((noinline)) void resetPortD(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        PORTD &= ~mask;
    }
}

__attribute__((noinline)) void resetPortB(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        PORTB &= ~mask;
    }
}

__attribute__((noinline)) void setPortD(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        PORTD |= mask;
    }
}

__attribute__((noinline)) void setPortB(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        PORTB |= mask;
    }
}

void _digitalWrite(unsigned char pin, _Bool value) {
    if (value) {
        if (pin<8) {
            setPortD(pin);
        } else {
            setPortB(pin);
        }
    } else {
        if (pin<8) {
            resetPortD(pin);
        } else {
            resetPortB(pin);
        }
    }
}

void enablePortDWrite(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        DDRD |= mask;
    }
}

void enablePortBWrite(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        DDRB |= mask;
    }
}

void disablePortDWrite(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        DDRD &= ~mask;
    }
}

void disablePortBWrite(unsigned char pin) {
    unsigned char mask = pinMask(pin);
    if (mask) {
        DDRB &= ~mask;
    }
}

void _pinMode(unsigned char pin, _Bool write) {
    if (write) {
        if (pin<8) {
            enablePortDWrite(pin);
        } else {
            enablePortBWrite(pin);
        }
    } else {
        if (pin<8) {
            disablePortDWrite(pin);
        } else {
            disablePortBWrite(pin);
        }
    }
}

Here it is in IR form (note: uses IR syntax from LLVM master)

; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-p:16:16:16-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-n8"
target triple = "avr-atmel-none"

; Function Attrs: noinline nounwind optnone
define void @_analogReadAsyncTestCallback(i16 zeroext %value) #0 {
entry:
  %value.addr = alloca i16, align 2
  store i16 %value, i16* %value.addr, align 2
  call void @_digitalWrite(i8 zeroext 13, i1 zeroext true)
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @_digitalWrite(i8 zeroext %pin, i1 zeroext %value) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %value.addr = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %frombool = zext i1 %value to i8
  store i8 %frombool, i8* %value.addr, align 1
  %0 = load i8, i8* %value.addr, align 1
  %tobool = trunc i8 %0 to i1
  br i1 %tobool, label %if.then, label %if.else3

if.then:                                          ; preds = %entry
  %1 = load i8, i8* %pin.addr, align 1
  %conv = zext i8 %1 to i16
  %cmp = icmp slt i16 %conv, 8
  br i1 %cmp, label %if.then2, label %if.else

if.then2:                                         ; preds = %if.then
  %2 = load i8, i8* %pin.addr, align 1
  call void @setPortD(i8 zeroext %2)
  br label %if.end

if.else:                                          ; preds = %if.then
  %3 = load i8, i8* %pin.addr, align 1
  call void @setPortB(i8 zeroext %3)
  br label %if.end

if.end:                                           ; preds = %if.else, %if.then2
  br label %if.end10

if.else3:                                         ; preds = %entry
  %4 = load i8, i8* %pin.addr, align 1
  %conv4 = zext i8 %4 to i16
  %cmp5 = icmp slt i16 %conv4, 8
  br i1 %cmp5, label %if.then7, label %if.else8

if.then7:                                         ; preds = %if.else3
  %5 = load i8, i8* %pin.addr, align 1
  call void @resetPortD(i8 zeroext %5)
  br label %if.end9

if.else8:                                         ; preds = %if.else3
  %6 = load i8, i8* %pin.addr, align 1
  call void @resetPortB(i8 zeroext %6)
  br label %if.end9

if.end9:                                          ; preds = %if.else8, %if.then7
  br label %if.end10

if.end10:                                         ; preds = %if.end9, %if.end
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @_analogReadAsyncTest(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  call void @_analogReadAsync(i8 zeroext %0, void (i16)* @_analogReadAsyncTestCallback)
  ret void
}

declare void @_analogReadAsync(i8 zeroext, void (i16)*) #1

; Function Attrs: noinline nounwind optnone
define zeroext i8 @pinMask(i8 zeroext %pin) #0 {
entry:
  %retval = alloca i8, align 1
  %pin.addr = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %conv = zext i8 %0 to i16
  %cmp = icmp slt i16 %conv, 8
  br i1 %cmp, label %if.then, label %if.else

if.then:                                          ; preds = %entry
  %1 = load i8, i8* %pin.addr, align 1
  %conv2 = zext i8 %1 to i16
  %shl = shl i16 1, %conv2
  %conv3 = trunc i16 %shl to i8
  store i8 %conv3, i8* %retval, align 1
  br label %return

if.else:                                          ; preds = %entry
  %2 = load i8, i8* %pin.addr, align 1
  %conv4 = zext i8 %2 to i16
  %cmp5 = icmp slt i16 %conv4, 14
  br i1 %cmp5, label %if.then7, label %if.end

if.then7:                                         ; preds = %if.else
  %3 = load i8, i8* %pin.addr, align 1
  %conv8 = zext i8 %3 to i16
  %sub = sub nsw i16 %conv8, 8
  %shl9 = shl i16 1, %sub
  %conv10 = trunc i16 %shl9 to i8
  store i8 %conv10, i8* %retval, align 1
  br label %return

if.end:                                           ; preds = %if.else
  br label %if.end11

if.end11:                                         ; preds = %if.end
  store i8 0, i8* %retval, align 1
  br label %return

return:                                           ; preds = %if.end11, %if.then7, %if.then
  %4 = load i8, i8* %retval, align 1
  ret i8 %4
}

; Function Attrs: noinline nounwind optnone
define zeroext i1 @_digitalRead(i8 zeroext %pin) #0 {
entry:
  %retval = alloca i1, align 1
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  store i1 false, i1* %retval, align 1
  br label %return

if.end:                                           ; preds = %entry
  %2 = load i8, i8* %pin.addr, align 1
  %conv = zext i8 %2 to i16
  %cmp = icmp slt i16 %conv, 8
  br i1 %cmp, label %if.then2, label %if.else

if.then2:                                         ; preds = %if.end
  %3 = load volatile i8, i8* inttoptr (i16 41 to i8*), align 1
  %conv3 = zext i8 %3 to i16
  %4 = load i8, i8* %mask, align 1
  %conv4 = zext i8 %4 to i16
  %and = and i16 %conv3, %conv4
  %tobool5 = icmp ne i16 %and, 0
  store i1 %tobool5, i1* %retval, align 1
  br label %return

if.else:                                          ; preds = %if.end
  %5 = load volatile i8, i8* inttoptr (i16 35 to i8*), align 1
  %conv6 = zext i8 %5 to i16
  %6 = load i8, i8* %mask, align 1
  %conv7 = zext i8 %6 to i16
  %and8 = and i16 %conv6, %conv7
  %tobool9 = icmp ne i16 %and8, 0
  store i1 %tobool9, i1* %retval, align 1
  br label %return

return:                                           ; preds = %if.else, %if.then2, %if.then
  %7 = load i1, i1* %retval, align 1
  ret i1 %7
}

; Function Attrs: noinline nounwind optnone
define void @resetPortD(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %neg = xor i16 %conv, -1
  %3 = load volatile i8, i8* inttoptr (i16 43 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %and = and i16 %conv1, %neg
  %conv2 = trunc i16 %and to i8
  store volatile i8 %conv2, i8* inttoptr (i16 43 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @resetPortB(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %neg = xor i16 %conv, -1
  %3 = load volatile i8, i8* inttoptr (i16 37 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %and = and i16 %conv1, %neg
  %conv2 = trunc i16 %and to i8
  store volatile i8 %conv2, i8* inttoptr (i16 37 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @setPortD(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %3 = load volatile i8, i8* inttoptr (i16 43 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %or = or i16 %conv1, %conv
  %conv2 = trunc i16 %or to i8
  store volatile i8 %conv2, i8* inttoptr (i16 43 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @setPortB(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %3 = load volatile i8, i8* inttoptr (i16 37 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %or = or i16 %conv1, %conv
  %conv2 = trunc i16 %or to i8
  store volatile i8 %conv2, i8* inttoptr (i16 37 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @enablePortDWrite(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %3 = load volatile i8, i8* inttoptr (i16 42 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %or = or i16 %conv1, %conv
  %conv2 = trunc i16 %or to i8
  store volatile i8 %conv2, i8* inttoptr (i16 42 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @enablePortBWrite(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %3 = load volatile i8, i8* inttoptr (i16 36 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %or = or i16 %conv1, %conv
  %conv2 = trunc i16 %or to i8
  store volatile i8 %conv2, i8* inttoptr (i16 36 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @disablePortDWrite(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %neg = xor i16 %conv, -1
  %3 = load volatile i8, i8* inttoptr (i16 42 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %and = and i16 %conv1, %neg
  %conv2 = trunc i16 %and to i8
  store volatile i8 %conv2, i8* inttoptr (i16 42 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @disablePortBWrite(i8 zeroext %pin) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %mask = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %0 = load i8, i8* %pin.addr, align 1
  %call = call zeroext i8 @pinMask(i8 zeroext %0)
  store i8 %call, i8* %mask, align 1
  %1 = load i8, i8* %mask, align 1
  %tobool = icmp ne i8 %1, 0
  br i1 %tobool, label %if.then, label %if.end

if.then:                                          ; preds = %entry
  %2 = load i8, i8* %mask, align 1
  %conv = zext i8 %2 to i16
  %neg = xor i16 %conv, -1
  %3 = load volatile i8, i8* inttoptr (i16 36 to i8*), align 1
  %conv1 = zext i8 %3 to i16
  %and = and i16 %conv1, %neg
  %conv2 = trunc i16 %and to i8
  store volatile i8 %conv2, i8* inttoptr (i16 36 to i8*), align 1
  br label %if.end

if.end:                                           ; preds = %if.then, %entry
  ret void
}

; Function Attrs: noinline nounwind optnone
define void @_pinMode(i8 zeroext %pin, i1 zeroext %write) #0 {
entry:
  %pin.addr = alloca i8, align 1
  %write.addr = alloca i8, align 1
  store i8 %pin, i8* %pin.addr, align 1
  %frombool = zext i1 %write to i8
  store i8 %frombool, i8* %write.addr, align 1
  %0 = load i8, i8* %write.addr, align 1
  %tobool = trunc i8 %0 to i1
  br i1 %tobool, label %if.then, label %if.else3

if.then:                                          ; preds = %entry
  %1 = load i8, i8* %pin.addr, align 1
  %conv = zext i8 %1 to i16
  %cmp = icmp slt i16 %conv, 8
  br i1 %cmp, label %if.then2, label %if.else

if.then2:                                         ; preds = %if.then
  %2 = load i8, i8* %pin.addr, align 1
  call void @enablePortDWrite(i8 zeroext %2)
  br label %if.end

if.else:                                          ; preds = %if.then
  %3 = load i8, i8* %pin.addr, align 1
  call void @enablePortBWrite(i8 zeroext %3)
  br label %if.end

if.end:                                           ; preds = %if.else, %if.then2
  br label %if.end10

if.else3:                                         ; preds = %entry
  %4 = load i8, i8* %pin.addr, align 1
  %conv4 = zext i8 %4 to i16
  %cmp5 = icmp slt i16 %conv4, 8
  br i1 %cmp5, label %if.then7, label %if.else8

if.then7:                                         ; preds = %if.else3
  %5 = load i8, i8* %pin.addr, align 1
  call void @disablePortDWrite(i8 zeroext %5)
  br label %if.end9

if.else8:                                         ; preds = %if.else3
  %6 = load i8, i8* %pin.addr, align 1
  call void @disablePortBWrite(i8 zeroext %6)
  br label %if.end9

if.end9:                                          ; preds = %if.else8, %if.then7
  br label %if.end10

if.end10:                                         ; preds = %if.end9, %if.end
  ret void
}

attributes #0 = { noinline nounwind optnone "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="atmega328p" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="atmega328p" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 2}
!1 = !{!"clang version 5.0.0 (trunk 307771) (llvm/trunk 307882)"}

Here's a really reduced testcase

target datalayout = "e-p:16:16:16-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-n8"
target triple = "avr-atmel-none"

declare void @_analogReadAsyncTestCallback(i16 zeroext)

; Function Attrs: noinline nounwind optnone
define void @_analogReadAsyncTest(i8 zeroext %pin) {
entry:
  call void @_analogReadAsync(i8 zeroext undef, void (i16)* @_analogReadAsyncTestCallback)
  ret void
}

declare void @_analogReadAsync(i8 zeroext, void (i16)*)

Workaround commited to LLVM in r307888 and cherry-picked in b8a10626c7c1c8a8c06d042e416c3ee3e5dce8cf.

I'll leave this ticket open so I don't forget about the permanent fix.

Here is a slightly more detailed test case.

I've written a small C program that references two types of external symbol, one a variable, one a function. These should produce different fixups with the patched llvm.

I ran it through a standard clang (on my Mac) and then a version of llc with the patch from r307888 in it. Attached are the original C, the compiled llvm IR and final asm, showing that the back end works as expected and produces the correct fixups.

void (*callbackPtr)(short value);
short * myValuePtr;

void externalFunction(short value);
extern short externalConstant;

void bar(char pin, void (*callback)(short v), short * myValue);

void trampoline() {
  bar(3, callbackPtr, myValuePtr);
}

void loadCallbackPtr() {
  callbackPtr = &externalFunction;
}

void loadValuePtr() {
  myValuePtr = &externalConstant;
}

void foo() {
  loadCallbackPtr();
  loadValuePtr();
  trampoline();
}
; ModuleID = 'simple-via-ptr.c'
source_filename = "simple-via-ptr.c"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.12.0"

@callbackPtr = common global void (i16)* null, align 8
@myValuePtr = common global i16* null, align 8
@externalConstant = external global i16, align 2

; Function Attrs: nounwind ssp uwtable
define void @trampoline() #0 {
  %1 = load void (i16)*, void (i16)** @callbackPtr, align 8
  %2 = load i16*, i16** @myValuePtr, align 8
  call void @bar(i8 signext 3, void (i16)* %1, i16* %2)
  ret void
}

declare void @bar(i8 signext, void (i16)*, i16*) #1

; Function Attrs: nounwind ssp uwtable
define void @loadCallbackPtr() #0 {
  store void (i16)* @externalFunction, void (i16)** @callbackPtr, align 8
  ret void
}

declare void @externalFunction(i16 signext) #1

; Function Attrs: nounwind ssp uwtable
define void @loadValuePtr() #0 {
  store i16* @externalConstant, i16** @myValuePtr, align 8
  ret void
}

; Function Attrs: nounwind ssp uwtable
define void @foo() #0 {
  call void @loadCallbackPtr()
  call void @loadValuePtr()
  call void @trampoline()
  ret void
}

attributes #0 = { nounwind ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+fxsr,+mmx,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+fxsr,+mmx,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"Apple LLVM version 8.1.0 (clang-802.0.42)"}
    .text
    .macosx_version_min 10, 12
    .file   "simple-via-ptr.ll"
    .globl  trampoline
    .p2align    1
    .type   trampoline,@function
trampoline:                             ; @trampoline
; BB#0:
    lds r20, myValuePtr
    lds r21, myValuePtr+1
    lds r22, callbackPtr
    lds r23, callbackPtr+1
    ldi r24, 3
    call    bar
    ret
.Lfunc_end0:
    .size   trampoline, .Lfunc_end0-trampoline

    .globl  loadCallbackPtr
    .p2align    1
    .type   loadCallbackPtr,@function
loadCallbackPtr:                        ; @loadCallbackPtr
; BB#0:
    ldi r24, pm_lo8(externalFunction)
    ldi r25, pm_hi8(externalFunction)
    sts callbackPtr+1, r25
    sts callbackPtr, r24
    ret
.Lfunc_end1:
    .size   loadCallbackPtr, .Lfunc_end1-loadCallbackPtr

    .globl  loadValuePtr
    .p2align    1
    .type   loadValuePtr,@function
loadValuePtr:                           ; @loadValuePtr
; BB#0:
    ldi r24, lo8(externalConstant)
    ldi r25, hi8(externalConstant)
    sts myValuePtr+1, r25
    sts myValuePtr, r24
    ret
.Lfunc_end2:
    .size   loadValuePtr, .Lfunc_end2-loadValuePtr

    .globl  foo
    .p2align    1
    .type   foo,@function
foo:                                    ; @foo
; BB#0:
    call    loadCallbackPtr
    call    loadValuePtr
    call    trampoline
    ret
.Lfunc_end3:
    .size   foo, .Lfunc_end3-foo

    .type   callbackPtr,@object     ; @callbackPtr
    .comm   callbackPtr,2,8
    .type   myValuePtr,@object      ; @myValuePtr
    .comm   myValuePtr,2,8

I will create a reduced version to make a more comprehensive regression test for the patch.

Have you used LLVM bugpoint before? If not, let me know and I can write a quick paragraph explaining a way to automatically minimise the testcase

and I can write a quick paragraph

Or enhance the current one :-)

Once function pointers are correctly placed into program memory (address space 1), which will get fixed in #68, we can then write out the hack introduced in r307888.

FWIW, function pointers have been correctly placed in program memory for a few months now.

Yes! Amazing! Big change on llvm. :))

We can mark this as closed can't we?

avr-rust / rust-legacy-fork

Indirect call address is possibly being loaded incorrectly #65