Open ewwaller opened 3 years ago
I can confirm the issue. I think the issue is that maskSet, maskClear and i are all the constant 8 and the optimizer recognizes this and puts them all in the same register. That would be fine if the values were indeed treated as constants. But while maskSet and maskClear are constant (and merging them is a good idea), i is not. This leads to the miscompilation.
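The effect of that merge can be pictured with a toy model in plain Go. This is only a sketch under stated assumptions, not the real code generation: the function name simulate and the idea of tracking one mask value per pulse are hypothetical, and the only facts taken from the report are that the mask and the loop counter both start at 8 and that the counter is decremented once per bit.

```go
package main

import "fmt"

// simulate models the bit loop of one byte write on a toy machine.
// When shared is true, the loop counter and the pin mask live in the same
// "register", mimicking the merge described above. It returns the mask
// value that would be written to the port for each of the 8 pulses.
func simulate(shared bool) []uint32 {
	const initial = 8 // maskSet, maskClear and i all start out as 8

	mask := uint32(initial)    // register holding the pin mask
	counter := uint32(initial) // register holding the loop counter
	var used []uint32

	for counter != 0 {
		used = append(used, mask) // value written to the port register
		counter--
		if shared {
			// One physical register: decrementing the counter also
			// changes what the code believes is the mask.
			mask = counter
		}
	}
	return used
}

func main() {
	fmt.Println("separate registers:", simulate(false))
	fmt.Println("shared register:   ", simulate(true))
}
```

In the shared case only the first pulse is written with the value 8; every later value is smaller than 8 and so no longer has bit 3 set. That is at least consistent with a waveform that shows one pulse per byte and then stays low, although the registers actually chosen by the compiler may differ from this model.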
Right now the inline assembly format does not support marking certain input values as clobbered. I'm not sure whether I really want to add such support; AsmFull is already pretty complicated. Instead, it might be better to move these snippets of inline assembly to C (usable through CGo) so that all the features of GCC-style inline assembly can be used (including marking some inputs as clobbered).
CGo might sound expensive, but in TinyGo it's just a regular function call, and I plan to optimize it further so that it is optimized together with Go code (and can be inlined, etc.; see #1707 for some related work).
Also, I just want to say thank you for this super helpful bug report. It's a very clear description of the problem, with a reduced code sample even. It makes finding the bug a whole lot easier :)
Thanks, and my pleasure. I am new to Go, but have spent my professional career with microcontrollers and embedded systems, and I am increasingly excited about TinyGo.
Edit: Never mind. I split the function out into a *.c file, created a .h file, and included the .h instead of defining the function inline. That linked, and is cleaner.
Original post:
I have been playing with this, and have hit a roadblock with CGo and could use some help getting back on the tracks.
Here is my code for the ws2812 driver:
// +build atsamd51

package ws2812

// This file implements the WS2812 protocol for 120MHz Cortex-M4
// microcontrollers.
// Note: This implementation does not work with tinygo 0.9.0 or older.

import (
	"device/arm"
)
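/*
#include <stdint.h>

void writebyte(unsigned char c,
               uint32_t maskSet,
               uint32_t maskClear,
               uint32_t *portSet,
               uint32_t *portClear) {
	// volatile keeps the compiler from deleting the empty delay loops.
	volatile int cnt;
	int i;
	// Walk the bits from most to least significant. i is a signed int,
	// so the loop has to end when it drops below zero; comparing it
	// against 0xff would never terminate.
	for (i = 7; i >= 0; i--) {
		if ((1 << i) & c) {
			cnt = 4;
			while (cnt--);
			*portClear = maskClear;
			cnt = 8;
			while (cnt--);
		} else {
			cnt = 8;
			while (cnt--);
			*portClear = maskClear;
			cnt = 4;
			while (cnt--);
		}
		*portSet = maskSet;
		*portClear = maskClear;
	}
}
*/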
import "C"
// Send a single byte using the WS2812 protocol.
func (d Device) WriteByte(c byte) error {
	// For the Cortex-M4 at 120MHz
	portSet, maskSet := d.Pin.PortMaskSet()
	portClear, maskClear := d.Pin.PortMaskClear()

	// See:
	// https://wp.josh.com/2014/05/13/ws2812-neopixels-are-not-so-finicky-once-you-get-to-know-them/
	// T0H: 32-34 cycles or 266.67ns - 283.33ns
	// T0L: 101-103 cycles or 841.67ns - 858.33ns
	// +: 133-137 cycles or 1108.33ns - 1141.67ns
	// T1H: 73-75 cycles or 608.33ns - 625.00ns
	// T1L: 58-60 cycles or 483.33ns - 500.00ns
	// +: 131-135 cycles or 1091.67ns - 1125.00ns
	mask := arm.DisableInterrupts()
	C.writebyte(
		C.uchar(c),
		C.uint32_t(maskSet),
		C.uint32_t(maskClear),
		portSet,
		portClear)
	arm.EnableInterrupts(mask)
	return nil
}
But I get this when I build it:
ld.lld: error: undefined symbol: writebyte
>>> referenced by ws2812_m4_120m.go:58 (/home/ewaller/devel/go/src/ewaller.local/featherBlink/drivers/ws2812/ws2812_m4_120m.go:58)
>>> /tmp/tinygo887064602/main.o:((featherBlink/drivers/ws2812.Device).WriteByte)
error: failed to link /tmp/tinygo887064602/main: exit status 1
ewaller@odin/home/ewaller/devel/go/src/ewaller.local/featherBlink[1] %
It compiles, but dies at the linker. Not sure where to go from here.
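As a side note on the timing comments in the driver above, the nanosecond figures follow directly from the cycle counts at a 120 MHz clock. The short sketch below reproduces that arithmetic; the names cpuHz and cyclesToNs are made up for illustration and are not part of the driver.

```go
package main

import "fmt"

// cpuHz is the 120 MHz core clock stated in the driver comments.
const cpuHz = 120000000

// cyclesToNs converts a CPU cycle count to nanoseconds at cpuHz.
func cyclesToNs(cycles int) float64 {
	return float64(cycles) * 1e9 / cpuHz
}

func main() {
	// Cycle counts taken from the T0H and T1H lines of the comments.
	for _, c := range []int{32, 34, 73, 75} {
		fmt.Printf("%3d cycles = %.2f ns\n", c, cyclesToNs(c))
	}
}
```

One cycle at 120 MHz is about 8.33 ns, so 32 cycles come to roughly 266.67 ns and 75 cycles to 625.00 ns, matching the T0H and T1H bounds quoted in the comments.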
I think this is a bug in the way inline assembly is processed. I have some code that works properly with the -opt 1 flag, but does not work with the default optimization level. The following minimized example includes code from the ws2812 driver. Running on an Adafruit Feather M4, this code displays increasing brightness of red, then green, then blue on the smart LED on that board (AKA Neopixel).
This is done by transmitting three bytes, each encoded as 8 pulses (for a total of 24 pulses), to the device. A narrow pulse codes a 0, and a wide pulse codes a 1.
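To make the encoding concrete, here is a small host-side sketch in Go that turns a byte into its sequence of narrow and wide pulses. The function name pulsesFor and the 'N'/'W' notation are hypothetical, the example byte values are arbitrary, and the most-significant-bit-first order is an assumption based on how the driver code in this thread walks the bits.

```go
package main

import "fmt"

// pulsesFor returns the 8 pulses for one byte, most significant bit first.
// 'W' marks a wide pulse (bit is 1) and 'N' a narrow pulse (bit is 0).
func pulsesFor(b byte) string {
	out := make([]byte, 0, 8)
	for i := 7; i >= 0; i-- {
		if b&(1<<uint(i)) != 0 {
			out = append(out, 'W')
		} else {
			out = append(out, 'N')
		}
	}
	return string(out)
}

func main() {
	// Three example bytes, 24 pulses in total. The values are arbitrary.
	for _, b := range []byte{0x00, 0xA5, 0xFF} {
		fmt.Printf("0x%02X -> %s\n", b, pulsesFor(b))
	}
}
```

A byte of 0xA5 (binary 10100101) becomes WNWNNWNW under these assumptions, so a correct waveform for a non-zero byte should show a mix of wide and narrow pulses rather than a single pulse.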
When it works (compiled with -opt 1), the (correct) output waveform looks like this:
But, with no optimization directive, it looks like this:
In this case, there are initial pulses for each of the three bytes, but the output never goes high for subsequent bits. The time between bytes is the same as in the working example.
The TinyGo code is:
The inline code uses a register to count off the 8 pulses; that register is preset to 8 by the wrapping code. The wrapping code also figures out the address of the register to manipulate to control the output and the data that need to be written. All of those are passed into the inline code using the map clause. The wrapping code calls machine functions to get the port information.
Looking at the disassembled code of the working code (-opt 1) for WriteByte we get:
Whereas the optimized, non-working code gives:
Note that in this case, the optimizer figured out it did not need to call machine, as the information was available at compile time, so it used constants. The problem appears to be that, in this second case, it is attempting to use the ip register both for the countdown from 8 to 0 [see 0x5daa - 0x5dae] and as the pointer to the I/O register [see 0x5cf8 and 0x5d4a].