arduino / ArduinoCore-avr

The Official Arduino AVR core
https://www.arduino.cc
1.22k stars 1.04k forks source link

64K problems: pins_arduino.h _PGM arrays pushed out of range of pgm_read_byte() #174

Open WestfW opened 10 years ago

WestfW commented 10 years ago

A simple program (say, Blink) that includes more than 64k of pgmspace data (for MEGA, MEGA2560, MEGA ADK, various 1284 boards) will fail to work correctly. One of the reasons is that Arduino.h includes macro definitions like:

define digitalPinToPort(P) ( pgm_read_byte( digital_pin_to_port_PGM + (P) ) )

Due to link order, the digital_pin_to_port_PGM[] array will be pushed AFTER the explicitly defined pgmspace variables, and it will no longer be readable by pgm_read_byte.

Using pgm_read_byte_far() seems like overkill. It turns out that apparently gcc has a similar problem with pgmspace variables that IT uses, because the default linker map includes two entries for progmem data:

    /* For data that needs to reside in the lower 64k of progmem.  */
    *(.progmem.gcc*)
    *(.progmem*)

This means that MEGA and etc can be fixed by a relative simple patch to their pins_arduino.h file, putting the pin tables in section .progmem.gcc.arduinocore instead of the normal .progmem (as per the attached diff file)

WestfW commented 10 years ago

(Um. Can I not attach source and diff files to an issue?)

matthijskooijman commented 10 years ago

Probably not - I guess you could open a pullrequest, or include the diff in a comment (indent it by for spacs or put ``` before and after it to make it display as code).

As for the fix you suggest, I'm wondering why it works. AFAIU, any new sections introduced have to be manually listed in the linker script? Though perhaps due to -fdata-sections each variable already gets its own section and they are auto-joined? If so, I believe that this fix works because "arduinocore" sorts early?

WestfW commented 10 years ago

The linker script already has *(progmem.gcc*) in the appropriate place. The section names from -fdata-sections take the form of ".mainsection.variablename", so almost all of the linker script sections are wildcarded. (Hmm. I wonder why the wildcards didn't come out in my first post. I'm pretty sure they were in the COPY...)

WestfW commented 10 years ago

Grr. Removed them again. cursed markdown language! (edited to fix?)

matthijskooijman commented 10 years ago

It seems that github/markdown is interpreting your asterisks as meaning italic text. You can probably escape them with *, or quite with `.

Thanks for clarifying, I can see why your fix works. I'm wondering if "arduinocore" is the best approach though - it sorts low, but if a user declares a big "aa" PROGMEM variable, things still break. I'm not sure what the rules for section names are, but if they're the same as variable names (start with a latter, contain letters, numbers and underscores IIRC), something like .progmem.gcc.A00_arduinocore might be good?

WestfW commented 10 years ago
--- pins_arduino.h  2014/08/04 06:42:23 1.1
+++ pins_arduino.h  2014/08/04 06:43:51
@@ -84,8 +84,9 @@
                                 0 ) ) ) ) ) )

 #ifdef ARDUINO_MAIN
+#define MYPROGMEM  __attribute__((section(".progmem.gcc.arduinocore")))

-const uint16_t PROGMEM port_to_mode_PGM[] = {
+const uint16_t MYPROGMEM port_to_mode_PGM[] = {
    NOT_A_PORT,
    (uint16_t) &DDRA,
    (uint16_t) &DDRB,
@@ -101,7 +102,7 @@
    (uint16_t) &DDRL,
 };

-const uint16_t PROGMEM port_to_output_PGM[] = {
+const uint16_t MYPROGMEM port_to_output_PGM[] = {
    NOT_A_PORT,
    (uint16_t) &PORTA,
    (uint16_t) &PORTB,
@@ -117,7 +118,7 @@
    (uint16_t) &PORTL,
 };

-const uint16_t PROGMEM port_to_input_PGM[] = {
+const uint16_t MYPROGMEM port_to_input_PGM[] = {
    NOT_A_PIN,
    (uint16_t) &PINA,
    (uint16_t) &PINB,
@@ -133,7 +134,7 @@
    (uint16_t) &PINL,
 };

-const uint8_t PROGMEM digital_pin_to_port_PGM[] = {
+const uint8_t MYPROGMEM digital_pin_to_port_PGM[] = {
    // PORTLIST     
    // -------------------------------------------      
    PE  , // PE 0 ** 0 ** USART0_RX 
@@ -208,7 +209,7 @@
    PK  , // PK 7 ** 69 ** A15  
 };

-const uint8_t PROGMEM digital_pin_to_bit_mask_PGM[] = {
+const uint8_t MYPROGMEM digital_pin_to_bit_mask_PGM[] = {
    // PIN IN PORT      
    // -------------------------------------------      
    _BV( 0 )    , // PE 0 ** 0 ** USART0_RX 
@@ -283,7 +284,7 @@
    _BV( 7 )    , // PK 7 ** 69 ** A15  
 };

-const uint8_t PROGMEM digital_pin_to_timer_PGM[] = {
+const uint8_t MYPROGMEM digital_pin_to_timer_PGM[] = {
    // TIMERS       
    // -------------------------------------------      
    NOT_ON_TIMER    , // PE 0 ** 0 ** USART0_RX 
WestfW commented 10 years ago

I'm pretty sure that the progmem.gcc.* will always sort before the progmem.variablename values created by normal user pgmspace statements. Otherwise it wouldn't work for gcc internals, either.

matthijskooijman commented 10 years ago

Oh, right. I was assuming that variables would get progmem.gcc.variablename, which is apparently wrong. I'm wondering if it's ok to 'invade' the progmem.gcc namespace like this, but I guess it will work and as long as we don't push > 64k of data in it, it should be ok :-)

Not sure I like the "MYPROGMEM" name though. Wouldn't GCCPROGMEM be better? In either case, there should be a fat comment explaining why this is needed and why this works :-)

oqibidipo commented 7 years ago

You must also modify these lines in Arduino.h to make it work

extern const uint16_t PROGMEM port_to_mode_PGM[];
extern const uint16_t PROGMEM port_to_input_PGM[];
extern const uint16_t PROGMEM port_to_output_PGM[];

extern const uint8_t PROGMEM digital_pin_to_port_PGM[];
// extern const uint8_t PROGMEM digital_pin_to_bit_PGM[];
extern const uint8_t PROGMEM digital_pin_to_bit_mask_PGM[];
extern const uint8_t PROGMEM digital_pin_to_timer_PGM[];
geez0x1 commented 7 years ago

Today we spent most of the day debugging why our Arduino immediately crashed when increasing data use in PROGMEM, even when that data was not being used. Spent ages trying to find problems with 32-bit pointers, etc. What made matters worse is that at some point disabling optimisations (-O0 instead of -Os) made it partially work, which we tried because the documentation for pgm_get_far_address suggests:

'var' has to be resolved at linking time as an existing symbol, i.e, a simple type variable name, an array name (not an indexed element of the array, if the index is a constant the compiler does not complain but fails to get the address if optimization is enabled), a struct name or a struct field name, a function identifier, a linker defined identifier,...

http://www.nongnu.org/avr-libc/user-manual/group__avr__pgmspace.html#ga8f4a87d8740f570871d7e041750efed3

Turns out the problem was the one mentioned in this thread. Disabling optimisations just resulted in different code which somehow made something work at some point. Very confusing.

The patches by WestfW (needs to be done manually though) and oqibidipo appear to completely solve the problem.

As of this moment these patches are not implemented in the Arduino mainline code. My Debian packages may be slightly outdated(?), but the other machine is running the (most) recent packages with AVR 1.6.18 etc.

Is there any particular reason these patches have not been mainlined? Should we file a pull request?

information-security commented 7 years ago

Hi,

Any update on this? I tested the pull request. It somehow fixed the issue but some strange behaviours came up instead. I am not sure if it was caused by the change or it is from my code!

geez0x1 commented 7 years ago

Can you be more specific as to what issues you found?

information-security commented 7 years ago

I have two large arrays as well as a medium array: extern imagedatatype gImage_img_header[]; extern imagedatatype gImage_img_sc1[]; extern imagedatatype gImage_img_sc2[]; As their names imply, these arrays contain image data. header is shown at top of TFT display, sc1 is shown below the header and sc2 is shown at bottom of the screen. When I reduce arrays' size (By reducing image quality) everything is shown as expected but when I switch to using large arrays considering the fact that I have applied the patch, sc1 and sc2 are fine but header is not shown and a cropped part of sc2 or sc1 (I can't recall which) is shown instead. Note that without the patch sketch doesn't work at all. Here is my sketch (excluded image data because they were too large):

#include <UTFT.h>
extern uint8_t SmallFont[];
UTFT myGLCD(HX8357B,38,39,40,41);
#define imagedatatype  unsigned int
extern imagedatatype gImage_img_header[]; // size: 14400 bytes
extern imagedatatype gImage_img_sc1[]; // size: 31200 bytes
extern imagedatatype gImage_img_sc2[]; // size: 31200 bytes

void setup() {
  myGLCD.InitLCD();
  myGLCD.setFont(SmallFont);
  myGLCD.clrScr();
}

void loop() {
  myGLCD.fillScr(255, 255, 255);
  myGLCD.setColor(255, 255, 255);
  myGLCD.drawBitmap(0, 0, 240, 30, gImage_img_header, 2);
  myGLCD.drawBitmap(0, 60, 240, 65, gImage_img_sc1, 2);
  myGLCD.drawBitmap(0, 190, 240, 65, gImage_img_sc2, 2);
  delay(5000);
}
geez0x1 commented 7 years ago

In your code I don't see any mention of PROGMEM, i.e. using the PGM memory. I'm not sure what's causing the sketch to work once you apply the patch, but it appears to me you're simply filling up the RAM with the larger data, causing the sketch to crash. The IDE should show you a message how much of flash and RAM (global variables) you are using.

information-security commented 7 years ago

I don't think Mega2560 or any other arduino has that much ram available (31200 + 31200 + 14400). gImage_img_header, gImage_img_cs1 and gImage_img_cs2 are defined extern because they are placed into another .c file and I mentioned that I am excluding them because they are just large arrays. To find out how they are defined have a look below:

img_header.c

#if defined(__AVR__)
    #include <avr/pgmspace.h>
#elif defined(__PIC32MX__)
    #define PROGMEM
#elif defined(__arm__)
    #define PROGMEM
#endif

const unsigned char gImage_img_header[14400] PROGMEM = { 0X00,0X10,0XF0
 .............
};
geez0x1 commented 7 years ago

Ah, fair point. Can I ask which board you are using? Maybe I missed one (intended to patch only the files for boards that have >64K PGM available).

Another possibility is that there's other things that need to be patched as well, since more things are using PROGMEM:

./hardware/arduino/avr/cores/arduino/Tone.cpp:96:const uint8_t PROGMEM tone_pin_to_timer_PGM[] = { 2 /*, 3, 4, 5, 1, 0 */ };
./hardware/arduino/avr/cores/arduino/Tone.cpp:104:const uint8_t PROGMEM tone_pin_to_timer_PGM[] = { 2 /*, 1 */ };
./hardware/arduino/avr/cores/arduino/Tone.cpp:112:const uint8_t PROGMEM tone_pin_to_timer_PGM[] = { 3 /*, 1 */ };
./hardware/arduino/avr/cores/arduino/Tone.cpp:121:const uint8_t PROGMEM tone_pin_to_timer_PGM[] = { 2 /*, 1, 0 */ };
./hardware/arduino/avr/cores/arduino/CDC.cpp:41:extern const CDCDescriptor _cdcInterface PROGMEM;
./hardware/arduino/avr/cores/arduino/USBCore.cpp:34:extern const u16 STRING_LANGUAGE[] PROGMEM;
./hardware/arduino/avr/cores/arduino/USBCore.cpp:35:extern const u8 STRING_PRODUCT[] PROGMEM;
./hardware/arduino/avr/cores/arduino/USBCore.cpp:36:extern const u8 STRING_MANUFACTURER[] PROGMEM;
./hardware/arduino/avr/cores/arduino/USBCore.cpp:37:extern const DeviceDescriptor USB_DeviceDescriptorIAD PROGMEM;
./hardware/arduino/avr/cores/arduino/USBCore.cpp:50:const u8 STRING_PRODUCT[] PROGMEM = USB_PRODUCT;
./hardware/arduino/avr/cores/arduino/USBCore.cpp:67:const u8 STRING_MANUFACTURER[] PROGMEM = USB_MANUFACTURER;
./hardware/arduino/avr/cores/arduino/USBCore.cpp:427:// Send a USB descriptor string. The string is stored in PROGMEM as a
./libraries/TFT/src/utility/Adafruit_ST7735.cpp:121:// stored in PROGMEM.  The table may look bulky, but that's mostly the
./libraries/TFT/src/utility/Adafruit_ST7735.cpp:125:PROGMEM const static unsigned char
./libraries/TFT/src/utility/Adafruit_ST7735.cpp:317:// a series of LCD commands stored in PROGMEM byte array.
./libraries/TFT/src/utility/glcdfont.c:11:static const unsigned char  font[] PROGMEM = {
./libraries/GSM/src/GSM3ShieldV1MultiClientProvider.cpp:37:const char _command_MultiQISRVC[] PROGMEM = "AT+QISRVC=";
./libraries/GSM/src/GSM3ShieldV1ModemCore.h:185:        /** Generates a generic AT command request from PROGMEM buffer
./libraries/GSM/src/GSM3ShieldV1BaseProvider.h:53:  /** This function locates strings from PROGMEM in the buffer
./libraries/GSM/src/GSM3SoftSerial.cpp:63:static const DELAY_TABLE PROGMEM table[] = 
./libraries/GSM/src/GSM3SoftSerial.cpp:84:static const DELAY_TABLE table[] PROGMEM = 
./libraries/GSM/src/GSM3SoftSerial.cpp:108:static const DELAY_TABLE PROGMEM table[] =
./libraries/GSM/src/GSM3ShieldV1MultiServerProvider.cpp:40:const char _command_QILOCIP[] PROGMEM = "AT+QILOCIP";
./libraries/GSM/src/GSM3ShieldV1AccessProvider.cpp:42:const char _command_AT[] PROGMEM = "AT";
./libraries/GSM/src/GSM3ShieldV1AccessProvider.cpp:43:const char _command_CGREG[] PROGMEM = "AT+CGREG?";
./libraries/GSM/src/GSM3ShieldV1DataNetworkProvider.cpp:37:const char _command_CGATT[] PROGMEM = "AT+CGATT=";
./libraries/GSM/src/GSM3ShieldV1DataNetworkProvider.cpp:38:const char _command_SEPARATOR[] PROGMEM = "\",\"";

I'm afraid I'm not knowledgeable enough on this, though.

information-security commented 7 years ago

I am using Mega2560 R3.

geez0x1 commented 7 years ago

The patch I submitted has patches for mega: ./hardware/arduino/avr/variants/mega/. So in that case I'm afraid I don't know what's causing your issues. Maybe @WestfW has ideas.

themobydisk commented 5 years ago

As an alternative, is there a way I can label my arrays so that they will go after the PROGMEM arrays? Like __attribute__((section(".progmem.gcc.zzzzzzz")))? If I wasn't using Arduino Studio, couldn't I could put something on the linker command-line to state that my zzzzzzz section must start at 0x10000? FYI: Apparently the Ethernet library suffers from this problem too.

WestfW commented 3 years ago

I don't think that "polluting" the gcc space is a significant problem, as long as the amount of PROGMEM used by Arduino is small, and I don't even know offhand what parts of gcc or avr-libc use PROGMEM (but they're also obviously small.)

Since Arduino has "matured" to the point of having custom linker scripts (eg for SAMD), a "correct" solution might involve using a linker script that defines arduino's own section of flash, in between the progmem.gcc and the progmem* segments.