espressif / arduino-esp32

Arduino core for the ESP32

The "word" data type seems to be a 32-bit quantity on ESP32 #1745

Closed Paraphraser closed 4 years ago

Paraphraser commented 6 years ago

Hardware:

Board: TTGO LoRa32-OLED V1
Board: NodeMCU-32S
Board: Heltec_WIFI_LoRa_32
Board: Node32s

Core Installation/update date: Clean install 1.0.0 via https://dl.espressif.com/dl/package_esp32_index.json
IDE name: Arduino IDE 1.8.5 (macOS Sierra)
Flash Frequency: 80MHz
Upload Speed: 921600

Description:

According to:

https://www.arduino.cc/reference/en/language/variables/data-types/word/

the "word" data type should always be a 16-bit quantity. This assertion appears to be true for all boards I have tested, except those with ESP32 chips where the "word" data type always seems to be a 32-bit quantity.

Running the attached test sketch on a selection of boards reveals the following for sizeof(word):

ESP32: 4
ESP8266: 2
Mega2560: 2
Uno: 2

Unless I have misunderstood some fundamental concept, I do not see how a "word" can ever be anything other than a 16-bit quantity so I do not believe the ESP32 core is behaving correctly.

Sketch:

#include <Arduino.h>

void show(const char * tag, int l) {
    Serial.print(tag); Serial.print("\t"); Serial.println(l);
}

void setup() {

    Serial.begin(115200); delay(200); Serial.println();

    show("              bool",sizeof(bool));
    show("           boolean",sizeof(boolean));
    show("              byte",sizeof(byte));
    show("              char",sizeof(char));
    show("     unsigned char",sizeof(unsigned char));
    show("           uint8_t",sizeof(uint8_t));

    show("             short",sizeof(short));
    show("          uint16_t",sizeof(uint16_t));
    show("              word",sizeof(word));

    show("               int",sizeof(int));
    show("      unsigned int",sizeof(unsigned int));
    show("            size_t",sizeof(size_t));

    show("             float",sizeof(float));
    show("              long",sizeof(long));
    show("     unsigned long",sizeof(unsigned long));
    show("          uint32_t",sizeof(uint32_t));

    show("            double",sizeof(double));

    show("         long long",sizeof(long long));
    show("unsigned long long",sizeof(unsigned long long));
    show("          uint64_t",sizeof(uint64_t));

}

void loop() {}

Debug Messages:

  1. Enable Core debug level: Debug.
  2. Upload sketch.
  3. Press reset button.

Note that enabling Debug has no discernible effect (the content of the Serial Monitor window is the same, irrespective of the Core Debug Level option):

ets Jun  8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
load:0x3fff001c,len:808
load:0x40078000,len:6084
load:0x40080000,len:6696
entry 0x400802e4

              bool  1
           boolean  1
              byte  1
              char  1
     unsigned char  1
           uint8_t  1
             short  2
          uint16_t  2
              word  4
               int  4
      unsigned int  4
            size_t  4
             float  4
              long  4
     unsigned long  4
          uint32_t  4
            double  8
         long long  8
unsigned long long  8
          uint64_t  8

DataTypes.pdf

Paraphraser commented 6 years ago

I believe this will turn out to be the nub of the problem:

~/Library/Arduino15/packages/arduino/hardware/avr/1.6.21/cores/arduino/Arduino.h

    typedef unsigned int word;

~/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.2/cores/esp8266/Arduino.h

    typedef uint16_t word;

~/Library/Arduino15/packages/esp32/hardware/esp32/1.0.0/cores/esp32/Arduino.h

    typedef unsigned int word;

On the Arduino, an "unsigned int" is a 16-bit quantity so a "word" is also a 16-bit quantity.

On the ESP8266 and ESP32, an "unsigned int" is a 32-bit quantity. The header file for the ESP8266 corrects for the change in the size of integers by equating "word" with "uint16_t" but no such adjustment has been made for the ESP32.
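
The consequence of the two typedef styles can be sketched in plain C++, without Arduino.h. This is only an illustration: the names `word_from_uint`, `word_from_u16` and `increment` are hypothetical and do not appear in any of the cores. Only the `uint16_t` behaviour is asserted, because the width of `unsigned int` depends on the platform the snippet is compiled for.

```cpp
#include <stdint.h>

// Hypothetical stand-ins for the two typedef styles quoted above.
typedef unsigned int word_from_uint;  // AVR and ESP32 style: width follows the platform's int
typedef uint16_t     word_from_u16;   // ESP8266 style: always 16 bits

// Add one and store the result back into the same type, so the value
// wraps at that type's width.
template <typename T>
constexpr T increment(T value) {
    return static_cast<T>(value + 1);
}

// A uint16_t-backed word is 2 bytes on every platform and wraps at 65535.
static_assert(sizeof(word_from_u16) == 2, "word_from_u16 must be 16 bits");
static_assert(increment<word_from_u16>(65535) == 0, "16-bit word wraps to 0");

// A 32-bit type keeps counting past 65535 instead of wrapping.
static_assert(increment<uint32_t>(65535) == 65536u, "32-bit value does not wrap here");
```

Code that relies on a "word" counter rolling over at 65535 would therefore behave differently on a core where "word" is an `unsigned int` of 32 bits.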

G6EJD commented 5 years ago

The register width of early CPUs was 8 bits, so a word was 8 bits. When the registers are 16 bits wide the word size is 16 bits, and so on, so an ESP32 has a 32-bit word. I think it's wrong to adjust the compiler data types to some standard (16-bit) when the word size is actually a function of the board/CPU type.

Paraphraser commented 5 years ago

In some ways I agree with you. I wrote my first line of code in 1975 on a Control Data 6400 mainframe where a "byte" was six bits, a "word" 60 bits and an "address" 18 bits. But, even there, that only applied to CPU programming while the peripheral processors manipulated 12-bit words in a 12-bit address space. A few years later when I started playing with Z80s, it was an 8-bit byte and a 16-bit word with the architecture supporting both 8- and 16-bit math operations. Next cab off the rank for me was an 8080 where "endianness" suddenly became a worry...

So, yes, for anyone who has experience with a lot of different architectures, the number of bits in a byte, word or other construct not otherwise qualified with an adjective (eg "64-bit word") quite naturally depends on the physical characteristics of the machine you are playing with at the moment. As a grizzled-geek-from-way-back I have an awful lot of sympathy with this "first, understand the hardware you're dealing with" approach because it's utterly familiar.

Where this idea of letting the hardware du jour be your guide comes unstuck is when you are faced with a definition like the one at this URL:

https://www.arduino.cc/reference/en/language/variables/data-types/word/

To me, that definition is saying «in the Arduino world, a "word" is always a 16-bit quantity, no ifs, ands or buts.» Then you also see definitions that rely on that assumption like:

https://www.arduino.cc/reference/en/language/functions/bits-and-bytes/highbyte/

which talks about "the high-order (leftmost) byte of a word (or the second lowest byte of a larger data type)". When I read things like that I receive the take-home message that the concept of a "word" as a 16-bit quantity is not machine-specific in the Arduino world but is intentionally defined at a higher level of abstraction.
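
The wording quoted from the highByte() reference can be illustrated with a small sketch. It assumes the Arduino macros behave like `(uint8_t)((w) & 0xff)` and `(uint8_t)((w) >> 8)`; the functions `low_byte` and `high_byte` below are hypothetical equivalents, written so the snippet compiles without Arduino.h.

```cpp
#include <stdint.h>

// Hypothetical function equivalents of the Arduino lowByte()/highByte()
// macros (assumed definitions, see the paragraph above).
constexpr uint8_t low_byte(uint32_t w)  { return static_cast<uint8_t>(w & 0xFF); }
constexpr uint8_t high_byte(uint32_t w) { return static_cast<uint8_t>(w >> 8); }

// For a 16-bit value, high_byte() is the leftmost byte.
static_assert(high_byte(uint16_t{0x1234}) == 0x12, "leftmost byte of a 16-bit word");
static_assert(low_byte(uint16_t{0x1234}) == 0x34, "lowest byte of a 16-bit word");

// For a 32-bit value it is the second-lowest byte, not the leftmost one.
static_assert(high_byte(0x12345678u) == 0x56, "second-lowest byte of a 32-bit value");
```

The result only matches "the high-order (leftmost) byte" when the argument really is a 16-bit quantity, which is why the size of "word" matters to these helpers.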

Quite frankly I'm easy either way. We can adopt the Arduino definition, everywhere, or we can be machine-specific, everywhere. But what we have at the moment is neither fish nor fowl. We have the "arduino" and "esp32" following the hardware (resulting in 16- and 32-bit words, respectively) and "esp8266" (both 2.4.2 and 2.5.0-beta2) following the documentation (or so I presume) and tying "word" to "uint16_t".

On balance, given that the worldwide community of Makers comprises everything from people who have no intention of developing their programming skills beyond the needs of the moment, through to people with lifetimes of programming experience to draw upon, I think I prefer an approach that fails safe, by which I mean does not "surprise" any Maker at the less-experienced end of the spectrum who happens to use a variety of boards. To me, a doctrine of "no surprises" means that, for as long as the Arduino reference says "a word is a 16-bit quantity", then everything that includes Arduino in its ancestry should stick to that.

Longer term, I prefer Apple's approach of handling conundrums like this with deprecations and compiler warnings. If, every time a variable or parameter was declared as "word", the compiler spat out something along the lines of, "the 'word' type is unsafe - you really should be using 'uint16_t' if that is what you mean", then everyone would get the hint, irrespective of his/her level of experience.
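
As a rough sketch of what such a warning could look like, the standard C++14 `[[deprecated]]` attribute can be attached to a type alias. This is not how any Arduino core defines "word"; the names `legacy_word` and `word16` are hypothetical, and the snippet assumes a compiler that accepts C++14 attributes.

```cpp
#include <stdint.h>

// Hypothetical alias: any use of `legacy_word` draws a compiler warning
// carrying the message below (C++14 [[deprecated]] attribute).
using legacy_word [[deprecated("'word' is ambiguous; use uint16_t if you mean 16 bits")]] = unsigned int;

// The explicit-width alternative that the warning points to.
using word16 = uint16_t;

static_assert(sizeof(word16) == 2, "word16 is always 16 bits");

// Uncommenting the next line still compiles, but emits the deprecation warning:
// legacy_word counter = 0;
```

The existing code keeps building, and each declaration that uses the ambiguous type gets a hint pointing at the explicit-width alternative.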

G6EJD commented 5 years ago

We clearly have similar backgrounds and understanding of the correct definition of ‘word’ in this context.

It's clear that the Arduino world, probably for reasons of code portability across platforms, has chosen 16 bits as a word. That is acceptable, but they should state it clearly now that so many different CPUs are supported, especially if their ethos is about learning and their approach runs counter to what users will be taught academically.

The Apple approach to this issue would be the best solution.

Thank you for your comprehensive response.

nicechocolate commented 5 years ago

I recall the definition of a word as being the processor register size, so for an ESP32, word is 32 bits.
Some years ago, I graduated from 8-bit processors to the DEC PDP-11 16-bit processors, so a word WAS always 16 bits. However, I recall that the previous generation, the PDP-8 was a 12-bit processor, but we still referred to a word as 16 bits. Interestingly, it used 8-bit RAM, even for the 12-bit instruction codes, so you had to remember that the next byte should be thought of as two nibbles and that the nibble that wasn't instruction wasn't usable for data, either. As Paraphraser says, if in doubt be explicit. Thank you for tolerating my memoirs.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

lbernstone commented 5 years ago

Why did you hijack this thread? Your issue has no relation to the "word" type.

kbickham commented 5 years ago

> Why did you hijack this thread? Your issue has no relation to the "word" type.

I'll delete my posts; this was not my intent. When porting software from an 8-bit processor to a 32-bit processor, it's not irrelevant. I was wrong. I thought there was a problem in the data types being passed.

stale[bot] commented 4 years ago

[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

[STALE_DEL] This stale issue has been automatically closed. Thank you for your contributions.