float16 using 8 bytes rather than 2

andyjbm commented 2 months ago

Hi there,

So using esp8266 with framework v4.2.1 and arduino v3.1.2

I needed an array of 1000 floats and was running out of RAM, as 1000 values take up 4 KB. So I tried an array of float16, and that took up 8 KB! So 8 bytes per float16?

These were the results in a basic sketch with a global float16 array:

#include "float16.cpp"
#include "Arduino.h"

float16 test16[1000];
float test32[1000];

void setup()
{
  // set some values to make sure the compiler doesn't optimise out the arrays.
  test16[5] = 32;
  test32[4] = 32;
}

void loop()
{
}

I then inspected the project in PlatformIO and looked at the .bss section to see how much RAM is allocated to the two arrays.

I've narrowed this down to the Printable class that float16 inherits from. By removing the inheritance and the related printTo, setdecimal/getdecimal vars and methods, I am able to get my float16 array of 1000 into 2 KB, as I'd hoped.

As a result the library is working perfectly in every other respect. Thanks for sharing your work. ;-)

See here for my mods that achieve this: https://github.com/andyjbm/pwrGenie-V2.0/tree/main/pio/lib/float16

I'm raising this as an issue because I would think a float16 that takes up 8 bytes is a bit of a chocolate teapot when a float32 only takes up 4 bytes, no?

I haven't investigated why the Printable class is sucking up 6 bytes per instance, as I don't need it right now.

Regards,

Andy.

RobTillaart commented 2 months ago

(added code tags for syntax highlighting)

Thanks for this information and the link. I will add that to the readme.md file as it looks very useful.

At least part of the problem is in

private:
  uint8_t  _decimals = 4;
  uint16_t _value;

which causes every object to use 3 bytes.
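As a rough illustration of how the data members alone affect the object size, here is a minimal sketch using two hypothetical stand-in structs (these are not the library's class). On AVR, which does not pad for alignment, the second struct would be 3 bytes; on a 32-bit target such as the ESP8266 the compiler typically pads it to 4 bytes so that the uint16_t stays 2-byte aligned.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical stand-ins for the float16 data members (not the library's class).
struct ValueOnly
{
  uint16_t _value;         // the 16 bit payload only
};

struct ValueAndDecimals
{
  uint8_t  _decimals = 4;  // 1 byte of formatting state in every object
  uint16_t _value;         // the compiler may insert a padding byte before this
};

// Object sizes as the compiler sees them for the current target.
constexpr size_t kValueOnlySize        = sizeof(ValueOnly);
constexpr size_t kValueAndDecimalsSize = sizeof(ValueAndDecimals);
```

Printing the two constants (for example with Serial.println on the board) shows what the members cost per array element before any inheritance is involved.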

I will do some tests on AVR to see its memory use.

To be continued.

RobTillaart commented 2 months ago

Test on AVR


#include "Arduino.h"
#include "float16.h"

float16 test16[100];
float test32[100];

void setup()
{
  Serial.begin(115200);

  Serial.println(sizeof(test16) / sizeof(test16[0]));
  Serial.println(sizeof(test16));
  Serial.println(sizeof(test16[0]));
  Serial.println();

  Serial.println(sizeof(test32) / sizeof(test32[0]));
  Serial.println(sizeof(test32));
  Serial.println(sizeof(test32[0]));
  Serial.println();

  // set some values to make sure the compiler doesn't optimise out the arrays.
  test16[5] = 32;
  test32[4] = 32;
}

void loop()
{
}

Output shows the float16 is larger than the float32.

As the ESP8266 is a 32-bit board, it rounds memory up to 32-bit boundaries, resulting in 8 bytes.

RobTillaart commented 2 months ago

Output after removing the Printable interface from the class:

FLOAT16
100
200
2

FLOAT32
100
400
4

So your observation is confirmed.

> haven't investigated why the printable class is sucking up 6 bytes per instance as I don't need it right now..

For me it was useful during tests, so one option is to remove it from the class. That said, it still feels useful to be able to print it. I will investigate in the coming days.

RobTillaart commented 2 months ago

Another option is to explicitly convert it when printing.

  Serial.println(test16[5].toDouble(), 3);

That is what printTo() did under the hood, too:

size_t float16::printTo(Print& p) const
{
  double d = this->f16tof32(_value);
  return p.print(d, _decimals);
}

For ease of use one might add toFloat(), which works similarly to toDouble().

double float16::toDouble() const
{
  return f16tof32(_value);
}

float float16::toFloat() const
{
  return f16tof32(_value);
}

Serial.println(test16[5].toDouble(), 3);
Serial.println(test16[5].toFloat(), 3);

So removing the Printable interface still allows printing, albeit with a tiny bit more work. Let's think further...

RobTillaart commented 2 months ago

This is another interesting way to print a float16.

String float16::toString(uint8_t decimals) const
{
  return String(f16tof32(_value), decimals);
}

The test sketch also drops 2800 bytes in size when the Printable interface is left out.

Feels like a new 0.3.0 release is coming (a breaking change, adding toFloat() and toString(), plus fixes in the example code).

RobTillaart commented 2 months ago

Note for myself: fix float16ext library too.

=> issue created

RobTillaart commented 2 months ago

@andyjbm Develop branch stable, PR created.

andyjbm commented 2 months ago

Hi Rob, Quick work there! Some thoughts before I call it a night. In no particular order.

  uint8_t  _decimals = 4;
  uint16_t _value;

Yes, I saw that. First I tried removing the references to _decimals which, by your calculations, should make float16 fit into 4 bytes, yet I was still seeing 8 bytes per instance.

I tried wrapping the appropriate parts in #pragma pack, yet that made no difference either.

I can understand how the ESP8266 might push 5 bytes to an 8-byte boundary, but I didn't understand why I could not get float16 down to 4 bytes with the _decimals variable removed. I assumed there was more going on with the Printable class.

Interesting that float(32) only uses 4 bytes and is printable. How does float do that without an extra 2 bytes for each variable?

I had pondered whether there was a way of making printTo a static function of the class, so that Printable is not instantiated for each float16 instance. Maybe a print "wrapper" which could inherit a float16 array? It sounds like a clumsy solution, though, now that I've typed it.
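One way the print "wrapper" idea could look is a short-lived adapter that is only created at the moment of printing. The sketch below is plain C++ with hypothetical stand-ins: Half plays the role of a float16 without Printable, PrintableLike plays the role of Arduino's Printable, and a std::string replaces the Print target. None of these names come from the library.

```cpp
#include <stdint.h>
#include <stddef.h>
#include <string>

// Hypothetical stand-ins, NOT the library's or Arduino's types.
struct Half                       // plays the role of float16 without Printable
{
  uint16_t bits;                  // raw 16 bit pattern
};

struct PrintableLike              // plays the role of Arduino's Printable
{
  virtual size_t printTo(std::string& out) const = 0;
  virtual ~PrintableLike() {}
};

// Short-lived adapter: only this temporary carries the hidden vtable pointer,
// the stored array elements stay at 2 bytes each.
class HalfPrinter : public PrintableLike
{
public:
  explicit HalfPrinter(const Half& h) : _bits(h.bits) {}

  size_t printTo(std::string& out) const override
  {
    // For brevity this appends the raw bits in decimal; a real version would
    // convert to float first (e.g. with the library's toFloat()).
    std::string text = std::to_string(static_cast<unsigned>(_bits));
    out += text;
    return text.size();
  }

private:
  uint16_t _bits;                 // copy of the value being printed
};

Half samples[1000];               // 2000 bytes, no per-element overhead
```

A call such as HalfPrinter(samples[5]).printTo(buffer) pays for the virtual dispatch only while printing. Whether that is less clumsy than typing .toFloat() is a matter of taste.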

I have more in my head which may help but I need to give it some thought before I ramble further.

Regards,

Andy.

RobTillaart commented 2 months ago

> Interesting that float(32) only uses 4 bytes and is printable.

That is a built-in type with another level of support from compiler and libraries.

> I assumed there was more going on with the printable class.

The code of the Printable class is minimal, only a virtual function:

class Printable
{
  public:
    virtual size_t printTo(Print& p) const = 0;
};

(assumption ahead) It might be that implementing an interface causes an extra entry in the vtable to hold the printTo() function. (end assumption)
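In line with that assumption, a likely explanation is that compilers usually store a hidden pointer to the vtable in every object of a class that has virtual functions. On the ESP8266 a pointer is 4 bytes, so 4 + 1 + 1 (padding) + 2 would give the observed 8 bytes, and dropping _decimals alone would give 4 + 2 = 6, still rounded up to 8. On AVR, with 2-byte pointers and no padding, the same class would typically be 2 + 1 + 2 = 5 bytes. The sketch below uses hypothetical stand-in types with a simplified printTo() signature, not the library's classes, to make this visible with sizeof.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical stand-ins, not the library's classes.
struct Interface                  // like Printable: one pure virtual function
{
  virtual size_t printTo() const = 0;
};

struct PlainHalf                  // no virtual functions: just the payload
{
  uint16_t _value;
};

struct VirtualHalf : Interface    // every object also stores a vtable pointer
{
  uint8_t  _decimals = 4;
  uint16_t _value = 0;
  size_t printTo() const override { return 0; }
};

#pragma pack(push, 1)
struct PackedVirtualHalf : Interface   // packing can drop padding, not the pointer
{
  uint16_t _value = 0;
  size_t printTo() const override { return 0; }
};
#pragma pack(pop)
```

Comparing sizeof(PlainHalf), sizeof(VirtualHalf) and sizeof(PackedVirtualHalf) on a given board shows how much of the footprint comes from the virtual function rather than from the data members, and why #pragma pack cannot bring such a class down to 2 bytes.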

An option I have thought about is to put all private data in a struct but I did not test that as it would look strange to have all that code changed to use the struct.

...
private:
  struct
  {
    uint8_t  _decimals = 4; 
    uint16_t _value; 
  } zz;
}

So for now I think the removal of Printable seems to be the best solution (little loss of usability, great gain in footprint).

Today I will try to finish the 0.3.0 version and upgrade the float16ext class too.

RobTillaart commented 2 months ago

@andyjbm Released 0.3.0, working on float16ext is almost done.

Again, thanks for reporting this issue; imho the library has made an important step forward.

andyjbm commented 2 months ago

Hi Rob

> So for now I think the removal of printable seems to be the best solution (little loss of usability, great gain of footprint)

Yes I would agree completely.

The attraction of a float in 2 bytes for my huge array was the motivator to go searching for a solution, only to discover that half precision was already a thing and you had already built an implementation.

It would get real messy real quick if you were to try to make float16 array-aware. And the struct with _decimals would still push the byte count to three per element, unless you could implement a static _decimals across the class? It all sounds a bit unnecessary to me, and it quickly loses the advantage of float16 over float32.
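For reference, the static _decimals idea could look roughly like the sketch below. The class name and its members are hypothetical and only illustrate the storage layout: a static data member exists once for the whole class, so it adds nothing to the size of each object.

```cpp
#include <stdint.h>

// Hypothetical variant, not the library's class: the number of decimals is
// shared by all objects instead of being stored in each one.
class HalfWithSharedDecimals
{
public:
  static void    setDecimals(uint8_t d) { _decimals = d; }
  static uint8_t getDecimals()          { return _decimals; }

  uint16_t rawBits() const { return _value; }

private:
  static uint8_t _decimals;    // one copy for the whole class
  uint16_t       _value = 0;   // the only per-object storage
};

// A static data member needs one out-of-class definition (before C++17).
uint8_t HalfWithSharedDecimals::_decimals = 4;
```

The trade-off is that all values then print with the same number of decimals, and it only helps once the virtual function is gone as well.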

My application is converting decibels to linear numbers for Leq average calculations which requires a wide number range but not much precision. Three significant figures is more than enough. The sum for the average is held in a single float32 anyway.
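Assuming the usual energy-average definition, Leq = 10 * log10((1/N) * sum(10^(L_i/10))), the calculation described here can be sketched as follows. The function name is hypothetical, and plain float is used for the inputs so the example stays self-contained; in the application the linear values would be the ones stored as float16.

```cpp
#include <assert.h>
#include <math.h>
#include <stddef.h>

// Energy-equivalent average (Leq) of sound levels given in dB.
// Each level is converted to a linear value, the linear values are averaged,
// and the mean is converted back to dB.
float leqFromDecibels(const float* levelsDb, size_t count)
{
  assert(count > 0);                              // an empty average is undefined
  float sum = 0.0f;                               // running sum in a single float32
  for (size_t i = 0; i < count; i++)
  {
    sum += powf(10.0f, levelsDb[i] / 10.0f);      // dB -> linear
  }
  return 10.0f * log10f(sum / static_cast<float>(count));  // linear mean -> dB
}
```

Because the linear values span many orders of magnitude but only need a few significant digits, a small floating point format suits the stored samples, while the sum keeps float32 precision.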

So for me the power of float16 is in its simplicity & small RAM footprint while easily achieving over 4 sig fig of precision when converted back to the logarithmic domain.

Having to type ".toFloat()" if I want to print or convert to string is a trivial drawback in my opinion.

Thanks, Andy.

andyjbm commented 2 months ago

> @andyjbm Released 0.3.0, working on float16ext is almost done.
>
> Again thanks for reporting this issue, imho the library made an important step forward.

That's fantastic Rob,

Thanks for sharing your work.

Andy.

RobTillaart / float16

float16 using 8 bytes rather than 2 #12