graphitemaster / incbin

Include binary files in C/C++
The Unlicense
938 stars 87 forks source link

Bunch of (I hope) improvements #53

Closed AlexIII closed 2 years ago

AlexIII commented 2 years ago

Hi!

I've made an Adurino library based on your project.

It required a few changes to be made to incbin.h which I think improve versatility and may be of use. All change are non-breaking and backward-compatible and I tried to follow the original code style closely.

Explanation of every commit

bbc30009 - Add an option to specify different sections for data and its size

Arduino is based on AVR MCU, which has true harvard architecture. Its program memory cannot be directly read from the program, it requires special instructions. Due to this limitation it is very inconvenient to store the size of the included binary data in ROM. The size variable should be stored in RAM, hence it should be in different section. The commit adds an option to specify section for data size variable by defining INCBIN_SIZE_OUTPUT_SECTION macro. If undefined, it defaults to the value of INCBIN_OUTPUT_SECTION.

2e4c3222 - Add terminating NULL at the end of the data (optionallny)

When including text files, it is convenient to treat them as a c-string. The problem is that c-string should be NULL-terminated. This commit adds an option to add NULL byte at the end of the included data by defining INCBIN_APPEND_TERMINATING_NULL. Disabled by default.

Please check if I did ok there. I have a feeling this change may mess-up the alignment or something. I admit I don't fully understand the code here. For example, I don't understand why you append .byte 1 at the end of the data.

566be487 - Add an option to specify typename for data

Adds an option to specify the resulting pointer type of the included data. Just a convenience feature that removes the need of the cast every time you use the included data. Adds two new macros INCBIN_PTR_TYPE(NAME, FILENAME, DATA_PTR_TYPE) and INCBIN_EXTERN_PTR_TYPE(NAME, DATA_PTR_TYPE) with one additional argument alongside the existing.

graphitemaster commented 2 years ago

Hey Alexill. This looks really good. I have some suggestions before merging the PR.

graphitemaster commented 2 years ago

I've gone ahead and implemented all three of these in commit

https://github.com/graphitemaster/incbin/commit/c6a9227d57512339ab1b68d2e8215f7ed56b7a9e

You may now use INCTXT to include text, and INCBIN / INCBIN_EXTERN with an optional first argument type to change type. INCTXT always defaults to char (as it should) and adds an implicit NUL byte. The default is still unsigned char for binary.

You can override DATA and SIZE sections independently of one another with INCBIN_OUTPUT_DATA_SECTION and INCBIN_OUTPUT_SIZE_SECTION, or together at once with INCBIN_OUTPUT_SECTION like discussed in my above comment.

Thanks for the suggestions. I'm sorry I didn't use your PR directly. I think you'll find this is a lot cleaner and has a lot more options :)

AlexIII commented 2 years ago

@graphitemaster I generally agree with what you've said, great work!

A few comments, if I may

On typing

INCTXT always defaults to char (as it should)

Ah, you see, it's on "normal" architectures (with common address space) a string is expected to be const char*. But on AVR there's no way to tell if the data is in ROM or in RAM simply by looking at the pointer (ROM and RAM address spaces overlap), and they require drastically different methods to works with. So Arduino implemented a little trick, casting all strings in ROM to dummy type const __FlashStringHelper* instead of const char*. All functions that work with strings have overloads for both types, so now the code can differentiate between the two.

In my lib I can just use INCBIN_COMMON() to re-define INCTXT and set the desired type, no problem, it will be concealed from the user. But those who will use your library directly may find it convenient to have an overload of INCTXT with the type argument though, as writing something like INCBIN_BYTE "0\n" in the last argument of INCBIN_COMMON() in user-code seems like a leakage of implementation details to me.

On terminating null

In my implementation the null byte wasn't included in the size count as to show character count like strlen() does. But I don't think it's necessary better in all cases, both ways have their arguments.

And a question

Could you please explain to me why appending the byte with the value of 1 at the end of the data?

INCBIN_BYTE "1\n" \

I couldn't figure it out, and it just bus me at this point =)

AlexIII commented 2 years ago

Also, I suggest adding INCTXT_EXTERN to complement INCTXT same way as INCBIN / INCBIN_EXTERN

graphitemaster commented 2 years ago

INCTXT_EXTERN exists. That __FlashStringHelper* thing is interesting. Do you have some resources on that I can look into, I'm very curious. I think I can add an overload for INCTXT like the others to accommodate it.

graphitemaster commented 2 years ago

To answer your question about that value. It's just a dummy value that needed to be something. There is an alignment of 1 for that end object after the data that is used to calculate the length, since putting an end variable after it may have padding bytes before it (would use same alignment set for the data), thus computing the wrong size. So an object is needed of some form. Originally the code just used a label but that broke when linked with LTO. The actual value of 1 doesn't particularly matter, it can be any byte value other than 0 since for some reason zero values when compiled with -fdata-sections tend to get reordered into the bss section on some toolchains and breaks the computation as well.

graphitemaster commented 2 years ago

It's pretty late here. I'll look into making the INCTXT interface the same as INCBIN so you can overload the type tomorrow most likely. I'd really appreciate a resource on the __FlashStringHelper thing though, just so I can better document it and learn something.

AlexIII commented 2 years ago

INCTXT_EXTERN exists. That __FlashStringHelper* thing is interesting. Do you have some resources on that I can look into, I'm very curious. I think I can add an overload for INCTXT like the others to accommodate it.

Oh, Arduino is not particularly good at documenting it's inner workings. It's better to just see the source

https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/WString.h

class __FlashStringHelper;
#define F(string_literal) (reinterpret_cast<const __FlashStringHelper *>(PSTR(string_literal)))

As you can see here, __FlashStringHelper is just a dummy non-existent class, and with F() macro we cast all ROM strings from const char* to const __FlashStringHelper *.

Then, most of the functions that work with string has two overloads, for example String class constructor:

    String(const char *cstr);
    String(const __FlashStringHelper *str);

These both will construct String object, but will use different implementations to read the actual string data. First one will read from RAM, and the other will read from ROM.

AlexIII commented 2 years ago

The actual value of 1 doesn't particularly matter, it can be any byte value other than 0

Thanks for the explanation. Of course my first thought was "why don't we just change 1 to 0 and have ourselves a null terminator".

INCTXT_EXTERN exists

Sorry, missed that

graphitemaster commented 2 years ago

It should also be noted that even if we changed that 1 to a 0, the size would be incorrect since it doesn't count it.

AlexIII commented 2 years ago

@graphitemaster was that sufficient amount of info on __FlashStringHelper*?

AlexIII commented 2 years ago

@graphitemaster haven't heard from you for a while. There's still a problem #54 after the last update. Please take a look when you can.