Closed fanoush closed 1 year ago
Thanks! That's interesting...
I think we need to make sure that actual error messages are still there as Espruino does report them, but if TF_LITE_STRIP_ERROR_STRINGS
strips just the log messages then that'd be great.
At one point, I did actually try to make sure that the error messages didn't include the full path (at least for the rest of Espruino)... IIRC the FILE value comes straight from the command-line, which in turn comes from $(ROOT)
and ROOT = $(CWD)
in the Makefile
If ROOT
could be set to a relative path then I think all the log strings would be relative too?
If
ROOT
could be set to a relative path then I think all the log strings would be relative too?
Yes, that's what I found too, __FILE__
comes from the way gcc is called so making it relative could help. Someone even suggested to pipe it via standard input to get rid of it completely
https://stackoverflow.com/questions/15688434/remove-embedded-source-filenames-from-shared-libraries/15688992#15688992 which is a bit too radical :-)
For C++ (which is tensorflow case) someone also found compile time static expression code to remove path and keep just filename - check https://stackoverflow.com/a/47020053 or see the last comment on the bottom. Or one could possibly call gcc and define extra define like gcc $file -DFILENAME=$(basename $file)
and use that instead of __FILE__
or even try to redefine __FILE__
itself in the source to that value if it exists.
I think we need to make sure that actual error messages are still there as Espruino does report them, but if TF_LITE_STRIP_ERROR_STRINGS strips just the log messages then that'd be great.
Yes, as I understand it it still returns some error code value but removes the message.
However the message can be sometimes helpful? I am not tensorflow expert but maybe when you make some random model on PC that uses some unsupported features it could be helpful to see the error message string what exactly is wrong in the model?
However the filename is not useful in release build IMO so I'd simply delete the %s:%d .. __FILE__, __LINE__,
part from the messages. That would generate smaller logging code when compared to making __FILE__
shorter.
EDIT: Also for GCC there is the __BASE_FILE__ https://stackoverflow.com/a/1706368 which does not contain the path(?)
Tested __BASE_FILE__
and it still has whole path, also tested this compile time stuff and the whole string is still kept inside binary even if not printed so both are not a solution.
Building code below with g++ works and prints just main.c:10
even when building like g++ -Os ``pwd``/main.c
however it still keeps whole string and __FILENAME__
just adds some constant offset to original string so it is printed from the middle.
#include <stdio.h>
constexpr const char* const BASENAME(const char* const str, const char* const lastslash = nullptr) {
return *str ? BASENAME(str + 1, ((*str == '/' || *str == '\\') ? str + 1 : (nullptr==lastslash?str:lastslash))) : (nullptr==lastslash?str:lastslash);
}
constexpr const char*__FILENAME__=BASENAME(__FILE__);
int main(int argc,char *argv[]){
printf("%s:%d %s",__FILENAME__,__LINE__,"Hello world!\n");
}
The compile time stuff probably still works as designed because even when build without any -O
the BASENAME method is not there and there is still just this precomputed constant offset.
there is actually another gcc solution targeted for reproducible builds described here https://reproducible-builds.org/docs/build-path/
so something like
gcc `pwd`/main.c -ffile-prefix-map=`pwd`=whatever
will change automatically one path prefix to another so __FILE__
becomes whatever/main.c
So in this tensorflow case -ffile-prefix-map=$(ROOT)=
could strip the root prefix
-ffile-prefix-map
actually sounds perfect - so -ffile-prefix-map=$(ROOT)=Espruino
added to build flags would probably sort things out really nicely?
Well, yes, depends how much you want to save (e.g. for Bangle 1).
BTW empty string works fine too so "=Espruino" will cut /home/runner/work/Espruino
from all those and something like Espruino/libs/tensorflow/tensorflow/lite/core/api/flatbuffer_conversions.cc
will still stay. By -ffile-prefix-map=$(ROOT)=
you will strip the second Espruino
too (and still keep the libs/tensorflow/tensorflow/lite/core/api
there).
So yes the reproducibility part is solved by this minimal solution. You still end with multiple longs strings inside Bangle 1 binary and some extra logging code bloat passing source file names and line numbers (which you maybe don't do for your own Espruino code for release builds?). Dont know how much room the Bangle1 build has nowadays.
Yes, good point... My only concern about changes to the tensorflow logging are any changes to core Tensorflow are going to make updating the firmware tricker... But we do have a folder of patches so I guess it can just go in there.
So would you be willing to do a PR for this? -ffile-prefix-map
and as you say removing file and line from https://github.com/espruino/Espruino/blob/master/libs/tensorflow/tensorflow/lite/c/common.h ?
I realise I didn't reply above, but yes - tensorflow error messages are extremely useful. It's pretty often you might have some tensorflow model that doesn't load for some reason and you need to know why (eg what operation it was using that isn't there)
WIll do a PR, so the changes for removing file name and line should be only as patch file or both applied in source and as a patch?
source and patch I think - thanks!
resolved by https://github.com/espruino/Espruino/pull/2270
When you check binaries for banglejs or banglejs2 for strings and grep for "home" it can be seen full path of some tensorflow sources is encoded many times there. Removing just the file paths can save 1KB and also makes the binary build more reproducible from same source.
It is caused by using FILE in calls to TF_LITE_KERNEL_LOG here https://github.com/espruino/Espruino/blob/master/libs/tensorflow/tensorflow/lite/kernels/op_macros.h#L51 and few times here https://github.com/espruino/Espruino/blob/master/libs/tensorflow/tensorflow/lite/c/common.h#L191
This could be ifdef-ed based on NDEBUG or RELEASE value to have the
%s:%d
..__FILE__, __LINE__,
part removed from TF_LITE_KERNEL_LOG calls while still having some kind of error message for RELEASE build.Or the logging can be removed completely by adding
-DTF_LITE_STRIP_ERROR_STRINGS
inmake/misc/tensorflow.make
as per https://github.com/espruino/Espruino/blob/master/libs/tensorflow/tensorflow/lite/c/common.h#L158this saves 10KB
I hit this while trying to make elf file that matches existing release build. Unfortunately even when putting same git commit of Espruino sources to same
/home/runner/work/Espruino/Espruino
path and provisioning with same compiler as seeing in travis build log (e.g. here https://github.com/espruino/Espruino/actions/runs/2739143892/jobs/4292803960) I got same size binary but some bytes are different here and there. Not sure why, when I do make clean and do same make again I get binary which is exactly the same.