Open m-carrasco opened 11 months ago
Thanks for reporting. I've made a fix attempt : https://github.com/GreycLab/CImg/commit/55da1f4dc8256aa8846036a45d5dbb2bab1466c2 Could you tell me if that looks good?
Thanks for the quick response!
I think a fix should take this into account:
offset
, header_size
and file_size
variables.
Those variables can come from the input file's content.offset
and header_size
can lead to the reported UB.
To fix this, we could check if they are in range with the file's size.I would propose a fix which performs these checks in this particular order:
file_size
so it matches the actual file size.offset
and header_size
are consistent with the now sanitized file_size
.From what I grasp, the fix is not sanitizing file_size
. Also, it attempts to sanitize header_size
using an unsanitized file_size
. Last, it is not checking offset
.
Here is my fix (on top of yours):
diff --git a/CImg.h b/CImg.h
index b5a934d..21db640 100644
--- a/CImg.h
+++ b/CImg.h
@@ -54419,6 +54419,11 @@ namespace cimg_library {
const ulongT fsiz = file?(ulongT)cimg_max_buf_size:(ulongT)cimg::fsize(filename);
std::FILE *const nfile = file?file:cimg::fopen(filename,"rb");
+
+ cimg::fseek(nfile,0,SEEK_END);
+ cimg_long filesystem_size=cimg::ftell(nfile);
+ std::rewind(nfile);
+
CImg<ucharT> header(54);
cimg::fread(header._data,54,nfile);
if (*header!='B' || header[1]!='M') {
@@ -54440,17 +54445,25 @@ namespace cimg_library {
nb_colors = header[0x2E] + (header[0x2F]<<8) + (header[0x30]<<16) + (header[0x31]<<24),
bpp = header[0x1C] + (header[0x1D]<<8);
+ if (file_size != filesystem_size){
+ throw CImgIOException(_cimg_instance
+ "load_bmp(): Invalid file_size %i specified in filename '%s', expected %i.",
+ cimg_instance,
+ file_size,filename?filename:"(FILE*)", filesystem_size);
+ }
+
if (header_size<0 || header_size>file_size)
throw CImgIOException(_cimg_instance
"load_bmp(): Invalid header size %d specified in filename '%s'.",
cimg_instance,
header_size,filename?filename:"(FILE*)");
- if (!file_size || file_size==offset) {
- cimg::fseek(nfile,0,SEEK_END);
- file_size = (int)cimg::ftell(nfile);
- cimg::fseek(nfile,54,SEEK_SET);
- }
+ if (offset<0 || offset>file_size)
+ throw CImgIOException(_cimg_instance
+ "load_bmp(): Invalid offset %i specified in filename '%s'.",
+ cimg_instance,
+ offset,filename?filename:"(FILE*)");
+
if (header_size>40) cimg::fseek(nfile,header_size - 40,SEEK_CUR);
const int
From what I tested, this fixes the UB and should also fix the first reported discrepancy with identify
.
The second discrepancy with ImageMagick seems to be about the image pixels' data. ImageMagick ensures that the total number of pixels read is consistent with the width and height.
I could also try to provide a patch if you like.
Best, Manuel.
Thanks. Here is my proposal then :
https://github.com/GreycLab/CImg/commit/009f0432fd32655f9399f2ec2fd5380c44e49a8e
Variable fsize
actually already estimates the file size, with the std::tell()
procedure.
Thanks!
It looks good to me for the case in which a file path is provided instead of a FILE*
.
The issue is with fsiz
is that it is computed as const ulongT fsiz = file?(ulongT)cimg_max_buf_size:(ulongT)cimg::fsize(filename);
.
If a file pointer is provided (file
is not NULL
), you could still have issues similar to those reported here.
fsiz
is set to cimg_max_buf_size
, which could be higher than the actual size, thus allowing incorrect values for offset
and header_size
again. cimg_max_buf_size
seems high enough to allow the UB to happen again.
Best, Manuel.
Indeed, so https://github.com/GreycLab/CImg/commit/703ffcaf470d426e333f8349760957119590d38a should help.
Yes, it looks good to me in regard to the UB :+1:
A minor comment would be that the new fsize is not restoring the original file position. In our case, this is not an issue but it may be if used elsewhere.
I would save the position at the beginning long originalPos = std::ftell(file);
and restore before returning std::fseek(file, originalPos, SEEK_SET);
Best, Manuel.
Indeed, thanks ! I've made the changes.
Regards,
David.
Hi CImg :wave:
We found an undefined behaviour bug in CImg triggered when parsing a BMP file. The BMP is correct according to CImg; no exceptions or crashes when parsed without sanitisers. However, ImageMagick's
identify
flags the BMP as incorrect.After debugging, we understand this correctness discrepancy is triggering the undefined behaviour.
Bug
CImg.h:54462:39: runtime error: signed integer overflow: 1048616 - -2147483608 cannot be represented in type 'int'
The UB happens in function
CImg<T>& _load_bmp(std::FILE *const file, const char *const filename)
and the line is:const int xoffset = offset - 14 - header_size - 4*nb_colors;
offset
has a value of 1048630 andheader_size
of -2147483608 while file size is 238.Possible fix
offset
andheader_size
could be compared tofile_size
to check that they are within range.If this sounds reasonable, I can provide a PR for it.
How to reproduce:
testcase.zip (Github does not allow uploading BMP files)
This has been tested with clang++-14.
Discrepancy with
identify
identify -verbose testcase.bmp
reportsWe have discovered additional BMP files triggering discrepancies with
identify
, like above, meaning that CImg may wrongly consider them correct. Would you find it useful if we report them?Thanks for sharing this project!
Best, Manuel.