spacetelescope / spherical_geometry

A Python package for handling spherical polygons that represent arbitrary regions of the sky
http://spherical-geometry.readthedocs.io/
61 stars 31 forks source link

Fix memcmp invalid read #266

Closed jhunkeler closed 3 months ago

jhunkeler commented 4 months ago

Merging this PR will break the following:

FAILED spherical_geometry/tests/test_basic.py::test_overlap - spherical_geometry.polygon.MalformedPolygonError: disjoint: find all intersections
FAILED spherical_geometry/tests/test_basic.py::test_great_circle_arc_length - assert 0.0 == 180.0
FAILED spherical_geometry/tests/test_basic.py::test_area - spherical_geometry.polygon.MalformedPolygonError: disjoint: find all intersections
FAILED spherical_geometry/tests/test_union.py::test_inside_point - Failed: Timeout >30.0s

test_inside_point enters an infinite loop, so a timeout was introduced to prevent hanging CI jobs for extended periods of time.

Ref: #262 #263 #265

codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 80.29%. Comparing base (5de7fc7) to head (f75d4e0). Report is 1 commits behind head on master.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #266 +/- ## ========================================== - Coverage 81.28% 80.29% -1.00% ========================================== Files 5 5 Lines 1010 1010 ========================================== - Hits 821 811 -10 - Misses 189 199 +10 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

pllim commented 4 months ago

Huh... I wonder if this can finally be uncommented too.

https://github.com/spacetelescope/spherical_geometry/blob/d166e8c9564d066309a06a07943c22d448fc17a4/spherical_geometry/tests/test_union.py#L274-L276

mcara commented 4 months ago
  • memcmp(): Size cannot be larger than the source or destination object. (undefined behavior)

3 in memcmp(A, B, sizeof(qd) * 3) is there because A and B are vectors of length 3.

jhunkeler commented 4 months ago

I don't think I understand. qd is a struct containing a single 1D array of 4 doubles. * 3 will compare memory outside the bounds of the structure.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    struct qd {
        double x[4];
    };
    struct qd A, B;
    printf("qd size: %zu\n", sizeof(A));
    printf("qd->x size: %zu\n", sizeof(A.x));
    printf("qd * 3 size: %zu\n", sizeof(A) * 3);
    puts("");
    printf("double size: %zu\n", sizeof(double));
    printf("double size * 4: %zu\n", sizeof(double) * 4);

    memset(&A, 0, sizeof(A));
    memset(&B, 0, sizeof(B));

    int result_times_three = memcmp(&A, &B, sizeof(A) * 3) == 0;
    int result_size_only = memcmp(&A, &B, sizeof(A)) == 0;

    printf("result_times_three: A == B? %s\n", result_times_three ? "YES" : "NO");
    printf("result_size_only:   A == B? %s\n", result_size_only ? "YES" : "NO");

    return 0;
}
$ gcc -fsanitize=address ./test.c                                                                                          
./test.c: In function ‘main’:
./test.c:20:30: warning: ‘memcmp’ specified bound 96 exceeds source size 32 [-Wstringop-overread]
   20 |     int result_times_three = memcmp(&A, &B, sizeof(A) * 3) == 0;
      |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./test.c:9:15: note: source object allocated here
    9 |     struct qd A, B;
      |               ^
$ ./a.out
qd size: 32
qd->x size: 32
qd * 3 size: 96

double size: 8
double size * 4: 32
=================================================================
==26180==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7134ce809040 at pc 0x7134d0e7cf8f bp 0x7ffd1489b0e0 sp 0x7ffd1489a888
READ of size 96 at 0x7134ce809040 thread T0
    #0 0x7134d0e7cf8e in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:813
    #1 0x7134d0e7d46c in memcmp /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:845
    #2 0x7134d0e7d46c in memcmp /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:840
    #3 0x558bef7d2343 in main (/home/jhunk/a.out+0x1343) (BuildId: bee151f40940ccc3bb78d2c4ba8e7609cd65a883)
    #4 0x7134d0c39c87  (/usr/lib/libc.so.6+0x25c87) (BuildId: 32a656aa5562eece8c59a585f5eacd6cf5e2307b)
    #5 0x7134d0c39d4b in __libc_start_main (/usr/lib/libc.so.6+0x25d4b) (BuildId: 32a656aa5562eece8c59a585f5eacd6cf5e2307b)
    #6 0x558bef7d20f4 in _start (/home/jhunk/a.out+0x10f4) (BuildId: bee151f40940ccc3bb78d2c4ba8e7609cd65a883)

Address 0x7134ce809040 is located in stack of thread T0 at offset 64 in frame
    #0 0x558bef7d21d8 in main (/home/jhunk/a.out+0x11d8) (BuildId: bee151f40940ccc3bb78d2c4ba8e7609cd65a883)

  This frame has 2 object(s):
    [32, 64) 'A' (line 9)
    [96, 128) 'B' (line 9) <== Memory access at offset 64 partially underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:813 in MemcmpInterceptorCommon(void*, int (*)(void const*, void const*, unsigned long), void const*, void const*, unsigned long)
Shadow bytes around the buggy address:
  0x7134ce808d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce808e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce808e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce808f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce808f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x7134ce809000: f1 f1 f1 f1 00 00 00 00[f2]f2 f2 f2 00 00 00 00
  0x7134ce809080: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce809100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce809180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce809200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7134ce809280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==26180==ABORTING

and the output without the sanitizer enabled...

qd size: 32
qd->x size: 32
qd * 3 size: 96

double size: 8
double size * 4: 32
result_times_three: A == B? NO
result_size_only:   A == B? YES
mcara commented 4 months ago

@jhunkeler Let's just look at one part of the code to see how equal_qd is used. For example, equals_qd is called by length_qd in https://github.com/spacetelescope/spherical_geometry/blob/7e78fb4fa327acf8ba094b775f53655b92bdb10b/src/math_util.c#L240

and length_qd is called by DOUBLE_length in https://github.com/spacetelescope/spherical_geometry/blob/7e78fb4fa327acf8ba094b775f53655b92bdb10b/src/math_util.c#L637

At the top of DOUBLE_length, A and B are declared as vectors of qd-type: https://github.com/spacetelescope/spherical_geometry/blob/7e78fb4fa327acf8ba094b775f53655b92bdb10b/src/math_util.c#L616-L617

I didn't do an extensive analysis, but it looks like it is used in the same way in DOUBLE_intersects_point as well.

jhunkeler commented 4 months ago

You're right. I'm sorry! The magic 3 threw me off completely though.

Do you think adding a #define might help? The error that initially lead me to the equals_qd function left me baffled as to why it was comparing two qd structures like that.

#define MAXVEC 3

int some_function(...) {
    qd A[MAXVEC];
    qd B[MAXVEC];
    // ...
    if (equals_qd(A, B)) {
        // ...
    }
}

int equals_qd(const struct qd *A, const struct qd *B) {
    return memcmp(A, B, sizeof(qd) * MAXVEC) == 0;
}
mcara commented 4 months ago

We live in a 3D space 😄 . I think it's OK to leave this as is. I think let's minimize code modifications at this stage.

mcara commented 4 months ago

I would say, maybe SPACE_DIMENSIONS (or a shorter version) would be more informative than MAXVEC but the whole package/math is designed to work in 3D space and so "suggesting" that by changing a value in #define would make this package work with hyperspheres is not correct (the suggestion implied by define) - but it is not wrong technically. Also, MAX implies that somehow the number of coordinates could be less than 3. Let's just leave it as is.

pllim commented 4 months ago

What about this?

#define NDIM_SUPPORTED 3
pllim commented 3 months ago

So, are we going to close this without merge?