ecmwf-ifs / field_api

Apache License 2.0

Update buddy_alloc.h #48

Closed: pmarguinaud closed this 4 months ago

pmarguinaud commented 4 months ago

There is a bug in buddy_alloc.h; it cannot handle very large memory sections.

Increasing DEV_ALLOC_SIZE to 120 GB (120000000000 bytes) exposes the problem:

$ DEV_ALLOC_SIZE=120000000000 make test
...
The following tests FAILED:
          1 - main.x (Subprocess aborted)
          4 - reshuffle_lastdim.x (Subprocess aborted)
          6 - test_statistics.x (Subprocess aborted)
          7 - test_sizeof.x (Subprocess aborted)
          8 - test_bc.x (Subprocess aborted)
          9 - reshuffle.x (Subprocess aborted)
         10 - test_wrappernosynconfinal.x (Subprocess aborted)
         11 - test_field1d.x (Subprocess aborted)
         13 - async_host.x (Subprocess aborted)
         14 - cpu_to_gpu.x (Subprocess aborted)
         15 - cpu_to_gpu_delayed_init_value.x (Subprocess aborted)
         16 - cpu_to_gpu_init_value.x (Subprocess aborted)
         17 - delete_device_wrapper.x (Subprocess aborted)
         20 - final_wrapper_gpu.x (Subprocess aborted)
         23 - get_stats.x (Subprocess aborted)
         25 - get_view_get_device_data.x (Subprocess aborted)
         32 - init_owner_delayed_gpu.x (Subprocess aborted)
         37 - init_owner_init_debug_value_gpu.x (Subprocess aborted)
         38 - init_owner_init_delayed_debug_value_gpu.x (Subprocess aborted)
         39 - init_owner_init_delayed_value_gpu.x (Subprocess aborted)
         46 - init_wrapper_non_contiguous_multi.x (Subprocess aborted)
         47 - no_transfer_get_device.x (Subprocess aborted)
         52 - sync_device.x (Subprocess aborted)
         53 - sync_host.x (Subprocess aborted)
         54 - test_crc64.x (Subprocess aborted)
         55 - wrapper_modify_gpu.x (Subprocess aborted)
         56 - test_gang.x (Subprocess aborted)

Using the latest version of buddy_alloc.h solves the issue.

This problem has been discussed with the developer of buddy_alloc in this thread:

https://github.com/spaskalev/buddy_alloc/issues/105