inaos / iron-array

Multithreading issues preventing views to go in parallel #586

Open FrancescAlted opened 2 years ago

FrancescAlted commented 2 years ago

A recent optimization that lets type views run in parallel (https://github.com/inaos/iron-array/commit/6d2964a0ed9c690428718367b2590e7abeaadf9c) had to be disabled (81a8400) because, even though the tests pass, helgrind reports pretty scary data races like:

==213406== Possible data race during read of size 1 at 0x914621F by thread #265
==213406== Locks held: none
==213406==    at 0x74079D: blosclz_decompress (contribs/caterva/contribs/c-blosc2/blosc/blosclz.c:706)
==213406==    by 0x73BC7C: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1717)
==213406==    by 0x73AC2B: _blosc_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2897)
==213406==    by 0x73CBE8: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2975)
==213406==    by 0x73CB07: blosc2_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2934)
==213406==    by 0x827AF5: get_coffset (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1880)
==213406==    by 0x8270DC: frame_get_lazychunk (contribs/caterva/contribs/c-blosc2/blosc/frame.c:2094)
==213406==    by 0x72E30A: type_view_postfilter (src/iarray_views.c:195)
==213406==    by 0x73C093: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==213406==    by 0x73AC2B: _blosc_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2897)
==213406==    by 0x73CBE8: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2975)
==213406==    by 0x715D50: prefilter_func (src/iarray_expression.c:436)
==213406==  Address 0x914621f is 175 bytes inside a block of size 181 alloc'd
==213406==    at 0x667F893: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==213406==    by 0x825466: get_coffsets (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1106)
==213406==    by 0x827ACF: get_coffset (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1873)
==213406==    by 0x8270DC: frame_get_lazychunk (contribs/caterva/contribs/c-blosc2/blosc/frame.c:2094)
==213406==    by 0x72E30A: type_view_postfilter (src/iarray_views.c:195)
==213406==    by 0x73C093: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==213406==    by 0x73AC2B: _blosc_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2897)
==213406==    by 0x73CBE8: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2975)
==213406==    by 0x715D50: prefilter_func (src/iarray_expression.c:436)
==213406==    by 0x736DE6: pipeline_forward (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:856)
==213406==    by 0x73EC86: blosc_c (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1008)
==213406==    by 0x73F994: t_blosc_do_job (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:0)
==213406==  Block was alloc'd by thread #263
==213406==
==213406== ----------------------------------------------------------------

(and tons of others)

These should be addressed before we can get the full performance out of views. For now, we will use them only in single-threaded environments.

FrancescAlted commented 2 years ago

Besides not being able to use views in expressions, activating multithreading (e.g. by commenting out this line: https://github.com/inaos/iron-array/blob/develop/src/iarray_views.c#L573) can lead to race conditions in other situations, like simple slicing, as helgrind shows:

$ valgrind --tool=helgrind ./tests slice_type:3_f_ll_v
<skip>
==1261230== ----------------------------------------------------------------
==1261230==
==1261230== Possible data race during read of size 8 at 0x8745C08 by thread #53
==1261230== Locks held: none
==1261230==    at 0x805834: get_coffsets (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1035)
==1261230==    by 0x807FCF: get_coffset (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1873)
==1261230==    by 0x8075DC: frame_get_lazychunk (contribs/caterva/contribs/c-blosc2/blosc/frame.c:2094)
==1261230==    by 0x704466: slice_view_postfilter (src/iarray_views.c:237)
==1261230==    by 0x711E43: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==1261230==    by 0x7109DB: _blosc_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2899)
==1261230==    by 0x712978: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2977)
==1261230==    by 0x7041DE: type_view_postfilter (src/iarray_views.c:211)
==1261230==    by 0x711E43: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==1261230==    by 0x7157A5: t_blosc_do_job (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:3107)
==1261230==    by 0x712DF8: t_blosc (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:3192)
==1261230==    by 0x5E2DB1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==1261230==
==1261230== This conflicts with a previous write of size 8 by thread #54
==1261230== Locks held: none
==1261230==    at 0x805AD1: get_coffsets (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1123)
==1261230==    by 0x807FCF: get_coffset (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1873)
==1261230==    by 0x8075DC: frame_get_lazychunk (contribs/caterva/contribs/c-blosc2/blosc/frame.c:2094)
==1261230==    by 0x704466: slice_view_postfilter (src/iarray_views.c:237)
==1261230==    by 0x711E43: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==1261230==    by 0x7109DB: _blosc_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2899)
==1261230==    by 0x712978: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2977)
==1261230==    by 0x7041DE: type_view_postfilter (src/iarray_views.c:211)
==1261230==  Address 0x8745c08 is 24 bytes inside a block of size 64 alloc'd
==1261230==    at 0x5E29E39: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==1261230==    by 0x803656: frame_new (contribs/caterva/contribs/c-blosc2/blosc/frame.c:44)
==1261230==    by 0x74781E: blosc2_schunk_new (contribs/caterva/contribs/c-blosc2/blosc/schunk.c:175)
==1261230==    by 0x7D474B: caterva_blosc_array_new (contribs/caterva/caterva/caterva.c:195)
==1261230==    by 0x7D4C90: caterva_empty (contribs/caterva/caterva/caterva.c:267)
==1261230==    by 0x7D5D94: caterva_from_buffer (contribs/caterva/caterva/caterva.c:432)
==1261230==    by 0x6D6894: iarray_from_buffer (src/iarray_constructor.c:255)
==1261230==    by 0x6C17C3: execute_iarray_slice_type (tests/test_slice_type.c:62)
==1261230==    by 0x6BFFE7: __ina_test_slice_type_3_f_ll_v_run (tests/test_slice_type.c:225)
==1261230==    by 0x866537: ina_test_run (test.c:689)
==1261230==    by 0x84B30B2: (below main) (libc-start.c:308)
==1261230==  Block was alloc'd by thread #1
<skip>
==1261230== Possible data race during read of size 8 at 0x876E380 by thread #53
==1261230== Locks held: none
==1261230==    at 0x70CA26: read_chunk_header (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:705)
==1261230==    by 0x712924: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2957)
==1261230==    by 0x712897: blosc2_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2936)
==1261230==    by 0x807FF5: get_coffset (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1880)
==1261230==    by 0x8075DC: frame_get_lazychunk (contribs/caterva/contribs/c-blosc2/blosc/frame.c:2094)
==1261230==    by 0x704466: slice_view_postfilter (src/iarray_views.c:237)
==1261230==    by 0x711E43: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==1261230==    by 0x7109DB: _blosc_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2899)
==1261230==    by 0x712978: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2977)
==1261230==    by 0x7041DE: type_view_postfilter (src/iarray_views.c:211)
==1261230==    by 0x711E43: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==1261230==    by 0x7157A5: t_blosc_do_job (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:3107)
==1261230==  Address 0x876e380 is 16 bytes inside a block of size 128 alloc'd
==1261230==    at 0x5E27893: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==1261230==    by 0x805966: get_coffsets (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1106)
==1261230==    by 0x807FCF: get_coffset (contribs/caterva/contribs/c-blosc2/blosc/frame.c:1873)
==1261230==    by 0x8075DC: frame_get_lazychunk (contribs/caterva/contribs/c-blosc2/blosc/frame.c:2094)
==1261230==    by 0x704466: slice_view_postfilter (src/iarray_views.c:237)
==1261230==    by 0x711E43: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==1261230==    by 0x7109DB: _blosc_getitem (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2899)
==1261230==    by 0x712978: blosc2_getitem_ctx (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:2977)
==1261230==    by 0x7041DE: type_view_postfilter (src/iarray_views.c:211)
==1261230==    by 0x711E43: blosc_d (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:1610)
==1261230==    by 0x7157A5: t_blosc_do_job (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:3107)
==1261230==    by 0x712DF8: t_blosc (contribs/caterva/contribs/c-blosc2/blosc/blosc2.c:3192)
==1261230==  Block was alloc'd by thread #54
<skip>

Ideally, we should provide a way to call postfilters in parallel without these issues. This may be a major task, but fixing it would be of great benefit to us.
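To make the pattern in the traces concrete: several worker threads reach `get_coffsets` through the postfilter, and one of them allocates and fills a cached offsets buffer while the others read it with no lock held. Below is a minimal sketch of one possible way to guard such a lazily built cache with a mutex. It is not the c-blosc2 or iron-array code; `demo_frame`, `demo_get_coffset`, `demo_parallel_lookup` and the fake offsets (`k * 64`) are all hypothetical names and values made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a frame that lazily caches its chunk offsets. */
typedef struct {
    pthread_mutex_t lock;  /* guards coffsets and load_count */
    int64_t *coffsets;     /* NULL until the first lookup builds the cache */
    int nchunks;
    int load_count;        /* how many times the cache was built */
} demo_frame;

static int demo_frame_init(demo_frame *f, int nchunks) {
    f->coffsets = NULL;
    f->nchunks = nchunks;
    f->load_count = 0;
    return pthread_mutex_init(&f->lock, NULL);
}

static void demo_frame_destroy(demo_frame *f) {
    free(f->coffsets);
    f->coffsets = NULL;
    pthread_mutex_destroy(&f->lock);
}

/* Return the offset of chunk i, or -1 on error. The cache is built on the
 * first call, and both the build and the read happen under the lock. */
static int64_t demo_get_coffset(demo_frame *f, int i) {
    int64_t off = -1;
    pthread_mutex_lock(&f->lock);
    if (f->coffsets == NULL) {
        int64_t *buf = malloc((size_t)f->nchunks * sizeof *buf);
        if (buf != NULL) {
            for (int k = 0; k < f->nchunks; k++) {
                buf[k] = (int64_t)k * 64;  /* made-up offsets */
            }
            f->coffsets = buf;
            f->load_count++;
        }
    }
    if (f->coffsets != NULL && i >= 0 && i < f->nchunks) {
        off = f->coffsets[i];
    }
    pthread_mutex_unlock(&f->lock);
    return off;
}

typedef struct {
    demo_frame *frame;
    int chunk;
    int64_t result;
} demo_job;

static void *demo_worker(void *arg) {
    demo_job *job = arg;
    job->result = demo_get_coffset(job->frame, job->chunk);
    return NULL;
}

/* Look up one chunk per thread and return how many times the cache was built. */
int demo_parallel_lookup(void) {
    enum { NTHREADS = 8 };
    demo_frame frame;
    pthread_t tids[NTHREADS];
    demo_job jobs[NTHREADS];

    int rc = demo_frame_init(&frame, NTHREADS);
    assert(rc == 0);
    for (int t = 0; t < NTHREADS; t++) {
        jobs[t] = (demo_job){ &frame, t, -1 };
        rc = pthread_create(&tids[t], NULL, demo_worker, &jobs[t]);
        assert(rc == 0);
    }
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tids[t], NULL);
        assert(jobs[t].result == (int64_t)t * 64);
    }
    int loads = frame.load_count;
    demo_frame_destroy(&frame);
    return loads;
}
```

Built with `-pthread`, `demo_parallel_lookup()` should report that the cache was built exactly once, and helgrind should have nothing to flag on `coffsets` because every access holds the lock. The trade-off is that all lookups are serialized on that mutex; filling the cache before the worker threads start would avoid the lock on the hot path, at the cost of doing the work eagerly.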

FrancescAlted commented 2 years ago

Even with PR #590, I can still reproduce the freeze on my M1 MacBook Air (but only on that box!):

$ python -m pytest -v
<snip>
iarray/tests/test_reduce.py::test_red_type_view[test_reduce.iarr-False-sum-shape0-chunks0-blocks0-0-float64-uint64] PASSED  [ 73%]
iarray/tests/test_reduce.py::test_red_type_view[test_reduce.iarr-False-sum-shape1-chunks1-blocks1-axis1-int64-float64] ^C⏎
/Users/faltet/miniconda3/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Although it takes a while to freeze (about 5 min), this is reproducible and it always freezes in the same place.

martaiborra commented 2 years ago

Since f55390e35bc977a59bdd4e48c81aa54763fc2ab0, helgrind no longer complains in the main view tests.