starpu-runtime / starpu

This is a mirror of https://gitlab.inria.fr/starpu/starpu where our development happens, but contributions are welcome here too!
https://starpu.gitlabpages.inria.fr/
GNU Lesser General Public License v2.1

Unexpected behavior when executing `fmatrix_pick_variable` #18

Closed (weslleyspereira closed this issue 1 year ago)

weslleyspereira commented 1 year ago

I obtain the following output when running fmatrix_pick_variable:

$ ./build_install/lib/starpu/examples/fmatrix_pick_variable 
Invalid MIT-MAGIC-COOKIE-1 key[starpu][weslleyp-XPS-15-9510][initialize_lws_policy] Warning: you are running the default lws scheduler, which is not a very smart scheduler, while the system has GPUs or several memory nodes. Make sure to read the StarPU documentation about adding performance models in order to be able to use the dmda or dmdas scheduler instead.
IN Matrix: 
matrix=0x7efe5c600000 nx=10 ny=21 ld=10
   0    1    2    3    4    5    6    7    8    9 
  10   11   12   13   14   15   16   17   18   19 
  20   21   22   23   24   25   26   27   28   29 
  30   31   32   33   34   35   36   37   38   39 
  40   41   42   43   44   45   46   47   48   49 
  50   51   52   53   54   55   56   57   58   59 
  60   61   62   63   64   65   66   67   68   69 
  70   71   72   73   74   75   76   77   78   79 
  80   81   82   83   84   85   86   87   88   89 
  90   91   92   93   94   95   96   97   98   99 
 100  101  102  103  104  105  106  107  108  109 
 110  111  112  113  114  115  116  117  118  119 
 120  121  122  123  124  125  126  127  128  129 
 130  131  132  133  134  135  136  137  138  139 
 140  141  142  143  144  145  146  147  148  149 
 150  151  152  153  154  155  156  157  158  159 
 160  161  162  163  164  165  166  167  168  169 
 170  171  172  173  174  175  176  177  178  179 
 180  181  182  183  184  185  186  187  188  189 
 190  191  192  193  194  195  196  197  198  199 
 200  201  202  203  204  205  206  207  208  209 

sub Matrix: 
matrix=0x7efe5c600000 nx=5 ny=7 ld=10
   0    1    2    3    4 
  10   11   12   13   14 
  20   21   22   23   24 
  30   31   32   33   34 
  40   41   42   43   44 
  50   51   52   53   54 
  60   61   62   63   64 

Sub Variable:
    5 
Dealing with sub-variable
OUT Variable:
   60 
sub Matrix: 
matrix=0x7efe5c600000 nx=5 ny=7 ld=10
   0    1    2    3    4 
  10   11   12   13   14 
  20   21   22   23   24 
  30   31   32   33   34 
  40   41   42   43   44 
  50   51   52   53   54 
  60   61   62   63   64 

Sub Variable:
   17 
Dealing with sub-variable
OUT Variable:
   17 
sub Matrix: 
matrix=0x7efe5c600230 nx=5 ny=7 ld=10
 140  141  142  143  144 
 150  151  152  153  154 
 160  161  162  163  164 
 170  171  172  173  174 
 180  181  182  183  184 
 190  191  192  193  194 
 200  201  202  203  204 

Sub Variable:
  159 
Dealing with sub-variable
OUT Variable:
 1908 
sub Matrix: 
matrix=0x7efe5c600230 nx=5 ny=7 ld=10
 140  141  142  143  144 
 150  151  152  153  154 
 160  161  162  163  164 
 170  171  172  173  174 
 180  181  182  183  184 
 190  191  192  193  194 
 200  201  202  203  204 

Sub Variable:
  145 
Dealing with sub-variable
OUT Variable:
 1740 
sub Matrix: 
matrix=0x7efe5c60012c nx=5 ny=7 ld=10
  75   76   77   78   79 
  85   86   87   88   89 
  95   96   97   98   99 
 105  106  107  108  109 
 115  116  117  118  119 
 125  126  127  128  129 
 135  136  137  138  139 

Sub Variable:
   94 
Dealing with sub-variable
OUT Variable:
 1128 
sub Matrix: 
matrix=0x7efe5c600230 nx=5 ny=7 ld=10
 140  141  142  143  144 
 150  151  152  153  154 
 160  161  162  163  164 
 170  171  172  173  174 
 180  181  182  183  184 
 190  191  192  193  194 
 200  201  202  203  204 

Sub Variable:
  142 
Dealing with sub-variable
OUT Variable:
 1704 
sub Matrix: 
matrix=0x7efe5c600014 nx=5 ny=7 ld=10
  60    6    7    8    9 
  15   16   17   18   19 
  25   26   27   28   29 
  35   36   37   38   39 
  45   46   47   48   49 
  55   56   57   58   59 
  65   66   67   68   69 

Sub Variable:
   34 
Dealing with sub-variable
OUT Variable:
  408 
sub Matrix: 
matrix=0x7efe5c600000 nx=5 ny=7 ld=10
   0    1    2    3    4 
  10   11   12   13   14 
  20   21   22   23   24 
  30   31   32   33  408 
  40   41   42   43   44 
  50   51   52   53   54 
  60   61   62   63   64 

Sub Variable:
   21 
Dealing with sub-variable
OUT Variable:
  252 
sub Matrix: 
matrix=0x7efe5c600014 nx=5 ny=7 ld=10
  60    6    7    8    9 
  15   16   17   18   19 
  25   26   27   28   29 
  35   36   37   38   39 
  45   46   47   48   49 
  55   56   57   58   59 
  65   66   67   68   69 

Sub Variable:
    8 
Dealing with sub-variable
OUT Variable:
   96 
sub Matrix: 
matrix=0x7efe5c600014 nx=5 ny=7 ld=10
  60    6    7   96    9 
  15   16   17   18   19 
  25   26   27   28   29 
  35   36   37   38   39 
  45   46   47   48   49 
  55   56   57   58   59 
  65   66   67   68   69 

Sub Variable:
   22 
Dealing with sub-variable
OUT Variable:
  264 
sub Matrix: 
matrix=0x7efe5c60012c nx=5 ny=7 ld=10
  75   76   77   78   79 
  85   86   87   88   89 
  95   96   97   98   99 
 105  106  107  108  109 
 115  116  117  118  119 
 125  126  127  128  129 
 135  136  137  138  139 

Sub Variable:
   89 
Dealing with sub-variable
OUT Variable:
   89 
OUT Matrix: 
matrix=0x7efe5c600000 nx=10 ny=21 ld=10
   0    1    2    3    4   60    6    7   96    9 
  10   11   12   13   14   15   16   17   18   19 
  20  252  264   23   24   25   26   27   28   29 
  30   31   32   33  408   35   36   37   38   39 
  40   41   42   43   44   45   46   47   48   49 
  50   51   52   53   54   55   56   57   58   59 
  60   61   62   63   64   65   66   67   68   69 
  70   71   72   73   74   75   76   77   78   79 
  80   81   82   83   84   85   86   87   88   89 
  90   91   92   93 1128   95   96   97   98   99 
 100  101  102  103  104  105  106  107  108  109 
 110  111  112  113  114  115  116  117  118  119 
 120  121  122  123  124  125  126  127  128  129 
 130  131  132  133  134  135  136  137  138  139 
 140  141 1704  143  144 1740  146  147  148  149 
 150  151  152  153  154  155  156  157  158 1908 
 160  161  162  163  164  165  166  167  168  169 
 170  171  172  173  174  175  176  177  178  179 
 180  181  182  183  184  185  186  187  188  189 
 190  191  192  193  194  195  196  197  198  199 
 200  201  202  203  204  205  206  207  208  209

As one can see, the sub-variables are not taken from their respective submatrices. For instance,

sub Matrix: 
matrix=0x7efe5c600000 nx=5 ny=7 ld=10
   0    1    2    3    4 
  10   11   12   13   14 
  20   21   22   23   24 
  30   31   32   33   34 
  40   41   42   43   44 
  50   51   52   53   54 
  60   61   62   63   64 

Sub Variable:
    5 
Dealing with sub-variable
OUT Variable:
   60 

The value that actually gets updated is the 5, which belongs to a different submatrix. See:

sub Matrix: 
matrix=0x7efe5c600014 nx=5 ny=7 ld=10
  60    6    7    8    9 
  15   16   17   18   19 
  25   26   27   28   29 
  35   36   37   38   39 
  45   46   47   48   49 
  55   56   57   58   59 
  65   66   67   68   69 

Therefore, I suspect this is unintended behavior. Could someone confirm that? I only started looking at the StarPU code this month, so I don't know how to fix this issue.

My configuration

starpu$ readelf -d  build_install/lib/libstarpu-1.4.so

Dynamic section at offset 0x193b28 contains 37 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libOpenCL.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.12]
 0x0000000000000001 (NEEDED)             Shared library: [libcublas.so.12]
 0x0000000000000001 (NEEDED)             Shared library: [libcusparse.so.12]
 0x0000000000000001 (NEEDED)             Shared library: [libcusolver.so.11]
 0x0000000000000001 (NEEDED)             Shared library: [libnvidia-ml.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libhwloc.so.15]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x000000000000000e (SONAME)             Library soname: [libstarpu-1.4.so.1]
 0x000000000000000c (INIT)               0x23000
 0x000000000000000d (FINI)               0x132130
 0x0000000000000019 (INIT_ARRAY)         0x194190
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x194198
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x2f0
 0x0000000000000005 (STRTAB)             0xc7d8
 0x0000000000000006 (SYMTAB)             0x2c68
 0x000000000000000a (STRSZ)              45299 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000003 (PLTGOT)             0x195000
 0x0000000000000002 (PLTRELSZ)           20280 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x1d920
 0x0000000000000007 (RELA)               0x187c0
 0x0000000000000008 (RELASZ)             20832 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffe (VERNEED)            0x185c0
 0x000000006fffffff (VERNEEDNUM)         11
 0x000000006ffffff0 (VERSYM)             0x178cc
 0x000000006ffffff9 (RELACOUNT)          641
 0x0000000000000000 (NULL)               0x0

Please let me know if you need more information. Thanks!

sthibaul commented 1 year ago

The filter was indeed using nx instead of ld. This should now be fixed in the GitLab master and 1.4 branches: https://gitlab.inria.fr/starpu/starpu.git

Thanks for the report!