Open flexoron opened 1 month ago
Fixed the NAN printing issue. I'll look into the other one... it requires GSL v2.8 right?
Yes, you don't have to install it, just make:
https://www.gnu.org/software/gsl/
$ tar xf gsl-release-2-8.tar.gz
$ cd gsl-release-2-8
$ ./autogen.sh
$ ./configure
$ make
$ export LD_LIBRARY_PATH=$HOME/tools/gsl/gsl-release-2-8/.libs:$HOME/tools/gsl/gsl-release-2-8/cblas/.libs
$
$ gcc -I$HOME/tools/gsl/gsl-release-2-8 x.c -L$HOME/tools/gsl/gsl-release-2-8/.libs -L$HOME/tools/gsl/gsl-release-2-8/cblas/.libs -lgsl -lgslcblas -lm
$ ldd a.out # check libs
$ tpl x.pl
?-
# check libs
$ ps ax|grep tpl
72213 ....
$ lsof -p 72213 | grep gsl
Thanks, fixed my problems there.
On Fri, Jul 26, 2024 at 8:54 AM flexoron @.***> wrote:
Yes, you don't have to install it, just make:
https://www.gnu.org/software/gsl/ $ tar xf gsl-release-2-8.tar.gz $ cd gsl-release-2-8 $ ./autogen.sh $ ./configure $ make $ export LD_LIBRARY_PATH=$HOME/tools/gsl/gsl-release-2-8/.libs:$HOME/tools/gsl/gsl-release-2-8/cblas/.libs $ $ $ gcc -I$HOME/tools/gsl/gsl-release-2-8 x.c -L$HOME/tools/gsl/gsl-release-2-8/.libs -L$HOME/tools/gsl/gsl-release-2-8/cblas/.libs -lgsl -lgslcblas -lm $ ldd a.out # check libs $ tpl x.pl ?-
check libs
$ ps ax|grep tpl 72213 .... $ lsof -p 72213 | grep gsl
— Reply to this email directly, view it on GitHub https://github.com/trealla-prolog/trealla/issues/573#issuecomment-2251531872, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFNKSEV5OEQIKYCGHJSCQTLZOF6Z5AVCNFSM6AAAAABLOIBUKSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENJRGUZTCOBXGI . You are receiving this because you commented.Message ID: @.***>
Interesting, if in lu.c / gsl_linalg_LU_decomp() I change:
gsl_vector_uint * ipiv = gsl_vector_uint_alloc(minMN);
to
gsl_vector_uint * ipiv = gsl_vector_uint_calloc(minMN);
it works. Seems a bug in GSL 2.8 to me, depending on uninitialized values.
Not sure, the C-program does not crash and for example:
$ tpl x.pl
?- between(2,128,S), tst(S,S), fail. % does not crash
?- halt.
$ tpl x.pl
?- between(124,128,S), tst(S,S), fail. % crash
FF pivi 14 i 14
FF pivi 15 i 15
FF pivi 368989664 i 16
Segmentation fault (core dumped)
The C program is simple and is allocating virgin zeroed memory from the OS. In Trealla a lot mallocs & frees have been done.
On Fri, Jul 26, 2024 at 11:31 AM flexoron @.***> wrote:
Not sure, the C-program does not crash and for example:
$ tpl x.pl ?- between(2,128,S), tst(S,S), fail. % does not crash ?- halt.
$ tpl x.pl ?- between(124,128,S), tst(S,S), fail. % crash FF pivi 14 i 14 FF pivi 15 i 15 FF pivi 368989664 i 16 Segmentation fault (core dumped)
— Reply to this email directly, view it on GitHub https://github.com/trealla-prolog/trealla/issues/573#issuecomment-2251786939, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFNKSEXYKFQPSLFSKUWAAY3ZOGRI3AVCNFSM6AAAAABLOIBUKSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENJRG44DMOJTHE . You are receiving this because you commented.Message ID: @.***>
tpl with GSL v2.7.1 does not crash, too: v2.7.1: gsl_vector_uint * ipiv = gsl_vector_uint_alloc(minMN);
You don't know what else changed that accesses that memory, maybe nothing but ...
On Fri, Jul 26, 2024 at 11:37 AM flexoron @.***> wrote:
tpl with GSL v2.7.1 does not crash, too: v2.7.1: gsl_vector_uint * ipiv = gsl_vector_uint_alloc(minMN);
— Reply to this email directly, view it on GitHub https://github.com/trealla-prolog/trealla/issues/573#issuecomment-2251791593, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFNKSEQS5TOG34EODYPAKJDZOGR4HAVCNFSM6AAAAABLOIBUKSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENJRG44TCNJZGM . You are receiving this because you commented.Message ID: @.***>
Added this to that function
gsl_vector_uint * ipiv = gsl_vector_uint_alloc(minMN);
+ for (unsigned i = 0; i < ipiv->size; i++)
+ printf("*** [%u] %u\n", (unsigned)i, (unsigned)ipiv->data[i]);
and it shows all zero elements for x.c & some non-zero ones for x.pl
GSL 2.8 is buggy. It was only released a few months ago.
Yes but this way: non-zero,zero,no-zero,zero,non-zero,zero, or 32,0,64,0,55,0,and so on Hmm what now. v2.7.1 is buggy too: result '-nan' is not correct. gsl_permutation_calloc/2 does not change it, too
printf("FF pivi %u i %lu\n",pivi,i); if (p->data[pivi] != p->data[i]) it is the index value of pivi data[pivi] pivi 456910304 = data[456910304]
LU_decomp_L3() & apply_pivots() can't work if the vector ipiv wasn't zeroed properly, so pivi can be incorrect.
I would suggest submitting a bug report. I did one for GCC about 20 years ago, took about 6 months to get acted on.
: gsl_matrix_calloc(Rows, Cols, M), gsl_permutation_alloc(Rows, P), write(M/P),nl, : 968671312 / 969277808 FF pivi 969168480 i 16 Segmentation fault (core dumped)
You see this? The values of M and P and this pivi value are always too close. I don't think that this is by chance.
The memory can contain anything. If you dump the newly alloc'd vector it contains rubbish, which is useless as it's used without ever being initialized.
On Fri, Jul 26, 2024 at 12:14 PM flexoron @.***> wrote:
: gsl_matrix_calloc(Rows, Cols, M), gsl_permutation_alloc(Rows, P), write(M/P),nl, : 968671312 / 969277808 FF pivi 969168480 i 16 Segmentation fault (core dumped)
You see this? The values of M and P and this pivi value are always too close. I don't think that this is by chance.
— Reply to this email directly, view it on GitHub https://github.com/trealla-prolog/trealla/issues/573#issuecomment-2251836514, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFNKSEQ6BIZ73SRTQ37EIPDZOGWHRAVCNFSM6AAAAABLOIBUKSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENJRHAZTMNJRGQ . You are receiving this because you commented.Message ID: @.***>
I'm talking about this...
gsl_vector_uint * ipiv = gsl_vector_uint_alloc(minMN);
+ for (unsigned i = 0; i < ipiv->size; i++)
+ printf("*** [%u] %u\n", (unsigned)i, (unsigned)ipiv->data[i]);
if the values are non-zero (which they are at some points) you are bound to get errors when they are used as an array index.
?- tst(40,40). 21852240/21852304 FF pivi 22458688 i 24 Segmentation fault (core dumped)
Again pivi is nearby M and P
?- tst(127,127). 362111056/363287184 FF pivi 362608224 i 16 Segmentation fault (core dumped)
I don't know what you want me to do... gsl_vector_uint_alloc returns unzeroed data (by design) when the following code expected zeroed data.
Conclusion: Both versions are buggy and your finding might be a solution to work with 2.8 gsl_vector_uint * ipiv = gsl_vector_uint_calloc(minMN);
URL: https://savannah.gnu.org/bugs/?66026
Summary: gsl_linalg_LU_decomp using uninitialized memory
Group: GNU Scientific Library
Submitter: infradig
Submitted: Fri 26 Jul 2024 02:56:44 AM UTC
Category: Runtime error
Severity: 3 - Normal
Operating System: Ubuntu
Status: None
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Release: 2.8
Here we are
$ cat x.c
:
gsl_matrix *mat = gsl_matrix_calloc(rows, cols);
gsl_permutation *perm = gsl_permutation_alloc(rows);
{
int i, j;
gsl_matrix *m = gsl_matrix_alloc(rows, cols);
for (i = 0; i < rows; i++)
for (j = 0; j < cols; j++)
gsl_matrix_set(m, i, j, 2000000000);
gsl_matrix_free(m);
}
stat = gsl_linalg_LU_decomp(mat, perm, &sig);
det = gsl_linalg_LU_det(mat, sig);
:
$ gcc ...
$ ./a.out 127 127
v2.8 127x127
:
FF pivi 15 i 15
FF pivi 0 i 16
FF pivi 1105055077 i 17
Segmentation fault (core dumped)
$
fyi
More info for if those gsl guys start a discussion:
1.) ./a.out 128 128 or 256 256 or 512 512 or 1024 1024 does not crash,
so you don't always get the fault.
2.) The problem might be the status of a singular matrix:
status = LU_decomp_L3 (&AL.matrix, ipiv);
If not singular then status = 0 and all fields of vector ipiv are valid(have been processed).
if singular then status = number of fields which have been processed.
For example: singular 127x127 status = 16
means that this is wrong:
for (i = 0; i < minMN; ++i) { # the segfault area
Better is this:
size_t damage = minMN;
if (status) damage = status;
for (i = 0; i < damage; ++i) {
}
Even better (perhaps)
if (status) { ; /* singular, do nothing */
} else {
for (i = 0; i < minMN; ++i) {
:
}
gsl_vector_uint_free(ipiv);
return status;
Performance back: no calloc and in case of singular dismiss signum as valueless.
Buggy gsl 2.7 does not crash because it handles a singular matrix as if it is non-singular:
status = LU_decomp_L3 (&AL.matrix, ipiv);
status is always 0 (vector fields are 'filled' according to alloc 127 for example).
Buggy gsl 2.8 stops when it is clear that its a singular matrix and returns a value >= 1.
status = LU_decomp_L3 (&AL.matrix, ipiv);
status == 16 (for example), fields(-index) > 16 stay untouched hence the crash(see above).
I'd copy/paste this but maybe you should create an account an submit a comment.
On Sun, 28 July 2024, 01:04 flexoron, @.***> wrote:
fyi More info for if those gsl guys start a discussion:
1.) ./a.out 128 128 or 256 256 or 512 512 or 1024 1024 does not crash, so you don't always get the fault.
2.) The problem might be the status of a singular matrix:
status = LU_decomp_L3 (&AL.matrix, ipiv);
If not singular then status = 0 and all fields of vector ipiv are valid(have been processed).
if singular then status = number of fields which have been processed.
For example: singular 127x127 status = 16 means that this is wrong: for (i = 0; i < minMN; ++i) { # the segfault area
Better is this: if (!status) status = minMN; for (i = 0; i < status; ++i) {
Even better (perhaps) if (status) { ; / singular, do nothing / } else { for (i = 0; i < minMN; ++i) { : } gsl_vector_uint_free(ipiv); return status;
Performance back: no calloc and in case of singular dismiss signum as valueless.
— Reply to this email directly, view it on GitHub https://github.com/trealla-prolog/trealla/issues/573#issuecomment-2254173125, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFNKSEXXYCHJTEBRGCDP7HDZOOZIJAVCNFSM6AAAAABLOIBUKSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDENJUGE3TGMJSGU . You are receiving this because you modified the open/close state.Message ID: @.***>
x.pl
Instead of
gsl_matrix_calloc(Rows, Cols, M), % Fixed: -.0nan now prints -nan (see Fix float nan-value printing)
do
mat_random(M, Rows, Cols), % inf-value printing need a fix: -.0inf
?- tst(1200,1200).
-.0inf
?-