trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/
Other
1.19k stars 566 forks source link

Teuchos: Teuchos_LAPACK_wrappers.hpp inconsistent with lapack ref. implementation #11168

Closed waveman68 closed 9 months ago

waveman68 commented 1 year ago

Bug Report

@trilinos/\<teuchos>

Description

There are numerous inconsistencies between lapack ref. implementation and Teuchos_LAPACK_wrappers.hpp. In each case, there is a difference between lapack.h from OpenBLAS and the wrapper in terms of const missing for in only or const for in:out.

An example is below. I could correct some manually, only to move on to the next. The result is >1000 errors and compilation halted. Is there a better way to proceed?

The Trilinos build was generated with Spack on linux-ubuntu20.04-haswell using gcc-12.2.0 and the following spec:

spack spec trilinos@13.4.0%gcc@12.2.0+amesos+amesos2+anasazi+aztec+boost+epetra+epetraext+explicit_template_instantiation+fortran+hdf5+hypre+ifpack~ifpack2+kokkos+ml+mpi~muelu+mumps+sacado+shared+superlu-dist+tpetra+zoltan

Steps to Reproduce

  1. SHA1: czqyfvp7ppcqdbw6n4qixrqus5qhcast
  2. Configure script: See spack spec above
  3. Configure log: attached, spack-build-01-cmake-out.txt, spack-build-02-build-out.txt
  4. Build log: attached, SW-with-Trilinos-make.log
In file included from /home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/ml_common.h:56,
                 from /home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/ml_include.h:16,
                 from /home/samuel/Documents/repos/femaxx/src/femaxxdriver.cpp:23:
/home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/Teuchos_LAPACK_wrappers.hpp:113:37: error: conflicting declaration of C function 'void dtrtri_(const char*, const char*, const int*, const double*, const int*, int*)'
  113 | #define DTRTRI_F77  F77_BLAS_MANGLE(dtrtri,DTRTRI)
      |                                     ^~~~~~
/home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/ml_config.h:8:37: note: in definition of macro 'F77_BLAS_MANGLE'
    8 |  #define F77_BLAS_MANGLE(name,NAME) name ## _
      |                                     ^~~~
/home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/Teuchos_LAPACK_wrappers.hpp:401:13: note: in expansion of macro 'DTRTRI_F77'
  401 | void PREFIX DTRTRI_F77(Teuchos_fcd, Teuchos_fcd, const int* n, const double* a, const int* lda, int* info);
      |             ^~~~~~~~~~
In file included from /home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/lapack.h:11,
                 from /home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/lapacke.h:37,
                 from /home/samuel/Documents/repos/femaxx/src/colarray/vector.h:22,
                 from /home/samuel/Documents/repos/femaxx/src/colarray/matrix.h:21,
                 from /home/samuel/Documents/repos/femaxx/src/tetmesh/tet.h:18,
                 from /home/samuel/Documents/repos/femaxx/src/tetmesh/tetmesh.h:40,
                 from /home/samuel/Documents/repos/femaxx/src/tetmeshbuilder.h:21,
                 from /home/samuel/Documents/repos/femaxx/src/femaxxdriver.cpp:35:
/home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/lapack.h:13232:37: note: previous declaration 'void dtrtri_(const char*, const char*, const int*, double*, const int*, int*)'
13232 | #define LAPACK_dtrtri LAPACK_GLOBAL(dtrtri,DTRTRI)
      |                                     ^~~~~~
/home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/lapacke_mangling.h:12:39: note: in definition of macro 'LAPACK_GLOBAL'
   12 | #define LAPACK_GLOBAL(lcname,UCNAME)  lcname##_
      |                                       ^~~~~~
/home/samuel/spack/var/spack/environments/femaxx_ng/.spack-env/view/include/lapack.h:13233:6: note: in expansion of macro 'LAPACK_dtrtri'
13233 | void LAPACK_dtrtri(

An example of a correction:

DLASCL_F77, Line 371:

According to:

Netlib

a is an output and should not be const.

void PREFIX DLASCL_F77(Teuchos_fcd, const int* kl, const int* ku, const double* cfrom, const double* cto, const int* m, const int* n, const double* a, const int* lda, int* info);

//CHANGE
void PREFIX DLASCL_F77(Teuchos_fcd, const int* kl, const int* ku, const double* cfrom, const double* cto, const int* m, const int* n, double* a, const int* lda, int* info);

With this change, the error disappeared.

jhux2 commented 1 year ago

@trilinos/teuchos

bartlettroscoe commented 1 year ago

Why are not seeing this in other situations?

waveman68 commented 1 year ago

I am just trying to get an old code base working. Our problem requires complex numbers, but Teuchos does not support complex numbers, e.g. there is no ztrtrs, but there is a LAPACKE_ztrtrs.

The issue comes up when I import lapacke.h directly, which is fairly common for other applications. The Trilinos docs are not so straightforward for someone coming outside of the community and it's not clear to me how to work around this problem.

bartlettroscoe commented 1 year ago

The issue comes up when I import lapacke.h directly, which is fairly common for other applications.

@waveman68, is it possible to partition the code base to not have both the Teuchos_LAPACK_wrappers.hpp and lapacke.h included in the same source files? Otherwise, this sounds like this could be a large refactoring project that might impact other existing outside customers. Otherwise, we need to kick this up to the Trilinos leadership to determine how to handle this.

@ccober6, who is the Trilinos lead for the Teuchos package these days?

hkthorn commented 1 year ago

@waveman68 Thanks for reporting this issue. I see the inconsistencies in the LAPACK interface with respect to the reference implementations. However, Teuchos does support complex numbers and there is a wrapper for ZTRTRS in Teuchos_LAPACK_wrappers.hpp on line 577. Maybe there is something else that I'm missing???

waveman68 commented 1 year ago

@hkthorn For what it's worth, I find Teuchos_LAPACK.hpp for some and Teuchos_LAPACK_wrappers.hpp for other LAPACK routines extremely confusing. I get that you have constraints, but it makes life really complicated for people coming in from outside the community.

I didn't find any complex LAPACK routines (i.e., starting with z) in Teuchos_LAPACK.hpp, so I didn't look in Teuchos_LAPACK_wrappers.hpp.

Examples weren't illuminating in this regard either.

waveman68 commented 1 year ago

@hkthorn @bartlettroscoe These appears to be definitely missing: ZTGSEN & ZGGES in Techos_LAPACK_wrappers.hpp. It is in lapacke.h.

Any reason why?

hkthorn commented 1 year ago

@waveman68 The wrappers are not meant to be exhaustive, they have been implemented as requested by users. If those wrappers would help your work, we can integrate them.

waveman68 commented 1 year ago

@hkthorn

Please add them. I will close the issue.

Thanks

waveman68 commented 1 year ago

Closing with request to add ZTGSEN & ZGGES in Techos_LAPACK_wrappers.hpp

hkthorn commented 1 year ago

@waveman68 I'm going to reopen this issue because it has the most complete summary of what needs to be addressed in the Teuchos LAPACK wrappers.

hkthorn commented 1 year ago

This is also a wrapper that needs to be looked into, as reported by #11187 :

void PREFIX ZGGEV_F77(Teuchos_fcd, Teuchos_fcd, const int *n, std::complex<double> *A, const int *lda, std::complex<double> *B, const int *ldb, std::complex<double> *alpha, double *beta, double *vl, const int *ldvl, double *vr, const int *ldvr, double *work, const int *lwork, double *rwork, int *info);
hkthorn commented 1 year ago

In reviewing the provided log for errors, the following changes are needed in the LAPACK wrappers:

File: Teuchos_LAPACK.hpp/cpp _PTTRF: Argument "d" is changed to MagnitudeType _PTTRS: Argument "d" is changed to MagnitudeType _LATRS: Argument "A" is changed to const _GBCON: Argument "IPIV" is changed to const _PORFS: Argument "a" is changed to const _TRTRI: Argument "a" is changed to nonconst _ORMQR: Argument "A" is changed to const _UNMQR: Argument "A" is changed to const _TGEVC: Arguments "s" and "p" are changed to const _GEQR2, _LASWP, _ORM2R, _UNM2R, _GEQP3: Formatting changes for consistency

File: Teuchos_LAPACK_wrappers.hpp

DGEES: Add space in "sdim" argument [C/Z]PTTRF: Argument "d" is changed to MagnitudeType [C/Z]PTTRS: Argument "d" is changed to MagnitudeType ZGGEV: Arguments "beta", "vl", and "vr" are changed to std::complex _LASCL: Argument "a" is changed to non-const _LATRS: Argument "A" is changed to const _GBCON: Argument "IPIV" is changed to const _PORFS: Argument "a" is changed to const _TRTRI: Argument "a" is changed to nonconst _TGEVC: Arguments "s" and "p" are changed to const _ORMQR: Argument "A" is changed to const _UNMQR: Argument "A" is changed to const _GEQR2, _LASWP, _ORM2R, _UNM2R: Formatting changes for consistency

hkthorn commented 1 year ago

@etphipp FYI, there have been some identified issues in the Teuchos LAPACK wrappers that are used in Stokhos:

stokhos/src/sacado/kokkos/pce/Teuchos_LAPACK_UQ_PCE.hpp stokhos/src/sacado/kokkos/vector/Teuchos_LAPACK_MP_Vector.hpp

This issue will track the corrections to the LAPACK interface in Teuchos.

hkthorn commented 1 year ago

@dridzal @dpkouri FYI, there have been some identified issues in the Teuchos LAPACK wrappers that are used in ROL examples for _PTTRF/_PTTRS. The wrappers for these functions are incorrect for std::complex data types, which may not be of concern now, but might in the future. This issue will track the corrections to the LAPACk interface in Teuchos.

waveman68 commented 1 year ago

Glad to see this progressing.

hkthorn commented 1 year ago

@waveman68 I have addressed the errors in the wrappers from the log file you provided, some others that I found independently, and added the requested methods. Please let me know if this addresses your needs. Thanks!

waveman68 commented 1 year ago

@hkthorn Thanks! Where should I pull from? Master?

cgcgcg commented 1 year ago

@waveman68 As of right now, it looks like the PR has only made it to develop and not yet to master.

github-actions[bot] commented 10 months ago

This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity. If you would like to keep this issue open please add a comment and/or remove the MARKED_FOR_CLOSURE label. If this issue should be kept open even with no activity beyond the time limits you can add the label DO_NOT_AUTOCLOSE. If it is ok for this issue to be closed, feel free to go ahead and close it. Please do not add any comments or change any labels or otherwise touch this issue unless your intention is to reset the inactivity counter for an additional year.

waveman68 commented 10 months ago

What is the status? I think it was merged into master, but I'd like someone with more insight to check.

cgcgcg commented 10 months ago

It's definitely on master by now.