Open MEI1993 opened 6 years ago

I use testMMdoubleMPIDist64 and modified the program to print the result vector x, but the results are wrong. I use one process and one thread. Can you tell me how to get the matrix output? Thank you.
We currently don't test the 64-bit interface regularly. I just fixed a bug; perhaps this solves your issue: https://github.com/pghysels/STRUMPACK/commit/f2c7df90bcf950a750a0f44da9f2239fc0ac0d83
What do you mean by 'the matrix output'? The code computes an LU factorization of the sparse input matrix, but we do not expose the L and U factors to the user. We only provide an interface to solve a linear system with the factored matrix.
When I solve Ax=b I want to get x, but x is wrong. I applied the testMMdoubleMPIDist64 fix, and x is still wrong. When I use testMMdouble64, x is correct. I think the way I get x in testMMdoubleMPIDist64 may be wrong.
For testMMdoubleMPIDist64 (and testMMdoubleMPIDist) the sparse matrix A is distributed over the different processes. Each process gets a block of the rows. The vectors b and x should also be distributed over the processes with the same distribution (same number of rows per process). So x and b on each process only contain the local parts. This distribution is described by Adist->local_rows().
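For example, a minimal sketch of how each rank sets up its local parts of b and x (Adist here is a CSRMatrixMPI, as in the test drivers):

auto n_local = Adist.local_rows();   // number of matrix rows owned by this rank
std::vector<double> b(n_local, 1.);  // local part of the right-hand side
std::vector<double> x(n_local, 0.);  // local part of the solution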
What kind of matrix are you using? Have you tried with the small matrix included with strumpack (pde900.mtx)?
I am using the matrix { (1,1)=2, (2,2)=2, (3,3)=3, (4,4)=4, (5,5)=5 } and the default b = 1, but the result is (1,1,1,1,1). This is a 5x5 matrix. I know that testMMdoubleMPIDist64 distributes the matrix over the processes, so I only use one process. I don't know whether the result is affected when using only one process. Maybe I should look at Adist->local_rows(). Thank you very much.
I am trying to change it, but the result is still wrong. Can you tell me which MPI version you use? I guess it may be a problem with my build environment; I compiled the program on Tianhe. Looking forward to your reply. Thank you.
Here is a complete example, which works for me:

#include <iostream>
#include <vector>
#include "StrumpackSparseSolverMPIDist.hpp"
#include "sparse/CSRMatrix.hpp"
#include "sparse/CSRMatrixMPI.hpp"
using namespace strumpack;

int main(int argc, char* argv[]) {
  int rank, P;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &P);
  {
    // 5x5 diagonal test matrix in CSR format
    CSRMatrix<double,int64_t> A(5, 5);
    int64_t* ptr = A.get_ptr();
    int64_t* ind = A.get_ind();
    double* val = A.get_val();
    val[0] = 1; ind[0] = 0; ptr[0] = 0;
    val[1] = 2; ind[1] = 1; ptr[1] = 1;
    val[2] = 3; ind[2] = 2; ptr[2] = 2;
    val[3] = 4; ind[3] = 3; ptr[3] = 3;
    val[4] = 5; ind[4] = 4; ptr[4] = 4;
    ptr[5] = 5;
    if (!rank) A.print_dense("A");
    // distribute the matrix A; the last argument should be true if A is
    // only available at the root, but in this case it is available on
    // all ranks
    CSRMatrixMPI<double,int64_t> Adist(&A, MPI_COMM_WORLD, false);
    StrumpackSparseSolverMPIDist<double,int64_t> spss(MPI_COMM_WORLD);
    spss.options().set_mc64job(0);
    spss.options().set_from_command_line(argc, argv);
    auto n_local = Adist.local_rows();
    std::vector<double> b(n_local, 1.), x(n_local, 0.);
    spss.set_distributed_csr_matrix
      (Adist.local_rows(), Adist.get_ptr(), Adist.get_ind(),
       Adist.get_val(), Adist.get_dist().data(),
       Adist.has_symmetric_sparsity());
    spss.solve(b.data(), x.data());
    // print the distributed x in rank order
    if (!rank) std::cout << "x = [" << std::flush;
    for (int p=0; p<rank; p++) MPI_Barrier(MPI_COMM_WORLD);
    for (auto xi : x) std::cout << xi << " " << std::flush;
    for (int p=rank; p<P+1; p++) MPI_Barrier(MPI_COMM_WORLD);
    if (!rank) std::cout << "];" << std::endl;
    auto scaled_res = Adist.max_scaled_residual(x.data(), b.data());
    if (!rank)
      std::cout << "# COMPONENTWISE SCALED RESIDUAL = "
                << scaled_res << std::endl;
  }
  scalapack::Cblacs_exit(1);
  MPI_Finalize();
  return 0;
}
I get the following output:

A = [1 0 0 0 0 ;
0 2 0 0 0 ;
0 0 3 0 0 ;
0 0 0 4 0 ;
0 0 0 0 5 ;
];
...
x = [1 0.5 0.333333 0.25 0.2 ];
# COMPONENTWISE SCALED RESIDUAL = 0
This program gives the right result in my environment. Thank you.
I want to solve a complex-valued matrix. I tried to change the above program for a complex matrix but failed:

A = [1+1i 0 0 0 0 ;
0 2+2i 0 0 0 ;
0 0 3+3i 0 0 ;
0 0 0 4+4i 0 ;
0 0 0 0 5+5i ;
];

Can you tell me how to change the above program for this A? Looking forward to your reply. Thank you.
Add #include <complex>, change double to std::complex<double>, and change val[0] = 1.; to val[0] = std::complex<double>(1., 1.); etc.
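A minimal sketch of just the lines that change (everything else stays as in the program above):

#include <complex>

// complex-valued matrix, same sparsity pattern as before
CSRMatrix<std::complex<double>,int64_t> A(5, 5);
std::complex<double>* val = A.get_val();
val[0] = std::complex<double>(1., 1.); ind[0] = 0; ptr[0] = 0;
val[1] = std::complex<double>(2., 2.); ind[1] = 1; ptr[1] = 1;
// ... and similarly for val[2] through val[4]
CSRMatrixMPI<std::complex<double>,int64_t> Adist(&A, MPI_COMM_WORLD, false);
StrumpackSparseSolverMPIDist<std::complex<double>,int64_t> spss(MPI_COMM_WORLD);
std::vector<std::complex<double>> b(n_local, 1.), x(n_local, 0.);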
I have changed it. Thank you.
When I use this program to test parallel efficiency with a fixed total number of cores, I find that increasing the number of OpenMP threads makes the program run longer. The time is shortest when I use one thread per process and n processes (on n cores). This means pure MPI beats MPI+OpenMP. Is this normal?
Yes, using only MPI can often be slightly faster than using MPI+OpenMP for the same total number of cores. However, running with fewer MPI processes typically uses less memory.
Thank you for your reply. But I have a new problem; can you help me with it as well? When I use HSS compression, the result does not converge. Maybe my modification is wrong.
I added spss.options().enable_HSS() to the program. For example:
StrumpackSparseSolverMPIDist<double,int64_t> spss(MPI_COMM_WORLD);
spss.options().set_mc64job(0);
spss.options().set_from_command_line(argc, argv);
spss.options().enable_HSS();
auto n_local = Adist.local_rows();
std::vector<double> b(n_local, 1.), x(n_local, 0.);
This is what I tried, based on the User's Guide; maybe it is wrong. The result is right when I don't use HSS compression. Can you tell me the right way to use HSS compression, or are some matrices not suitable for it? Thank you very much. Looking forward to your reply.
Try changing the compression tolerance:
spss.options().HSS_options().set_rel_tol(1e-4);
or on the command line:
--hss_rel_tol 1e-4
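In the code this would look like the following (a minimal sketch; 1e-4 is just a value to experiment with):

spss.options().enable_HSS();                     // enable HSS compression
spss.options().HSS_options().set_rel_tol(1e-4);  // relative compression tolerance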
Hi:
These days I have been trying to use the distributed input interface, but there are some problems. I solve Ax=b with
A = [1 0 0 0 0 ;
0 2 0 0 0 ;
0 0 3 0 0 ;
0 0 0 4 0 ;
1 0 0 0 5 ;
];
and b = 1. The result was right before, but now I get the error "cn746: task 0: Segmentation fault".
My use of the interface may be wrong. This is my program. Can you tell me the right distributed input interface? Thank you very much.
int main(int argc, char* argv[]) {
  int rank, P, thread_level;
  /*----------------------------*/
  int nnnum;
  int a = 0, b = 0;
  // bool is_complex = true;
  double ival, rval;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &thread_level);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &P);
  {
    int64_t* dist = new int64_t[P+1];
    int64_t* ptr = new int64_t[local_n+1];
    if(rank==0)
      ptr[0]=0; ptr[1]=1; ptr[2]=2;
    if(rank==1)
      ptr[0]=0; ptr[1]=1; ptr[2]=2; ptr[3]=4;
    int64_t local_nnz = ptr[local_n] - ptr[0];
    int64_t* ind = new int64_t[local_nnz];
    double* val = new double[local_nnz];
    if(rank==0) {
      val[0] = 1; ind[0] = 0;
      val[1] = 2; ind[1] = 1;
    }
    if(rank==1) {
      val[0] = 3; ind[0] = 2;
      val[1] = 4; ind[1] = 3;
      val[2] = 1; ind[2] = 0;
      val[3] = 5; ind[3] = 4;
    }
    StrumpackSparseSolverMPIDist<double,int64_t> spss(MPI_COMM_WORLD);
    spss.options().set_mc64job(0);
    spss.options().set_from_command_line(argc, argv);
    spss.options().set_reordering_method(ReorderingStrategy::PARMETIS);
    auto n_local = Adist.local_rows();
    std::vector<double> b(n_local, 1.), x(n_local, 0.);
    spss.set_distributed_csr_matrix
      (Adist.local_rows(), Adist.get_ptr(), Adist.get_ind(),
       Adist.get_val(), Adist.get_dist().data(), Adist.has_symmetric_sparsity());
    spss.solve(b.data(), x.data());
    if (!rank) std::cout << "x = [" << std::flush;
    for (int p=0; p<rank; p++) MPI_Barrier(MPI_COMM_WORLD);
    for (auto xi : x) std::cout << xi << " " << std::flush;
    for (int p=rank; p<P+1; p++) MPI_Barrier(MPI_COMM_WORLD);
    if (!rank) std::cout << "];" << std::endl;
    auto scaled_res = Adist.max_scaled_residual(x.data(), b.data());
    if (!rank)
      std::cout << "# COMPONENTWISE SCALED RESIDUAL = "
                << scaled_res << std::endl;
  }
  Cblacs_exit(1);
  MPI_Finalize();
  return 0;
}
Maybe it's a copy/paste issue, but you write:
if(rank==0)
ptr[0]=0; ptr[1]=1; ptr[2]=2;
if(rank==1)
ptr[0]=0; ptr[1]=1; ptr[2]=2; ptr[3]=4;
I think that should be:
if(rank==0) {
ptr[0]=0; ptr[1]=1; ptr[2]=2;
}
if(rank==1) {
ptr[0]=0; ptr[1]=1; ptr[2]=2; ptr[3]=4;
}
Then it runs fine for me, no segfaults. If you still get segfaults, try using gdb, or valgrind, and let me know.
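For reference, here is a minimal self-contained sketch of the distributed input for this example (a reconstruction, not your exact code: the local_n and dist values are implied by the ptr/ind/val arrays above, and the raw local arrays are passed to set_distributed_csr_matrix instead of the Adist object left over from the earlier replicated-input example):

#include <iostream>
#include <vector>
#include "StrumpackSparseSolverMPIDist.hpp"
using namespace strumpack;

int main(int argc, char* argv[]) {
  int rank, P, thread_level;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &thread_level);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &P);   // run this with exactly 2 MPI ranks
  {
    // rank 0 owns rows 0-1, rank 1 owns rows 2-4 of the 5x5 matrix
    int64_t local_n = (rank == 0) ? 2 : 3;
    std::vector<int64_t> dist = {0, 2, 5};  // global row offsets, one per rank plus end
    std::vector<int64_t> ptr, ind;
    std::vector<double> val;
    if (rank == 0) {
      ptr = {0, 1, 2};                               // local row pointers
      ind = {0, 1};       val = {1., 2.};            // rows 0 and 1 (global column indices)
    } else {
      ptr = {0, 1, 2, 4};
      ind = {2, 3, 0, 4}; val = {3., 4., 1., 5.};    // rows 2, 3 and 4
    }
    StrumpackSparseSolverMPIDist<double,int64_t> spss(MPI_COMM_WORLD);
    spss.options().set_mc64job(0);
    spss.options().set_from_command_line(argc, argv);
    spss.set_distributed_csr_matrix
      (local_n, ptr.data(), ind.data(), val.data(), dist.data(),
       false);  // pattern is not symmetric: A(4,0) != 0 but A(0,4) == 0
    std::vector<double> b(local_n, 1.), x(local_n, 0.);
    spss.solve(b.data(), x.data());
  }
  scalapack::Cblacs_exit(1);
  MPI_Finalize();
  return 0;
}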
I'm sorry for making such a basic mistake and wasting your precious time. I have modified the program and the result is now right, but the program prints the following:
Cannot bisect a graph with 0 vertices!
You are trying to partition a graph into too many parts!
Cannot bisect a graph with 0 vertices!
You are trying to partition a graph into too many parts!
# initial matrix:
# - number of unknowns = 5
# - number of nonzeros = 7
It displays 7 nonzeros, but in fact there are only six. Maybe it symmetrizes the matrix pattern for the reordering? Is this output normal, or does my program still have a problem?
Yes, our solver first makes the pattern of the matrix structurally symmetric, so it adds one extra (numerically zero) element at position (1,5) to match the nonzero at (5,1).
The message about bisecting a graph with 0 vertices comes from metis, a third party library used for the fill-reducing reordering. In this case, where the matrix is very small, it's not really needed. You can specify
--sp_reordering_method natural
to disable the reordering.
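Or equivalently in the code, using the same set_reordering_method call as above (NATURAL should be the corresponding ReorderingStrategy enum value):

spss.options().set_reordering_method(ReorderingStrategy::NATURAL);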
Thank you very much. I understand it now; I will study the matrix reordering theory. Thank you.