GMAP / NPB-CPP

The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures

When built with CLASS=E, the programs cannot be executed; a segmentation fault occurs. #3

Open luckyq opened 10 months ago

luckyq commented 10 months ago

The affected benchmarks are CG, MG, and LU. I didn't try the others.

gabriellaraujocoding commented 10 months ago

Hi.

Class E has a very large workload. Your machine may not have enough resources to execute it.

Could you try to compile and run the EP benchmark with Class E? The EP benchmark uses less memory.

sherrywong1220 commented 8 months ago

I ran into a similar issue with CLASS=E: object sizes become large enough to overflow int. To fix this, I switched the affected types to uint64_t, which means changing all index-related variables and allocation-size constants from int to uint64_t. Below is an example of these changes:

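#include <cstdint> // for uint64_t

// Before: #define NZ (NA*(NONZER+1)*(NONZER+1))
// After: each factor is widened so the product is computed in 64 bits
#define NZ (static_cast<uint64_t>(NA) * static_cast<uint64_t>(NONZER + 1) * static_cast<uint64_t>(NONZER + 1))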

static void makea(uint64_t n,
        uint64_t nz,
        double a[],
        uint64_t colidx[],
        uint64_t rowstr[],
        uint64_t firstrow,
        uint64_t lastrow,
        uint64_t firstcol,
        uint64_t lastcol,
        uint64_t arow[],
        uint64_t acol[][NONZER+1],
        double aelt[][NONZER+1],
        uint64_t iv[]);
static void sparse(double a[],
        uint64_t colidx[],
        uint64_t rowstr[],
        uint64_t n,
        uint64_t nz,
        uint64_t nozer,
        uint64_t arow[],
        uint64_t acol[][NONZER+1],
        double aelt[][NONZER+1],
        uint64_t firstrow,
        uint64_t lastrow,
        uint64_t nzloc[],
        double rcond,
        double shift);
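
To see why the cast in NZ matters, the sketch below evaluates the same product in 64-bit arithmetic and checks whether it would fit in an int. The values NA = 9000000 and NONZER = 26 are assumed here for CG Class E and should be checked against the parameter header generated for your build; the names nz_wide and fits_in_int are hypothetical, and int is assumed to be 32 bits wide.

```cpp
#include <cstdint>
#include <limits>

// Assumed CG Class E parameters (not taken from this thread).
constexpr int NA = 9000000;
constexpr int NONZER = 26;

// Widen each operand before multiplying so the product is computed in 64 bits.
constexpr std::uint64_t nz_wide(int na, int nonzer) {
    return static_cast<std::uint64_t>(na)
         * static_cast<std::uint64_t>(nonzer + 1)
         * static_cast<std::uint64_t>(nonzer + 1);
}

// True when a count can be stored in a plain int without overflow.
constexpr bool fits_in_int(std::uint64_t count) {
    return count <= static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}

// 9000000 * 27 * 27 = 6561000000, which exceeds a 32-bit int's maximum.
static_assert(nz_wide(NA, NONZER) == 6561000000ULL, "9000000 * 27 * 27");
static_assert(!fits_in_int(nz_wide(NA, NONZER)), "assumes a 32-bit int");
```

Under these assumptions the nonzero count is roughly three times the largest 32-bit int, so any int that holds NZ or an index up to NZ would overflow.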

HodBadichi commented 3 months ago

@sherrywong1220 This sets off a chain reaction of changes that ultimately results in failure (the norm becomes NaN after some time). Could you please share all the necessary modifications?
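
Whether this is what causes the NaN here is not established in the thread, but one known pitfall of converting signed indices to uint64_t wholesale is that subtractions and descending loops that relied on negative values now wrap around. The sketch below contrasts the two behaviors; the function names are hypothetical and the code is not taken from NPB-CPP.

```cpp
#include <cstdint>

// Signed 64-bit: an offset below the base comes out negative, so a
// bounds check such as `offset >= 0` still works.
inline std::int64_t offset_signed(std::int64_t col, std::int64_t firstcol) {
    return col - firstcol;
}

// Unsigned 64-bit: the same subtraction wraps to a huge positive value.
inline std::uint64_t offset_unsigned(std::uint64_t col, std::uint64_t firstcol) {
    return col - firstcol;
}

// Counts iterations of a descending loop written with a signed index.
// With an unsigned index, `i >= 0` would always be true and the loop
// would never terminate.
inline std::int64_t count_down_signed(std::int64_t n) {
    std::int64_t iterations = 0;
    for (std::int64_t i = n - 1; i >= 0; --i) {
        ++iterations;
    }
    return iterations;
}
```

A signed 64-bit type such as std::int64_t widens the range without changing how negative intermediate values behave, which may make the conversion less invasive than moving to an unsigned type.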

gabriellaraujocoding commented 2 months ago

Hello and thank you for your report @luckyq, @sherrywong1220, and @HodBadichi.

Indeed, NPB-CPP is facing overflow issues when dealing with larger workloads like class E.

We resolved this problem in the IS benchmark by introducing an integer type that is selected by a preprocessor macro, as shown in the following code snippet:

// define the integer type according to the problem class
#if CLASS == 'D'
    typedef long INT_TYPE;
#else
    typedef int INT_TYPE;
#endif

// use that type throughout the code
void rank(int iteration) {
    INT_TYPE i, k;
    INT_TYPE *key_buff_ptr, *key_buff_ptr2;
    ...
}

So you need to change the type of every variable that can overflow to a class-dependent type such as INT_TYPE.
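
One way to extend the IS-style INT_TYPE selection to Class E is sketched below. It assumes CLASS is defined as a character constant at compile time, as in the IS snippet; the fallback definition of CLASS, the use of std::int64_t instead of long, and the helper total_elements are illustrative choices, not the project's fix.

```cpp
#include <cstdint>

// Assumption: the build normally defines CLASS; this fallback only
// keeps the sketch self-contained.
#ifndef CLASS
#define CLASS 'E'
#endif

// Select a 64-bit index type for the large classes, plain int otherwise.
#if CLASS == 'D' || CLASS == 'E'
typedef std::int64_t INT_TYPE;
#else
typedef int INT_TYPE;
#endif

// Hypothetical helper: multiply two sizes in INT_TYPE so the product
// is computed at the selected width rather than in int.
inline INT_TYPE total_elements(int rows, int per_row) {
    return static_cast<INT_TYPE>(rows) * static_cast<INT_TYPE>(per_row);
}
```

std::int64_t is used rather than long because long is only 32 bits wide on some platforms, such as 64-bit Windows. Smaller classes keep plain int, so their memory footprint is unchanged.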

We will provide these fixes as soon as possible. Until they are available, you can try this approach.