Open crl123 opened 4 years ago
Hi @crl123,
This means that Task Bench is computing the wrong result. I'm a little confused, because I thought the Regent implementation was fully debugged.
I'm not expecting this to make a difference, but can you confirm what Task Bench branch/tag you're on?
I'll try to confirm on my end as well.
I'm on the 'origin/master' branch. I updated the repository on my local machine this Sunday.
Ok, I'm a bit swamped with things going on this week, but I'll try to find time to verify the Regent implementation on my own machine.
Sorry for taking so long to get back to this.
Looking back at your configuration here, I don't see any settings for the network. Typically you'd use something like:
export USE_GASNET=1
export CONDUIT=aries
Otherwise what you're doing is running N copies of the single-node program. Which is probably why this is misbehaving.
Hi @elliottslaughter. I have a further question about multi-node benchmarks. Using GASNet the way you explained on a cluster with two nodes (udp conduit) creates double the number of tasks in the graph; half of the tasks are run by node 1 and the other half by node 2. Is that the expected behaviour? Or is there a way to have the tasks split between the nodes? E.g., given a 10x10 stencil graph, the 100 tasks would be split between the two nodes.
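To make the arithmetic in the question concrete, here is a minimal sketch of the two behaviours being contrasted. It is only an illustration: the function names and the contiguous block split of columns are assumptions made for this example, not Task Bench's or Regent's actual mapping.

```python
def replicated(width, steps, nodes):
    """Every node runs its own full copy of the width x steps graph."""
    return {node: width * steps for node in range(nodes)}


def partitioned(width, steps, nodes):
    """One graph; columns are split into contiguous blocks, one per node.

    Hypothetical block split, chosen only for illustration.
    """
    counts = {}
    for node in range(nodes):
        first = node * width // nodes        # first column owned by this node
        last = (node + 1) * width // nodes   # one past the last owned column
        counts[node] = (last - first) * steps
    return counts


rep = replicated(10, 10, 2)
part = partitioned(10, 10, 2)
print("replicated: ", sum(rep.values()), "tasks in total,", rep)
print("partitioned:", sum(part.values()), "tasks in total,", part)
```

With a 10x10 graph on two nodes, the replicated case totals 200 tasks (100 per node), while the partitioned case totals 100 tasks (50 per node).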
@ysfess22 Please submit this as a new issue unless it's specifically related to the original posting.
The answer will depend on how you have configured your system, and I will require more information, which will clog this thread if it's not specifically related.
Good afternoon,

I am running Regent on my cluster of 9 nodes with the following parameters:

mpirun -np 9 -ppn 1 ./TaskBench/task-bench/regent/main.shard14 -steps 10 -type fft -kernel compute_bound -iter 1000000

And it is giving me the following problem:

main.shard14: core.cc:588: void TaskGraph::execute_point(long int, long int, char*, size_t, const char*, const size_t, size_t, char*, size_t) const: Assertion `input[i].second == dep' failed.
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 6
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)

And sometimes the following problem:

main.shard14: core.cc:565: void TaskGraph::execute_point(long int, long int, char*, size_t, const char*, const size_t, size_t, char*, size_t) const: Assertion `offset <= point && point < offset+width' failed.
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 6
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)

I have the same problem when I use the tree type, but when I use the stencil_1d type I don't have the problem.

I compile Regent as follows:

DEFAULT_FEATURES=0 USE_REGENT=1 ./get_deps.sh
export CXX=mpicxx
export CC=mpicc
./build_all.sh

Thank you in advance for your help,