goinea closed this issue 3 years ago
~Please do not attempt to run on erpfen01. The build and run scripts were not designed to work there. Follow the instructions for building and running the job submission scripts on erp14.~ Hold on... this is erp01.
~Would you please paste the commands you used to build and run? Please be sure to include which node the commands were executed on.~
I think you have a string mismatch when doing tag operations. Specifically, this is the error message to pay attention to:
!mds_find_tag(&mesh->tags, name) failed at /tmp/CCNIsmth/spack-stage/spack-stage-pumi-master-mktotssjhxnlmwvuwbaamj6tjyyroydk/spack-src/mds/apfMDS.cc + 421
It can't find the tag you are asking for using the name given.
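A minimal sketch of guarding against this kind of name mismatch. `TagTable` and `findOrCreate` are hypothetical stand-ins written for illustration, not PUMI classes or functions; the idea is only that a lookup by name returns null on a mismatch and that a tag is created once per name. Since the asserted expression in the message is negated (`!mds_find_tag(...)`), it may also be worth checking whether the code creates a tag under a name that already exists.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a mesh tag table; these are NOT PUMI's classes.
struct TagTable {
  std::map<std::string, int> sizes;  // tag name -> number of components

  // Returns a pointer to the stored size, or nullptr when the name is unknown.
  const int* find(const std::string& name) const {
    auto it = sizes.find(name);
    return it == sizes.end() ? nullptr : &it->second;
  }

  // Returns false (instead of aborting) when the name is already taken.
  bool create(const std::string& name, int size) {
    return sizes.emplace(name, size).second;
  }
};

// Look the tag up first and create it only when it is missing.
// Returns the size of the tag that ends up registered under `name`.
int findOrCreate(TagTable& tags, const std::string& name, int size) {
  if (const int* existing = tags.find(name)) return *existing;
  tags.create(name, size);
  return size;
}
```

The tag dump in the job output lists the names that actually exist on the mesh ("coordinates_ver" and "coordinates_edg"), so comparing those strings character by character with the ones used in the code is a quick first check.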
On 2021-04-09 09:00, Cameron Smith wrote:
Please do not attempt to run on erpfen01. The build and run scripts were not designed to work there.
Follow the instructions for building and running the job submission scripts on erp14.
Cameron,
Okay
Best, -- Adam Goines Rensselaer Polytechnic Institute
@goinea I misread the info you posted and gave an incorrect response. Please look on github at the edits; those comments should be more helpful.
Cameron,
I have ssh'ed into erp14, but my error still shows erp09. Photos are attached.
Error:
!mds_find_tag(&mesh->tags, name) failed at /tmp/CCNIsmth/spack-stage/spack-stage-pumi-master-mktotssjhxnlmwvuwbaamj6tjyyroydk/spack-src/mds/apfMDS.cc + 421
srun: error: erp09: task 0: Aborted
=== mesh size and tag info ===
(p0) # local ent: v 49, e 84, f 36, r 0
(p0) # own ent: v 49, e 84, f 36, r 0
mesh shape: "Linear"
tag 0: "coordinates_ver", type 0, size 3
tag 1: "coordinates_edg", type 0, size 3
[erp09:113410] Process received signal
[erp09:113410] Signal: Aborted (6)
[erp09:113410] Signal code: (-6)
[erp09:113410] [ 0] /usr/lib64/libpthread.so.0(+0xf630)[0x7fc34eef0630]
[erp09:113410] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x7fc34eb493d7]
[erp09:113410] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x7fc34eb4aac8]
[erp09:113410] [ 3] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x513045]
[erp09:113410] [ 4] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x47636e]
[erp09:113410] [ 5] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x47b80a]
[erp09:113410] [ 6] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2(main+0x573)[0x465dd3]
[erp09:113410] [ 7] /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fc34eb35555]
[erp09:113410] [ 8] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x466d3f]
[erp09:113410] End of error message
"slurm-17974.out" 24L, 1265C
Best, -- Adam Goines Rensselaer Polytechnic Institute
I see the tag error and will address it, but do you know why the error is still reported on erp01 and erp09 instead of erp14?
The tag error will happen on any of the compute nodes. It appears to be a bug in your code.
Okay, thanks.
Summary
I get a "Signal: Aborted" error when running a program on erp01, and I have not been able to find anything about it on the internet.
Details
This issue appears to involve eight locations.
Code
This is the message that I get:
!mds_find_tag(&mesh->tags, name) failed at /tmp/CCNIsmth/spack-stage/spack-stage-pumi-master-mktotssjhxnlmwvuwbaamj6tjyyroydk/spack-src/mds/apfMDS.cc + 421
srun: error: erp01: task 0: Aborted
=== mesh size and tag info ===
global ent: v 49, e 84, f 36, r 0
(p0) # local ent: v 49, e 84, f 36, r 0
(p0) # own ent: v 49, e 84, f 36, r 0
mesh shape: "Linear"
tag 0: "coordinates_ver", type 0, size 3
tag 1: "coordinates_edg", type 0, size 3
[erp01:95695] Process received signal
[erp01:95695] Signal: Aborted (6)
[erp01:95695] Signal code: (-6)
[erp01:95695] [ 0] /usr/lib64/libpthread.so.0(+0xf630)[0x7fb4bb734630]
[erp01:95695] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x7fb4bb38d3d7]
[erp01:95695] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x7fb4bb38eac8]
[erp01:95695] [ 3] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x513045]
[erp01:95695] [ 4] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x47636e]
[erp01:95695] [ 5] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x47b80a]
[erp01:95695] [ 6] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2(main+0x573)[0x465dd3]
[erp01:95695] [ 7] /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fb4bb379555]
[erp01:95695] [ 8] /gpfs/u/home/FEP5/FEP5gnsd/a2/./build/a2[0x466d3f]
[erp01:95695] End of error message
Have you seen this before? If so, where should I look for a resolution?