devinamatthews / aquarius

Aquarius is a parallel quantum chemistry package built on the Cyclops Tensor Framework which provides high-performance structured tensor operations. Aquarius is primarily focused on iterative methods such as CC, CI, and EOMCC.
BSD 3-Clause "New" or "Revised" License

Task type molecule not found #21

Closed keceli closed 7 years ago

keceli commented 7 years ago

I get the following error when I run aquarius with any of the given test input files.

Running on 1 process with 1 thread

terminate called after throwing an instance of 'std::logic_error'
  what():  Task type molecule not found
Aborted

@solomonik mentioned that he has seen this error before, but couldn't remember the solution. Any ideas? Here is config.log.

devinamatthews commented 7 years ago

I'm not able to reproduce this with Intel 17.0.1 and MPICH 3.2 on either the master or stable branch. You might try inserting this code into Task::createTask (src/task/task.cxx:218):

printf("There are %zu task types:\n", tasks().size());
for (auto& task : tasks()) printf("%s\n", task.first.c_str());

This should at least tell if the problem is that the list of task types is not getting populated.

keceli commented 7 years ago

I got:

There are 0 task types:
terminate called after throwing an instance of 'std::logic_error'
  what():  Task type molecule not found
Aborted

devinamatthews commented 7 years ago

OK, so that means the static functions that add the tasks to the master list aren't getting called. What is the OS version?
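For readers unfamiliar with the pattern being described, here is a minimal sketch of static self-registration. This is not the Aquarius source: `registry`, `Registrar`, `create`, and the dummy factory are hypothetical names chosen for illustration. The only parts taken from this thread are the idea of a master list filled in by static initializers and the wording of the error message.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a task factory; a real factory would build a task object.
using Factory = std::function<int()>;

// Master list of task types. A function-local static is constructed on first use,
// so it is safe to touch from other static initializers.
std::map<std::string, Factory>& registry()
{
    static std::map<std::string, Factory> r;
    return r;
}

// Constructing a Registrar at namespace scope adds one entry before main() runs,
// provided the toolchain actually runs the initializer.
struct Registrar
{
    Registrar(const std::string& name, Factory f)
    {
        bool inserted = registry().emplace(name, std::move(f)).second;
        assert(inserted && "duplicate task type");
        (void)inserted;
    }
};

// In a real project each task type would do this in its own translation unit.
static Registrar register_molecule("molecule", [] { return 1; });

// Lookup that fails the same way as the error reported in this issue.
int create(const std::string& name)
{
    auto it = registry().find(name);
    if (it == registry().end())
        throw std::logic_error("Task type " + name + " not found");
    return it->second();
}
```

If the initializer for `register_molecule` is never run, `registry()` stays empty and `create("molecule")` throws, which matches the "There are 0 task types" output followed by the `std::logic_error`.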

keceli commented 7 years ago

Thanks Devin for helping out. config.log is attached to the first post.

13:50:31|b452|test> uname -a
Linux b452 2.6.32-696.3.2.el6.x86_64 #1 SMP Tue Jun 20 01:26:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

devinamatthews commented 7 years ago

Also, could you attach the output of nm src/input/molecule.o | c++filt and nm bin/aquarius | c++filt?

devinamatthews commented 7 years ago

The distro version could be relevant as well.

keceli commented 7 years ago
13:53:52|b452|aquarius> cat /proc/version 
Linux version 2.6.32-696.3.2.el6.x86_64 (mockbuild@c1bl.rdu2.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) ) #1 SMP Tue Jun 20 01:26:55 UTC 2017
13:56:09|b452|aquarius> cat /etc/*-release
CentOS release 6.8 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch
CentOS release 6.8 (Final)
CentOS release 6.8 (Final)

nm outputs are attached. aquarius.txt molecule.txt

devinamatthews commented 7 years ago

Let's try this:

1) rm bin/aquarius
2) make V=1
3) Take the final link command and rerun it with the -v option added (at the end is fine)
4) Send the output

keceli commented 7 years ago

Attached.

make_out.txt

devinamatthews commented 7 years ago

Please send the output of these:

gdb -batch -ex 'file ./bin/aquarius' -ex 'disassemble __libc_csu_init'
objdump -j .init_array -s bin/aquarius

keceli commented 7 years ago
16:43:16|b452|aquarius> gdb -batch -ex 'file ./bin/aquarius' -ex 'disassemble __libc_csu_init'
Dump of assembler code for function __libc_csu_init:
   0x0000000000f2c5d0 <+0>: mov    %rbp,-0x28(%rsp)
   0x0000000000f2c5d5 <+5>: mov    %r12,-0x20(%rsp)
   0x0000000000f2c5da <+10>:    lea    0x53cddf(%rip),%rbp        # 0x14693c0 <__do_global_dtors_aux_fini_array_entry>
   0x0000000000f2c5e1 <+17>:    lea    0x53cdd0(%rip),%r12        # 0x14693b8 <__init_array_start>
   0x0000000000f2c5e8 <+24>:    mov    %r13,-0x18(%rsp)
   0x0000000000f2c5ed <+29>:    mov    %r14,-0x10(%rsp)
   0x0000000000f2c5f2 <+34>:    mov    %r15,-0x8(%rsp)
   0x0000000000f2c5f7 <+39>:    mov    %rbx,-0x30(%rsp)
   0x0000000000f2c5fc <+44>:    sub    $0x38,%rsp
   0x0000000000f2c600 <+48>:    sub    %r12,%rbp
   0x0000000000f2c603 <+51>:    mov    %edi,%r13d
   0x0000000000f2c606 <+54>:    mov    %rsi,%r14
   0x0000000000f2c609 <+57>:    sar    $0x3,%rbp
   0x0000000000f2c60d <+61>:    mov    %rdx,%r15
   0x0000000000f2c610 <+64>:    callq  0x40bf80 <_init>
   0x0000000000f2c615 <+69>:    test   %rbp,%rbp
   0x0000000000f2c618 <+72>:    je     0xf2c636 <__libc_csu_init+102>
   0x0000000000f2c61a <+74>:    xor    %ebx,%ebx
   0x0000000000f2c61c <+76>:    nopl   0x0(%rax)
   0x0000000000f2c620 <+80>:    mov    %r15,%rdx
   0x0000000000f2c623 <+83>:    mov    %r14,%rsi
   0x0000000000f2c626 <+86>:    mov    %r13d,%edi
   0x0000000000f2c629 <+89>:    callq  *(%r12,%rbx,8)
   0x0000000000f2c62d <+93>:    add    $0x1,%rbx
   0x0000000000f2c631 <+97>:    cmp    %rbp,%rbx
   0x0000000000f2c634 <+100>:   jb     0xf2c620 <__libc_csu_init+80>
   0x0000000000f2c636 <+102>:   mov    0x8(%rsp),%rbx
   0x0000000000f2c63b <+107>:   mov    0x10(%rsp),%rbp
   0x0000000000f2c640 <+112>:   mov    0x18(%rsp),%r12
   0x0000000000f2c645 <+117>:   mov    0x20(%rsp),%r13
   0x0000000000f2c64a <+122>:   mov    0x28(%rsp),%r14
   0x0000000000f2c64f <+127>:   mov    0x30(%rsp),%r15
   0x0000000000f2c654 <+132>:   add    $0x38,%rsp
   0x0000000000f2c658 <+136>:   retq   
End of assembler dump.
16:43:23|b452|aquarius> objdump -j .init_array -s bin/aquarius

bin/aquarius:     file format elf64-x86-64

Contents of section .init_array:
 14693b8 b0d74000 00000000                    ..@..... 
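A short sketch of how the two dumps above can be read together, assuming the usual x86-64 layout of one 8-byte little-endian function pointer per `.init_array` slot. The helper names `init_array_entries` and `decode_le64` are made up for this example; the addresses and bytes are copied from the output above. The arithmetic mirrors what `__libc_csu_init` does: it subtracts the start address (`%r12`) from the end address (`%rbp`), shifts right by 3, and calls that many pointers.

```cpp
#include <cstdint>

// Number of 8-byte slots between two addresses, as in the `sub` / `sar $0x3` pair.
constexpr std::uint64_t init_array_entries(std::uint64_t start, std::uint64_t end)
{
    return (end - start) >> 3;
}

// Assemble 8 little-endian bytes into a 64-bit address.
constexpr std::uint64_t decode_le64(const unsigned char (&bytes)[8])
{
    std::uint64_t value = 0;
    for (int i = 7; i >= 0; --i)
        value = (value << 8) | bytes[i];
    return value;
}

// Bytes of the only row in the objdump output: "b0d74000 00000000".
constexpr unsigned char raw_entry[8] = {0xb0, 0xd7, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00};

// Start and end addresses loaded by the two `lea` instructions in the disassembly.
static_assert(init_array_entries(0x14693b8, 0x14693c0) == 1,
              "the section holds a single initializer slot");
static_assert(decode_le64(raw_entry) == 0x40d7b0,
              "the slot points at address 0x40d7b0");
```

So the binary carries exactly one startup initializer, at 0x40d7b0. The thread does not identify which function that is, but a single slot is consistent with the per-file static initializers never being called.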

devinamatthews commented 7 years ago

Let's try a minimal example. Please try the following commands with the attached files:

icpc -c test.cxx test2.cxx test3.cxx
icpc -o test.x test.o test2.o test3.o
./test.x

If the output doesn't contain Hi! then the linker is just busted, and you should either see if there is a newer version of binutils available or try gcc.

test.cxx:

#include <cstdio>

extern int x;

int main()
{
    printf("%d\n", x);
}

test2.cxx:

#include <cstdio>

int static_function()
{
    printf("Hi!\n");
    return 0;
}

test3.cxx:

int static_function();

int x = static_function();

keceli commented 7 years ago

As you guessed, the icpc-compiled version failed to print "Hi!" and just printed 0. When compiled with g++, it worked fine. I don't understand what is going on with icpc. Any ideas? I am now recompiling Aquarius with ./configure MPICC=gcc MPIC=g++. Are there any other options you would recommend?

devinamatthews commented 7 years ago

I literally looked through the ld source code and I couldn't even figure out how it manages to work on my machine, much less not work on yours. It might have something to do with how icpc names the static initialization functions (__sti__* instead of __GLOBAL_*). Using gcc or another version of binutils is all I can think of.

Since this seems to not be a problem with aquarius per se, I am closing the issue.