nwchemgit / nwchem

NWChem: Open Source High-Performance Computational Chemistry
http://nwchemgit.github.io

do concurrent (i=1:n) failing #379

Closed ebylaska closed 3 years ago

ebylaska commented 3 years ago

Describe the bug: do concurrent is failing with an older gfortran.

[bylaska@arrow13 src]$ gfortran -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC)

Attach log files:
[bylaska@arrow13 src]$ fgrep -r "do concurrent" *
ccsd/convert_single_double.F: do concurrent (i=1:n)
ccsd/convert_single_double.F: do concurrent (i=1:n)
ccsd/convert_single_double.F: do concurrent (i=1:n)
ccsd/convert_single_double.F: do concurrent (i=1:n)

edoapra commented 3 years ago

gcc 4.4.7 was released in March 2012, which makes it nine years old. https://gcc.gnu.org/gcc-4.4/ Do we really have to support every find from an archaeological dig?

jeffhammond commented 3 years ago

@ebylaska https://www.softwarecollections.org/en/scls/rhscl/devtoolset-8/ and related are your friends

jeffhammond commented 3 years ago

@ebylaska you appear to be using RHEL 6 (https://access.redhat.com/solutions/19458), which has been EOL for 6 months (https://access.redhat.com/discussions/4768501). I don't have access to my RHEL 6 box anymore but I recall that one can't even yum update at this point.

Do you really think I should be prevented from using Fortran 2008 because you are using an operating system that is more than 10 years old and that its vendor has declared obsolete?

ebylaska commented 3 years ago

It doesn't matter to me personally, because I can always fix it when I need to. However, in my opinion, using one-off language extensions isn't defensive coding. Also, the timescale of compiler development is more like 15+ years, so it's very likely that some users will run into this. FYI, we tried to keep a F77 standard long after F90 became a standard (sometime in early grad school for me, if I remember correctly). An 8-year standard for deprecating software seems a bit optimistic when one looks at some parts of the NWChem tree (and even more parts of GA).

ebylaska commented 3 years ago

Does "do concurrent" automatically use a GPU with NVIDIA, or is this just threading? It seems like a large chunk of memory to move to a GPU just to do a copy.

ebylaska commented 3 years ago

How long should old hardware be supported? It looks like it was ~30 years for VAX. I think they were being removed along with punch cards from MTU when I was first starting in the mid-1980s. I guess foreign markets continued to use these things (or maybe just old NWChem developers).

From Wikipedia: "In August 2000, Compaq announced that the remaining VAX models would be discontinued by the end of the year.[19] By 2005 all manufacturing of VAX computers had ceased, but old systems remain in widespread use.[20]

The Stromasys CHARON-VAX and SIMH software-based VAX emulators remain available and VMS is now managed by VMS Software Incorporated, although they only offer OpenVMS for Alpha systems and HPE Integrity Servers, with x86-64 support being developed, and do not offer it for VAX."

jeffhammond commented 3 years ago

> using one-off language extensions isn't defensive coding

It is an ISO Fortran 2008 feature supported by many compilers, provided one uses releases less than 7 years old.

> Does "do concurrent" automatically use a GPU with nvidia or is this just threading? Seems like a large chunk of memory to move to a gpu to do a copy.

Compilers implement DO CONCURRENT using SIMD, thread or GPU parallelism, depending on the compiler and the flags used. I am using it primarily for the SIMD parallelism and not the thread or GPU parallelism, which may not be beneficial for smaller inputs.

When the GPU is used, these conversion routines do double duty by also moving the data to the GPU.
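For readers unfamiliar with the construct, here is a minimal sketch of the kind of loop under discussion. It is an illustration only and not the code in ccsd/convert_single_double.F; the program name and the arrays s and d are hypothetical. The key property is that every iteration touches only its own elements, which is what allows a compiler to vectorize, thread, or offload the loop.

```fortran
! Illustrative sketch; names are hypothetical, not taken from NWChem.
program demo_do_concurrent
  implicit none
  integer, parameter :: n = 8
  real             :: s(n)   ! single-precision input
  double precision :: d(n)   ! double-precision output
  integer :: i

  ! Fill the input with an ordinary sequential loop.
  do i = 1, n
    s(i) = real(i)
  end do

  ! Each iteration reads only s(i) and writes only d(i), so the
  ! iterations are independent and may execute in any order.
  do concurrent (i = 1:n)
    d(i) = dble(s(i))
  end do

  print *, d
end program demo_do_concurrent
```

Whether this ends up as SIMD, threaded, or GPU code is decided by the compiler and its flags; the source itself only asserts that the iterations are independent.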

> we tried to keep a F77 standard long after F90 became a standard

I am aware, and this was important (to me) because there was no decent free and widely available Fortran 90 compiler for a long time, and I was completely dependent on g77 for much of grad school. GCC gfortran changed that, so now everybody can use Fortran 2008 - including DO CONCURRENT and coarrays - on every CPU out there, provided they use a relatively recent release of GCC.

> How long should old hardware be supported?

With older versions of NWChem. For example, if I want to run NWChem on an Itanium system, I can download NWChem 5.1.1, or whatever I was using on MPP2 back in the day.

Mainframe systems like VAX are supported by vendors because of banks, militaries and other use cases that have no overlap with computational chemistry. I'm aware of cash machines running DOS, banks running COBOL, hydroelectric dams running Ada, etc., but I don't see what bearing that has on the NWChem CCSD(T) code.

jeffhammond commented 3 years ago

I should be clear: I will fix this bug if we decide that we care about GCC versions prior to 4.7, but that will prevent other improvements from happening where the workarounds are not so simple.
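One simple form such a workaround could take is a preprocessor guard that falls back to an ordinary DO loop on old gfortran. The sketch below is an assumption for illustration and is not the change that was committed: the macro NO_DO_CONCURRENT, the subroutine copy_s2d and the driver program are hypothetical names, and it assumes the file is run through the preprocessor (for example by using a .F90 extension) and that gfortran defines __GFORTRAN__, __GNUC__ and __GNUC_MINOR__.

```fortran
! Sketch only; all names below are hypothetical.
! Treat gfortran older than 4.7 as lacking DO CONCURRENT.
#if defined(__GFORTRAN__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 7))
#define NO_DO_CONCURRENT 1
#endif

subroutine copy_s2d(n, s, d)
  implicit none
  integer, intent(in)           :: n
  real, intent(in)              :: s(n)
  double precision, intent(out) :: d(n)
  integer :: i
#ifdef NO_DO_CONCURRENT
  ! Fallback for old compilers: plain sequential loop.
  do i = 1, n
#else
  ! Fortran 2008 form: iterations are declared independent.
  do concurrent (i = 1:n)
#endif
    d(i) = dble(s(i))
  end do
end subroutine copy_s2d

program demo_fallback
  implicit none
  real             :: s(4)
  double precision :: d(4)
  s = (/ 1.0, 2.0, 3.0, 4.0 /)
  call copy_s2d(4, s, d)
  print *, d
end program demo_fallback
```

Both branches share the same loop body and the same END DO, so the two variants cannot drift apart. The cost is one guard per loop, which is manageable for a handful of conversion routines but scales poorly to features without such a simple fallback.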

jeffhammond commented 3 years ago

Given that we install OpenBLAS, we could also just install GCC 5+ whenever someone tries to use GCC 4.

https://github.com/jeffhammond/HPCInfo/blob/master/buildscripts/gcc-release.sh

edoapra commented 3 years ago

Fix committed to master and successfully tested on RHEL 6.