msg-byu / symlib

Spacegroup finder. Includes symmetry-related routines for cluster expansion and other codes that rely on symmetries of lattices and crystals.
MIT License
12 stars 15 forks source link

Fixed a bug in the smith normal form routines. #23

Closed wsmorgan closed 5 years ago

wsmorgan commented 5 years ago

Changed:

       if (all(check((/2,3,4,6,7,8/))==0) .and. mod(M(2,2),M(1,1))==0 .and. &
            mod(M(3,3),M(2,2))==0) then
          is_snf = .True.
       end if

to:

       if (all(check((/2,3,4,6,7,8/))==0) .and. all(check((/1,5,9/))/=0)) then
          if (mod(M(2,2),M(1,1))==0 .and. mod(M(3,3),M(2,2))==0) then
             is_snf = .True.
          end if
       end if

@JohnEdChristensen found this bug while testing GRkgridgen.

glwhart commented 5 years ago

I don't see this as a bug per se. The restrictions on the failsafe have been broadened, but as far as I know, this failsafe has never triggered. Was this change made because a bug was found that the failsafe failed to capture? (If so, the triggering bug should be fixed.) Or was it simply recognized that the failsafe could be strengthened?

wsmorgan commented 5 years ago

While @JohnEdChristensen was running the latest version of the GRkgridgen code he ran into the following:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f72f51087df in ???
#1  0x5637070a6804 in __rational_mathematics_MOD_smithnormalform_li
        at /home/john/msg/temp/GRkgridgen/symlib/src/rational_mathematics.f90:93
#2  0x56370707fb44 in __kpointgeneration_MOD_symmetryreducekpointlist
        at /home/john/msg/temp/GRkgridgen/kgridGen/src/kpointgeneration.f90:508
#3  0x56370708499d in __kpointgeneration_MOD_generateirredkpointlist
        at /home/john/msg/temp/GRkgridgen/kgridGen/src/kpointgeneration.f90:194
#4  0x563706fe0462 in __grid_utils_MOD_grid_selection
        at /home/john/msg/temp/GRkgridgen/src/grid_utils.f90:314
#5  0x5637070393f6 in __sp_hnfs_MOD_bct_6_7_15_18
        at /home/john/msg/temp/GRkgridgen/src/sp_hnfs.f90:1600
#6  0x563707066330 in __find_kgrids_MOD_find_grid
        at /home/john/msg/temp/GRkgridgen/src/find_kgrids.f90:411
#7  0x56370707e168 in lat_id_driver
        at /home/john/msg/temp/GRkgridgen/src/driver.f90:25
#8  0x56370707ebb4 in main
        at /home/john/msg/temp/GRkgridgen/src/driver.f90:2
[1]    6679 floating point exception (core dumped)  kpoints.x

This occurred because in the process of converting an HNF into an SNF one of the diagonal of the matrix was set to zero during the operations. When the matrix then hit the check it attempted to take mod(4,0), i..e., it was dividing by zero, and caused a fault to trigger. So I see that as a bug but that's just semantics.

Important point is that without this change the code dies in some cases, with this check it does not.

JohnEdChristensen commented 5 years ago

This error was affecting quite a few of my test cases, they all pass with this fix.

glwhart commented 5 years ago

But the failsafe is just catching a bug that is occurring elsewhere. It should never be the case that the SNF code returns a matrix that is not in SNF form. If the failsafe catches a case where something is not in SNF form, then the SNFer itself has a bug, or was given bad input and didn't catch it (also a bug, I would say).

The question we should ask is, why did the SNFer try to return a matrix that had zeros on the diagonal. That is wrong...

Gus Hart Professor, Physics and Astronomy http://msg.byu.edu Dean, Physical and Mathematical Sciences http://cpms.byu.edu

On Wed, Aug 21, 2019 at 10:59 AM John Christensen notifications@github.com wrote:

This error was affecting quite a few of my test cases, they all pass with this fix.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/msg-byu/symlib/pull/23?email_source=notifications&email_token=AB3UNGFKUBDSQWCMEXLD7JDQFVYAXA5CNFSM4IOBKUU2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD42MH7Q#issuecomment-523551742, or mute the thread https://github.com/notifications/unsubscribe-auth/AB3UNGCJVAH26H3WP2ECXVLQFVYAXANCNFSM4IOBKUUQ .

wsmorgan commented 5 years ago

This check is the exit condition for the loop that's converting the HNF into SNF, i.e., it's going to keep doing row/col operations until those conditions are true. It is expected that the matrix will not be SNF most of the time when these lines of code get run.