kdahlquist / GRNmap

Gene Regulatory Network modeling and parameter estimation
BSD 3-Clause "New" or "Revised" License

Investigate differing results on different computers #323

Closed dondi closed 7 years ago

dondi commented 7 years ago

We have had a lingering issue where computations turn out differently even with the same input. This link seems to provide the first plausible explanation:

https://www.mathworks.com/matlabcentral/answers/438-why-are-computational-results-in-matlab-sometimes-different-on-different-machines-and-how-can-i-prev

The relevant quote comes from a reply lower in that thread:

Using 80 bit registers for double precision on a machine that can only store 64 bit double precision values is considered enough of a violation of the abstract semantics as to be formally forbidden (but it happens in practice.) [Note: IEEE 754 has defined 80 bit "extended precision" arithmetic... provided the system has a mechanism to store all 80 bits.] One of the issues with 80 bit registers is that if there is an interrupt or normal change of process context to run a competing routine, then the 80 bit registers have to be spilled to memory... as 64 bit quantities. Thus depending exactly when interrupts or context changes occurred, calculations could come out differently.
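To make the mechanism in that quote concrete, here is a minimal Python sketch of how the point at which intermediate values get rounded can change the final answer. It is an illustration under stated assumptions, not an emulation of x87 hardware: `fractions.Fraction` (exact arithmetic) stands in for a wider register, and the function names are made up for this example.

```python
from fractions import Fraction


def sum_round_every_step(values):
    """Add values as 64-bit doubles, so every partial sum is rounded.

    This mimics intermediates being written back to memory as 64-bit
    quantities after each operation.
    """
    total = 0.0
    for v in values:
        total = total + v
    return total


def sum_round_once(values):
    """Add values exactly and round to a 64-bit double only at the end.

    Assumption: exact Fraction arithmetic is used as a stand-in for a
    wider register; real 80-bit extended precision is not exact.
    """
    total = Fraction(0)
    for v in values:
        total += Fraction(v)  # Fraction(float) is the float's exact value
    return float(total)


values = [0.1] * 10
narrow = sum_round_every_step(values)
wide = sum_round_once(values)

print(narrow)          # 0.9999999999999999
print(wide)            # 1.0
print(narrow == wide)  # False
```

The inputs are identical in both cases; only the rounding points differ. If rounding points depend on when registers are spilled, the same effect could in principle appear between runs or machines.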

dondi commented 7 years ago

Two possibilities for checking this scenario:

dondi commented 7 years ago

Another lead is to see whether Matlab can be made to run computations on a GPU, thus potentially preserving full precision throughout, independent of operating system context switches.

https://www.mathworks.com/matlabcentral/answers/1032-why-do-some-calculations-like-the-fft-produce-different-results-when-performed-on-a-gpu

kdahlquist commented 7 years ago

So, this leads to a research question that we can answer given our resources.

  1. Run sensitivity analysis like in the Dahlquist et al. (2015) paper on the current version to see which parameters are sensitive.

  2. Run the model on lots of different computers to get the different results and then figure out whether the parameters that vary are the ones that are particularly sensitive in the sensitivity analysis.
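The comparison in step 2 could be sketched along these lines in Python. Everything here is hypothetical: the parameter names (`w1`, `b1`, `p1`), the numbers, and the set of "sensitive" parameters are made-up placeholders, not GRNmap output or results from the 2015 paper.

```python
def relative_spread(estimates_by_machine):
    """Per-parameter (max - min) / max(|value|) across machines."""
    first = next(iter(estimates_by_machine.values()))
    spread = {}
    for name in first:
        vals = [est[name] for est in estimates_by_machine.values()]
        scale = max(abs(v) for v in vals) or 1.0  # avoid dividing by zero
        spread[name] = (max(vals) - min(vals)) / scale
    return spread


def varying_parameters(spread, tolerance=1e-12):
    """Names of parameters whose spread exceeds the tolerance, sorted."""
    return sorted(name for name, s in spread.items() if s > tolerance)


# Made-up illustrative values, one dict of estimates per machine.
runs = {
    "machine_a": {"w1": 0.5, "b1": 1.25, "p1": 2.0},
    "machine_b": {"w1": 0.5, "b1": 1.5, "p1": 2.0},
}
# Hypothetical result of the sensitivity analysis in step 1.
sensitive = {"b1", "p1"}

spread = relative_spread(runs)
varying = varying_parameters(spread)
overlap = sorted(set(varying) & sensitive)

print(varying)  # ['b1']
print(overlap)  # ['b1']
```

If the parameters that vary between machines are mostly the sensitive ones, that would support the idea that small numerical differences are being amplified by the model rather than introduced by a bug.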

kdahlquist commented 7 years ago

Does the "forward simulation only" generate different numbers?

kdahlquist commented 7 years ago

Get The Story about Ping by Marjorie Flack (http://linus.lmu.edu/search/t?SEARCH=story+about+ping&sortdropdown=-) and read the review on Amazon.

kdahlquist commented 7 years ago

Wait--if we run the same input workbook on the same computer with the same version of the software, we get the same result. If what was described above were the problem, wouldn't we get different results even on the same computer?

kdahlquist commented 7 years ago

So I re-ran the input workbook I used for the v1.6 tests on boulardii 2 with v1.4.4 and actually got the same results as 1.6, but not the same as the output @bklein7 posted to the DahlquistLab repo.

I think I need to take on the job of sorting this out myself and let the data analysis team do their work. I'll try to start running some tests on a dedicated computer and input workbook as soon as I can get it set up in my office.

kdahlquist commented 7 years ago

This issue was first discussed as #146. I'm going to close that one, but remind myself here that there is useful discussion there to go back to when doing my tests.

kdahlquist commented 7 years ago

I have initiated a round of tests and am documenting what happens on OpenWetWare here:

http://www.openwetware.org/wiki/Dahlquist:GRNmap_Testing

Relevant files will be stored in the DahlquistLab repository here:

https://github.com/kdahlquist/DahlquistLab/tree/master/data/kdahlquist_testing_20170216

kdahlquist commented 7 years ago

So, my first test on the department laptop crashed with what I think is a Java error. The error message is pasted below. It says the Java version on this machine is 1.7, so I'm guessing that I might need to update it to Java 1.8. I ran it under the student (non-administrator) login.


         Assertion detected at Thu Feb 16 14:31:53 2017

Configuration:
  Crash Decoding      : Disabled
  Default Encoding    : windows-1252
  Graphics card 1     : SEIKO EPSON CORPORATION ( 0x0 ) EPSON Projector Support Driver for NP Version 1.0.0.0
  Graphics card 2     : Intel Corporation ( 0x8086 ) Intel(R) HD Graphics 4000 Version 9.17.10.2932
  Java Crash Report   : C:\Users\Student\AppData\Local\Temp\hs_error_pid2100.log
  MATLAB Architecture : win64
  MATLAB Root         : C:\Program Files\MATLAB\R2014b
  MATLAB Version      : 8.4.0.150421 (R2014b)
  Operating System    : Microsoft Windows 7 Enterprise
  Processor ID        : x86 Family 6 Model 58 Stepping 9, GenuineIntel
  Software OpenGL     : 0
  Virtual Machine     : Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
  Window System       : Version 6.1 (Build 7601: Service Pack 1)

Fault Count: 1

Assertion in void __cdecl `anonymous-namespace'::mwJavaAbort(void) at b:\matlab\src\jmi\javainit.cpp line 1319: Fatal Java Exception. See Java Crash Report for details.

Register State (captured):
  RAX = 00000000043eea01  RBX = 00000000abfea8b0
  RCX = 00000000abfea280  RDX = 0000000000000000
  RSP = 00000000abfe9df0  RBP = 00000000116f5880
  RSI = 00000000043dea60  RDI = 00000000043eeab0

   R8 = 000007fffff70000   R9 = 000007fef46a0000
  R10 = 00000000043deab0  R11 = 00000000043deab0
  R12 = 000000005d2e3179  R13 = 00000000abfeaec0
  R14 = 00000000116f5880  R15 = 0000000033bb0000

  RIP = 000000000421432a  EFL = 00000206

   CS = 0033   FS = 0053   GS = 002b

Stack Trace (captured):
[  0] 0x000000000421432a C:\Program Files\MATLAB\R2014b\bin\win64\libmwfl.dll+00082730 fl::diag::windows::context_base::capture_data+00000010
[  1] 0x0000000004210bb4 C:\Program Files\MATLAB\R2014b\bin\win64\libmwfl.dll+00068532 fl::diag::thread_context::unspecified_bool+00006628
[  2] 0x00000000042105ab C:\Program Files\MATLAB\R2014b\bin\win64\libmwfl.dll+00066987 fl::diag::thread_context::unspecified_bool+00005083
[  3] 0x0000000004213dbe C:\Program Files\MATLAB\R2014b\bin\win64\libmwfl.dll+00081342 fl::diag::terminate+00000110
[  4] 0x00000000116a27c7 C:\Program Files\MATLAB\R2014b\bin\win64\jmi.dll+00468935 mljShutdown+00000439
[  5] 0x000000005d000aed C:\Program Files\MATLAB\R2014b\sys\java\jre\win64\jre\bin\server\jvm.dll+02165485 JVM_FindSignal+00002525
[  6] 0x000000005cff3c29 C:\Program Files\MATLAB\R2014b\sys\java\jre\win64\jre\bin\server\jvm.dll+02112553 JVM_ResolveClass+00461817
[  7] 0x000000005d0017d6 C:\Program Files\MATLAB\R2014b\sys\java\jre\win64\jre\bin\server\jvm.dll+02168790 JVM_FindSignal+00005830
[  8] 0x000000005d00597c C:\Program Files\MATLAB\R2014b\sys\java\jre\win64\jre\bin\server\jvm.dll+02185596 JVM_FindSignal+00022636
[  9] 0x000000005d092e58 C:\Program Files\MATLAB\R2014b\sys\java\jre\win64\jre\bin\server\jvm.dll+02764376 JVM_FindSignal+00601416
[ 10] 0x0000000077507e8d C:\Windows\SYSTEM32\ntdll.dll+00163469 RtlDecodePointer+00000173
[ 11] 0x00000000774f84cf C:\Windows\SYSTEM32\ntdll.dll+00099535 RtlUnwindEx+00003007
[ 12] 0x000000007752bac8 C:\Windows\SYSTEM32\ntdll.dll+00309960 KiUserExceptionDispatcher+00000046
[ 13] 0x000007fee6ce3cae C:\Windows\system32\ig7icd64.dll+07617710 DrvSetCallbackProcs+04291070
[ 14] 0x000007fee6763f86 C:\Windows\system32\ig7icd64.dll+01851270 DllMain+01284134
[ 15] 0x000007fee6763c98 C:\Windows\system32\ig7icd64.dll+01850520 DllMain+01283384
[ 16] 0x00000000e51560d7 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osg.dll+01663191 osg::Program::apply+00000151
[ 17] 0x00000000e4ffbe56 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osg.dll+00245334 osg::State::applyAttribute+00000118
[ 18] 0x00000000e4ffc25e C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osg.dll+00246366 osg::State::applyAttributeList+00000830
[ 19] 0x00000000e5183699 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osg.dll+01848985 osg::State::apply+00000153
[ 20] 0x00000000e557b119 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01159449 osgUtil::RenderLeaf::render+00000185
[ 21] 0x00000000e55797c2 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01152962 osgUtil::RenderBin::drawImplementation+00000306
[ 22] 0x00000000e5579739 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01152825 osgUtil::RenderBin::drawImplementation+00000169
[ 23] 0x00000000e5580c33 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01182771 osgUtil::RenderStage::drawImplementation+00000531
[ 24] 0x00000000e5580d4d C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01183053 osgUtil::RenderStage::drawInner+00000237
[ 25] 0x00000000e5580856 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01181782 osgUtil::RenderStage::draw+00000806
[ 26] 0x00000000e5581403 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01184771 osgUtil::RenderStage::drawPostRenderStages+00000083
[ 27] 0x00000000e5580971 C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01182065 osgUtil::RenderStage::draw+00001089
[ 28] 0x00000000e5589aec C:\Program Files\MATLAB\R2014b\bin\win64\osg80-osgUtil.dll+01219308 osgUtil::SceneView::draw+00004204
[ 29] 0x00000000dbc9ab61 C:\Program Files\MATLAB\R2014b\bin\win64\osgserver.dll+00830305 graphics::primitive::world::osg::ModelFile::updateBounds+00751377
[ 30] 0x00000000dbca2ebc C:\Program Files\MATLAB\R2014b\bin\win64\osgserver.dll+00863932 graphics::primitive::world::osg::ModelFile::updateBounds+00785004
[ 31] 0x000000005af5a2af C:\Program Files\MATLAB\R2014b\bin\win64\libuij.dll+00041647 UIJ_call_OpenGLPaintFcn+00000015
[ 32] 0x00000000f89b2d3a C:\Program Files\MATLAB\R2014b\bin\win64\nativehg.dll+00011578 Java_com_mathworks_hg_peer_JavaSceneServerPeer_doDisplay+00000042
[ 33] 0x0000000033bc23a8 +00000000
[ 34] 0x00000000abfecc48 +00000000
[ 35] 0x0000000033bb61f8 +00000000
[ 36] 0x0000000014345100 +00000000
[ 37] 0x0000000033bbf1d8 +00000000
[ 38] 0x0000000000000001 +00000000
[ 39] 0x0000002300000030 +00000000
[ 40] 0x00000000abfecc48 +00000000
[ 41] 0x00000007fa8a4f30 +00000000
[ 42] 0x00000007fa8a4ed8 +00000000
[ 43] 0x0000000033c0eb83 +00000000
[ 44] 0x00000000abfecc00 +00000000

If this problem is reproducible, please submit a Service Request via: http://www.mathworks.com/support/contact_us/

A technical support engineer might contact you with further information.

Thank you for your help.

kdahlquist commented 7 years ago

To my previous question--@dondi says that different computers have different software profiles, which could affect the results.

kdahlquist commented 7 years ago

As of v1.8, it appears that this issue is no longer a problem. I have run L-curves with identical input workbooks on two different computers and got identical output for the LSE, penalty, and iteration count values. This has been repeated twice with two different input workbooks and two different pairs of computers. I haven't taken the difference of each set of output parameter values, but it is unlikely that they differ given that the LSE values match. I'm going to go ahead and close this one, but we should monitor it to see if it comes back.
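For the remaining check, taking the difference of each set of output parameter values, a small script along these lines would do. It is only a sketch: it assumes the estimated parameters from each computer have already been read out of the output workbooks into plain lists of floats in the same order, and `compare_outputs` is a hypothetical helper name.

```python
def compare_outputs(a, b):
    """Element-wise comparison of two equally ordered lists of floats.

    Returns whether the lists are exactly equal, how many entries
    differ, and the largest absolute difference. NaN entries compare
    as different from everything, including themselves.
    """
    if len(a) != len(b):
        raise ValueError("outputs have different lengths")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return {
        "identical": all(x == y for x, y in zip(a, b)),
        "n_different": sum(1 for x, y in zip(a, b) if x != y),
        "max_abs_diff": max(diffs, default=0.0),
    }


# Illustrative values only, not real GRNmap output.
same = compare_outputs([1.0, 2.0, 3.5], [1.0, 2.0, 3.5])
off_by_rounding = compare_outputs([0.1 + 0.2, 2.0], [0.3, 2.0])

print(same["identical"])               # True
print(off_by_rounding["identical"])    # False
print(off_by_rounding["n_different"])  # 1
```

An exact `==` comparison is deliberate here, since the question is whether the two computers produce the same numbers, not merely close ones.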