APSIMInitiative / APSIM710

APSIM
https://www.apsim.info

APSIM 7.10 SIGSEGV Error #1733

Closed bberak closed 3 years ago

bberak commented 5 years ago

Hi there,

First of all, huge thanks for creating and maintaining this software.

I'm getting a SIGSEGV error when running a .sim file in an Ubuntu container with APSIM 7.10 (revision 4191).

I've gotten the source code to successfully compile using BuildAll.sh. I can also successfully run the .apsim files in the Examples folder.

SIGSEGV Error

However, when running a particular .sim file (given to me by a client):

/apsim/Model/ApsimModel.exe /workbench/Client.sim

I get the following error:


     ###     ######     #####   #   #     #
    #   #    #     #   #        #   ##   ##
   #     #   #     #   #        #   ##   ##
   #######   ######     #####   #   # # # #
   #     #   #              #   #   #  #  #
   #     #   #         #####    #   #  #  #

 The Agricultural Production Systems Simulator
             Copyright(c) APSRU

Version                = 7.10 r4191
Title                  = Continuous Dryland Cotton
   Component                        "clock" = %apsim%/Model/Clock.so
   Component                          "met" = %apsim%/Model/Input.so
Paddock:
   Component                   "outputfile" = %apsim%/Model/Report.so
   Component                        "accum" = %apsim%/Model/Accum.so
   Component                   "fertiliser" = %apsim%/Model/Fertiliser.so
   Component            "Sowing fertiliser" = %apsim%/Model/Manager.so
   Component           "Cotton sowing rule" = %apsim%/Model/Manager.so
   Component              "Harvesting rule" = %apsim%/Model/Manager.so
   Component "Fertilise on days after sowing - top up" = %apsim%/Model/Manager.so
   Component           "PostHarvestTillage" = %apsim%/Model/Manager.so
   Component                   "Soil Water" = %apsim%/Model/SoilWat.so
   Component         "SurfaceOrganicMatter" = %apsim%/Model/SurfaceOM.so
   Component                "Soil Nitrogen" = %apsim%/Model/SoilN.so
   Component                      "tracker" = %apsim%/Model/Tracker.so
   Component                       "Cotton" = %apsim%/Model/Cotton.dll

Native stacktrace:

    /usr/lib/libmonoboehm-2.0.so.1(+0xd170a) [0x7f760f41670a]
    /usr/lib/libmonoboehm-2.0.so.1(+0x49d6c) [0x7f760f38ed6c]
    /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7f76103f54b0]
    /usr/lib/libmonoboehm-2.0.so.1(GC_push_all_eager+0x40) [0x7f760f589300]
    /usr/lib/libmonoboehm-2.0.so.1(GC_with_callee_saves_pushed+0x28) [0x7f760f5917c8]
    /usr/lib/libmonoboehm-2.0.so.1(GC_push_roots+0xbf) [0x7f760f58a7ff]
    /usr/lib/libmonoboehm-2.0.so.1(GC_mark_some+0x1c8) [0x7f760f589bc8]
    /usr/lib/libmonoboehm-2.0.so.1(GC_stopped_mark+0xa8) [0x7f760f580848]
    /usr/lib/libmonoboehm-2.0.so.1(GC_try_to_collect_inner+0xaf) [0x7f760f58112f]
    /usr/lib/libmonoboehm-2.0.so.1(GC_init_inner+0x2fa) [0x7f760f58b72a]
    /usr/lib/libmonoboehm-2.0.so.1(GC_init+0x1e) [0x7f760f58b84e]
    /usr/lib/libmonoboehm-2.0.so.1(+0x1fc665) [0x7f760f541665]
    /usr/lib/libmonoboehm-2.0.so.1(+0x1d070c) [0x7f760f51570c]
    /usr/lib/libmonoboehm-2.0.so.1(+0x4a48f) [0x7f760f38f48f]
    /apsim/Model/libProtocol.so(_ZN8protocol11Computation17InitNETFrameworksEv+0x57) [0x7f7610d255e7]
    /apsim/Model/libProtocol.so(_ZN8protocol11Computation13loadComponentERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_+0x7ec) [0x7f7610d263cc]
    /apsim/Model/libProtocol.so(_ZN8protocol11ComputationC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_S8_jj+0xe5) [0x7f7610d26a05]
    /apsim/Model/ProtocolManager.so(_ZN14ComponentAliasC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_S7_ii+0xb8) [0x7f760ccb1e08]
    /apsim/Model/ProtocolManager.so(_ZN11Coordinator12addComponentERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_S7_S7_S7_+0x13c) [0x7f760cca87cc]
    /apsim/Model/ProtocolManager.so(_ZN11Coordinator7doInit1ERKN8protocol9Init1DataE+0x564) [0x7f760ccae4c4]
    /apsim/Model/libComponentInterface.so(_ZN8protocol9Component14messageToLogicEPNS_7MessageE+0x1039) [0x7f760c9a4649]
    /apsim/Model/ProtocolManager.so(messageToLogic+0x15) [0x7f760ccb2215]
    /apsim/Model/libProtocol.so(_ZNK8protocol11Computation14messageToLogicEPKNS_7MessageE+0x2b) [0x7f7610d26c3b]
    /apsim/Model/libProtocol.so(_ZN8protocol9Transport14deliverMessageEPNS_7MessageE+0x32) [0x7f7610d251e2]
    /apsim/Model/ProtocolManager.so(_ZN11Coordinator12addComponentERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_S7_S7_S7_+0x4d2) [0x7f760cca8b62]
    /apsim/Model/ProtocolManager.so(_ZN11Coordinator7doInit1ERKN8protocol9Init1DataE+0x9a3) [0x7f760ccae903]
    /apsim/Model/libComponentInterface.so(_ZN8protocol9Component14messageToLogicEPNS_7MessageE+0x1039) [0x7f760c9a4649]
    /apsim/Model/ProtocolManager.so(messageToLogic+0x15) [0x7f760ccb2215]
    /apsim/Model/libProtocol.so(_ZNK8protocol11Computation14messageToLogicEPKNS_7MessageE+0x2b) [0x7f7610d26c3b]
    /apsim/Model/libProtocol.so(_ZN8protocol9Transport14deliverMessageEPNS_7MessageE+0x32) [0x7f7610d251e2]
    /apsim/Model/ApsimModel.exe(_ZN8protocol10Simulation4initERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x31f) [0x407a3f]
    /apsim/Model/ApsimModel.exe(_ZN8protocol10Simulation2goERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x40) [0x406cd0]
    /apsim/Model/ApsimModel.exe(RunAPSIM+0x97) [0x406837]
    /apsim/Model/ApsimModel.exe(main+0x324) [0x4041a4]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f76103e0830]
    /apsim/Model/ApsimModel.exe(_start+0x29) [0x404399]

Debug info from gdb:

=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

Aborted

I've followed a combination of your linux documentation and your Dockerfile to get my container working correctly and compiling the latest source code (revision 4191).

I've prepended the following lines to ~/.bashrc:

export APSIM=/apsim
export LD_LIBRARY_PATH=/apsim/Model:$LD_LIBRARY_PATH

I've also tried (to no avail):

export APSIM=/apsim
export LD_LIBRARY_PATH=/apsim/Model:/lib/x86_64-linux-gnu:/lib64:/usr/bin:/usr/lib:/usr/lib/mono:$LD_LIBRARY_PATH
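One thing that may be worth ruling out here: `~/.bashrc` is normally read only by interactive bash shells, so variables exported there might not be visible when the model is launched non-interactively (for example from a `docker run` command or a script). The following is a minimal Python sketch, not part of APSIM, that checks whether a given environment has `APSIM` set and `$APSIM/Model` on `LD_LIBRARY_PATH`. The function name `check_apsim_env` is hypothetical.

```python
import os


def check_apsim_env(env):
    """Return a list of problems found in an environment mapping.

    Hypothetical helper: it only checks the two variables mentioned
    in this thread (APSIM and LD_LIBRARY_PATH).
    """
    problems = []
    apsim = env.get("APSIM")
    if not apsim:
        problems.append("APSIM is not set")
        return problems

    model_dir = os.path.join(apsim, "Model")
    # Split LD_LIBRARY_PATH on ':' and ignore empty entries.
    entries = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    normalised = [os.path.normpath(p) for p in entries]
    if os.path.normpath(model_dir) not in normalised:
        problems.append(f"{model_dir} is not on LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    # Check the environment this process actually received.
    for message in check_apsim_env(os.environ) or ["environment looks consistent"]:
        print(message)
```

Running the script in the same way the model is launched (same shell, same container entry point) shows what that process really sees, which can differ from an interactive login.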

Environment

OS: Ubuntu 16.04

Repository: http://apsrunet.apsim.info/svn/apsim/trunk@4191

Mono:

Mono JIT compiler version 4.2.1 (Debian 4.2.1.102+dfsg2-7ubuntu4)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
    TLS:           __thread
    SIGSEGV:       altstack
    Notifications: epoll
    Architecture:  amd64
    Disabled:      none
    Misc:          softdebug
    LLVM:          supported, not enabled.
    GC:            sgen

Any help would be greatly appreciated and please let me know if I can provide any more info!

Kind Regards!

peter-devoil commented 5 years ago

The clue in that stacktrace is the failure of Computation::InitNETFrameworks(), which is trying to load the Cotton dll.

Did the cotton dll build properly? It has a weird password requirement that you'll have to ask @hol353 for.

bberak commented 5 years ago

Thanks @peter-devoil - nice find.

Yes, the Cotton.dll did build - the BuildAll.sh command runs and completes successfully. I found the password for the Cotton.dll build buried somewhere in the SVN artefacts.

peter-devoil commented 5 years ago

OK, so cotton is the first .NET component it's trying to load, and it doesn't look like it has even got as far as calling the initialisation routine within the component.

Did the standard example cotton simulation run? If the person who gave you the sim file has given you an old-style .sim file, then much weirdness will happen during initialisation - incomplete constructors etc.

I'd suggest you update your mono to 4.8. (5.x seems to have minor troubles with apsim's vb components, and scalability in HPC environments.)

If the problem persists, I'd look at strace output. Then valgrind. But if you get there, you're in a bad place.
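As a rough illustration of the strace step, the sketch below filters a saved trace for failed `open`/`openat` calls on library-like files (`.so`, `.dll`, `.exe`). It assumes the trace was captured with something like `strace -f -o trace.txt /apsim/Model/ApsimModel.exe /workbench/Client.sim`; the file name `trace.txt` and the function name `failed_library_opens` are assumptions made for this example.

```python
import re

# Matches failed open()/openat() calls in strace output, for example:
#   openat(AT_FDCWD, "/some/lib.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
FAILED_OPEN = re.compile(
    r'\bopen(?:at)?\((?:[^,"]+,\s*)?"([^"]+)".*=\s*-1\s+(E[A-Z]+)'
)

# File names that look like native or .NET libraries, including
# versioned shared objects such as libfoo.so.1.
LIBRARY_NAME = re.compile(r'\.(?:so(?:\.\d+)*|dll|exe)$', re.IGNORECASE)


def failed_library_opens(strace_lines):
    """Return (path, errno_name) pairs for failed opens of library-like files."""
    hits = []
    for line in strace_lines:
        match = FAILED_OPEN.search(line)
        if match and LIBRARY_NAME.search(match.group(1)):
            hits.append((match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python failed_opens.py trace.txt
    with open(sys.argv[1], errors="replace") as handle:
        for path, err in failed_library_opens(handle):
            print(f"{err:8s} {path}")
```

Many `ENOENT` entries are harmless, because the dynamic loader probes several directories before finding a library. The interesting ones are names that never appear in a later successful open.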

bberak commented 5 years ago

Thanks @peter-devoil

I'll try running the standard cotton example and see how it goes. I used ApsimToSim.exe to extract the .sim file first (for debugging purposes). The .sim file runs without error on Windows.

I'll try updating to mono 4.8. Thanks a lot for your pointers!

peter-devoil commented 5 years ago

Just remember that sim files made under Windows will be different from those made under Linux - the native components have a ".so" extension on Linux, compared with ".dll" on Windows.
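One way to check this against a real run is to read the component listing that the model prints at startup (the `Component "name" = path` lines in the log above) and confirm that each referenced file exists on the Linux machine. The Python sketch below does that. It assumes `%apsim%` should expand to the APSIM install directory, and the function name `list_components` is hypothetical.

```python
import os
import re

# Matches startup lines such as:
#    Component                        "clock" = %apsim%/Model/Clock.so
COMPONENT_LINE = re.compile(r'^\s*Component\s+"([^"]+)"\s*=\s*(\S.*?)\s*$')


def list_components(log_lines, apsim_root):
    """Return (name, resolved_path, extension, exists) for each component line."""
    rows = []
    for line in log_lines:
        match = COMPONENT_LINE.match(line)
        if not match:
            continue
        name, raw_path = match.groups()
        # Expand the %apsim% macro (assumed to mean the install directory).
        path = re.sub(r"%apsim%", lambda _: apsim_root, raw_path, flags=re.IGNORECASE)
        extension = os.path.splitext(path)[1].lower()
        rows.append((name, path, extension, os.path.isfile(path)))
    return rows


if __name__ == "__main__":
    import sys

    # Usage: python list_components.py run.log
    root = os.environ.get("APSIM", "/apsim")
    with open(sys.argv[1], errors="replace") as handle:
        for name, path, extension, exists in list_components(handle, root):
            status = "ok" if exists else "MISSING"
            print(f"{status:8s} {extension:5s} {name}: {path}")
```

A `.dll` extension in this listing is not wrong by itself, since a .NET component keeps that extension on Linux. A missing file, or a name that differs only in letter case on a case-sensitive filesystem, is the more likely thing to look for.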

tdonovic commented 3 years ago

@bberak how did you go? I also have a similar issue, using 7.10-4215 and 7.10-4194. I can successfully run other apsim files, but when I try to run the example cotton files (or indeed even my own cotton apsims), I get the same output as you, including the suspect cotton.dll except with no stack trace or anything. Should I expect it to use the .dll under mono with linux? All of the other components are .so, and as a result I am a bit suss.

I ran strace, but when I grep for cotton in the output, I don't see it at all. Bit stumped. Any tips for chasing this down?

peter-devoil commented 3 years ago

The .NET dlls have dll extensions - indeed the same dlls compiled under windows will also run on linux.

Did you come across this issue?

tdonovic commented 3 years ago

I will give this a go. I get nothing from "Native stacktrace:" onwards in the OP's log, so I'm unsure if it's the same problem, but I'll give it a try.

bberak commented 3 years ago

Sorry @tdonovic but I was pulled into another project shortly after I posted this question. I suspect that the problem I encountered was due to the .sim files being created under Windows (as @peter-devoil pointed out).

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had any activity in the last 30 days. It will be closed in one week if no further activity occurs. Thank you for your contributions.

stale[bot] commented 3 years ago

This issue is being closed because there has been no recent activity. Feel free to re-open or open a new issue if needed.