planetarium / libplanet

Blockchain in C#/.NET for on-chain, decentralized gaming
https://docs.libplanet.io/
GNU Lesser General Public License v2.1

`System.Threading.ThreadAbortException` occurs on `linux-unity-test` #2569

Open OnedgeLee opened 1 year ago

OnedgeLee commented 1 year ago

System.Threading.ThreadAbortException occurs in various tests of linux-unity-test.

Below are some of them.

FAIL Libplanet.Tests.Blockchain.DefaultStoreBlockChainTest.MineBlock: 3.182188s
  System.Threading.ThreadAbortException: 
    at (wrapper managed-to-native) System.Object.MemberwiseClone(object)
    at System.Array.Clone () [0x00000] in <a1e9f114a6e64f4eacb529fc802ec93d>:0 
    at Org.BouncyCastle.Math.BigInteger.AddToMagnitude (System.Int32[] magToAdd) [0x00055] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Blockchain.DefaultStoreBlockChainTest.StateAfterForkingAndAddingExistingBlock: 0s
  System.Threading.ThreadAbortException: 
    at (wrapper managed-to-native) System.Array.FastCopy(System.Array,int,System.Array,int,int)
    at System.Array.Copy (System.Array sourceArray, System.Int32 sourceIndex, System.Array destinationArray, System.Int32 destinationIndex, System.Int32 length) [0x00081] in <a1e9f114a6e64f4eacb529fc802ec93d>:0 
    at Org.BouncyCastle.Math.BigInteger.ShiftRight (System.Int32 n) [0x00056] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Blocks.PreEvaluationBlockTest.SafeConstructorWithPreEvaluationHash: 0s
  System.Threading.ThreadAbortException: 
    at (wrapper managed-to-native) System.Object.__icall_wrapper_ves_icall_object_new_specific(intptr)
    at Org.BouncyCastle.Math.EC.ECPoint.ImplIsValid (System.Boolean decompressed, System.Boolean checkOrder) [0x0000a] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Store.BlockSetTest.CanStoreItem: 0s
  System.Threading.ThreadAbortException: 
    at Libplanet.Tests.Store.MemoryStoreFixture..ctor (Libplanet.Action.IAction blockAction) [0x00000] in /mnt/ramdisk/Libplanet.Tests/Store/MemoryStoreFixture.cs:10
FAIL Libplanet.Tests.Blockchain.DefaultStoreBlockChainTest.GetBlockLocator: 0.3993193s
  System.Threading.ThreadAbortException: 
    at Org.BouncyCastle.Math.BigInteger.Equals (System.Object obj) [0x00006] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Blockchain.BlockChainTest.FindNextHashesAfterFork: 0s
  System.Threading.ThreadAbortException: 
    at (wrapper managed-to-native) System.Object.__icall_wrapper_ves_icall_object_new_specific(intptr)
    at Org.BouncyCastle.Math.EC.FpFieldElement.Multiply (Org.BouncyCastle.Math.EC.ECFieldElement b) [0x0001e] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Blockchain.BlockChainTest.GetNextTxNonceWithStaleTx: 0s
  System.Threading.ThreadAbortException: 
    at (wrapper managed-to-native) System.Object.__icall_wrapper_ves_icall_object_new_specific(intptr)
    at Org.BouncyCastle.Math.BigInteger.AddToMagnitude (System.Int32[] magToAdd) [0x0006c] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Blockchain.BlockChainTest.Fork: 1.1873418s
  System.Threading.ThreadAbortException: 
    at (wrapper managed-to-native) System.Object.__icall_wrapper_ves_icall_array_new_specific(intptr,int)
    at Org.BouncyCastle.Math.BigInteger.ShiftRight (System.Int32 n) [0x00043] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Blockchain.DefaultStoreBlockChainTest.MineBlockWithMaxTransactions: 3.2209828s
  System.Threading.ThreadAbortException: 
    at (wrapper managed-to-native) System.Array.FastCopy(System.Array,int,System.Array,int,int)
    at System.Array.Copy (System.Array sourceArray, System.Int32 sourceIndex, System.Array destinationArray, System.Int32 destinationIndex, System.Int32 length) [0x00081] in <a1e9f114a6e64f4eacb529fc802ec93d>:0 
    at Org.BouncyCastle.Math.BigInteger.LastNBits (System.Int32 n) [0x0002a] in <a3172f9b2f4a45059d64d4fd49357606>:0
FAIL Libplanet.Tests.Blockchain.DefaultStoreBlockChainTest.GetBlockLocator: 0.3993193s
  System.Threading.ThreadAbortException: 
    at Org.BouncyCastle.Math.BigInteger.Equals (System.Object obj) [0x00006] in <a3172f9b2f4a45059d64d4fd49357606>:0 
    at Org.BouncyCastle.Math.EC.FpFieldElement.ModReduce (Org.BouncyCastle.Math.BigInteger x) [0x00056] in <a3172f9b2f4a45059d64d4fd49357606>:0

Most of them were aborted during an operation of Org.BouncyCastle.Math.BigInteger, but the reason has not been identified yet. I suspect that the managed C++ DLL does not work properly, but I couldn't figure out why.

longfin commented 1 year ago

After digging, I found the main blocker.

https://github.com/planetarium/xunit-unity-runner/blob/30c8aa9605a6b0a818b582d93ef4ed500f382d70/Assets/Scripts/EntryPoint.cs#L439

Most of the weird xunit-unity test failures are not caused by tests hanging while they run; they happen because the app (i.e., /tmp/xur/StandaloneLinux64) doesn't shut down properly after the run. When I connected to CircleCI via SSH and executed the command repeatedly, I was able to confirm that the test process never ended.

root@5dd0eff5feda:/# /tmp/xur/StandaloneLinux64 --hang-seconds=60 --parallel=1 --report-xml-path=/mnt/ramdisk/.xur.xml --distributed=1/4 --distributed-seed=27659 --exclude-class=Libplanet.Tests.Blockchain.Renderers.AnonymousActionRendererTest --exclude-class=Libplanet.Tests.Blockchain.Renderers.AnonymousRendererTest --exclude-class=Libplanet.Tests.Blockchain.Renderers.DelayedActionRendererTest --exclude-class=Libplanet.Tests.Blockchain.Renderers.DelayedRendererTest --exclude-class=Libplanet.Tests.Blockchain.Renderers.LoggedActionRendererTest --exclude-class=Libplanet.Tests.Blockchain.Renderers.LoggedRendererTest --exclude-class=Libplanet.Tests.Blockchain.Renderers.NonblockRendererTest --exclude-class=Libplanet.Tests.Store.MemoryStoreTest --exclude-class=Libplanet.Tests.Blockchain.DefaultStoreBlockChainTest --exclude-class=Libplanet.Tests.Blockchain.BlockChainTest --exclude-class=Libplanet.Tests.Blockchain.Policies.BlockPolicyTest --exclude-class=Libplanet.Tests.Blockchain.Policies.VolatileStagePolicyTest --exclude-class=Libplanet.Tests.Store.BlockSetTest --exclude-class=Libplanet.Tests.Store.StoreTrackerTest --exclude-class=Libplanet.Tests.Blocks.PreEvaluationBlockTest --exclude-class=Libplanet.Tests.Blocks.PreEvaluationBlockHeaderTest --exclude-class=Libplanet.Tests.Blocks.BlockContentTest --exclude-class=Libplanet.Tests.Blocks.BlockMetadataExtensionsTest --exclude-class=Libplanet.Tests.Blocks.BlockMetadataTest /mnt/ramdisk/Libplanet.Analyzers.Tests/bin/Release/net47/Libplanet.Analyzers.Tests.dll /mnt/ramdisk/Libplanet.Node.Tests/bin/Release/net47/Libplanet.Node.Tests.dll /mnt/ramdisk/Libplanet.Tests/bin/Release/net47/Libplanet.Tests.dll

...
The result report is written: /mnt/ramdisk/.xur.xmlSetting up 18 worker threads for Enlighten.
  Thread -> id: 7f21b4c00700 -> priority: 1
  Thread -> id: 7f21b43ff700 -> priority: 1
  Thread -> id: 7f21b3bfe700 -> priority: 1
  Thread -> id: 7f21b33fd700 -> priority: 1
  Thread -> id: 7f21b2bfc700 -> priority: 1
  Thread -> id: 7f21b23fb700 -> priority: 1
  Thread -> id: 7f21b1bfa700 -> priority: 1
  Thread -> id: 7f21b13f9700 -> priority: 1
  Thread -> id: 7f21b0bf8700 -> priority: 1
  Thread -> id: 7f215ffff700 -> priority: 1
  Thread -> id: 7f215f7fe700 -> priority: 1
  Thread -> id: 7f215effd700 -> priority: 1
  Thread -> id: 7f215e7fc700 -> priority: 1
  Thread -> id: 7f215dffb700 -> priority: 1
  Thread -> id: 7f215d7fa700 -> priority: 1
  Thread -> id: 7f215cff9700 -> priority: 1
  Thread -> id: 7f2157fff700 -> priority: 1
  Thread -> id: 7f21577fe700 -> priority: 1
PASS Libplanet.Tests.Store.Trie.CacheableKeyValueStoreTest.GetMany: 0.0198872s
PASS Libplanet.Tests.Store.Trie.CacheableKeyValueStoreTest.DeleteMany: 0.0009756s
PASS Libplanet.Tests.Action.PolymorphicActionTest.DuplicateTypeId: 0.0075605s
PASS Libplanet.Tests.Action.PolymorphicActionTest.TextPlainValue: 0.0004354s
FAIL Libplanet.Tests.Action.PolymorphicActionTest.LoadPlainValue: 0.0029435s
  System.Threading.ThreadAbortException:

The line linked above is the one that prints "The result report is written: /mnt/ramdisk/.xur.xml" in this output.

longfin commented 1 year ago

Moreover, when I tried to quit the app by sending a signal with Ctrl+C, the following dump was produced.

SKIP Libplanet.Tests.Action.Sys.TransferTest.JsonSerialization: System.Text.Json 6.0.0+ does not work well with Unity/Mono.

^CCaught fatal signal - signo:11 code:1 errno:0 addr:0x154
Obtained 14 stack frames.
#0  0x007f2298076730 in funlockfile
#1  0x007f229913165e in SignalHandler(int, siginfo_t*, void*)
#2  0x007f2298076730 in funlockfile
#3  0x007f229807200c in pthread_cond_wait
#4  0x007f229792a7fb in mono_gchandle_free
#5  0x007f229793b02c in mono_thread_info_detach
#6  0x007f22978f3b3d in mono_security_set_mode
#7  0x007f22978f5ed3 in mono_thread_manage
#8  0x007f22977c3cbb in mono_jit_cleanup
#9  0x007f22990b970a in CleanupMono()
#10 0x007f2298f5a4bc in PlayerCleanup(bool)
#11 0x007f229913159a in PlayerMain(int, char**)
#12 0x007f2297ec809b in __libc_start_main
#13 0x0056042d7c3029 in _start
Segmentation fault (core dumped)

longfin commented 1 year ago

https://github.com/planetarium/xunit-unity-runner/blob/30c8aa9605a6b0a818b582d93ef4ed500f382d70/Assets/Scripts/EntryPoint.cs#L404

It seems that we should wait for `sink.Finished` 🤔 cc @dahlia
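To illustrate the idea, here is a minimal sketch of blocking on a completion event before the report is written. It is not the actual xunit-unity-runner code: `FakeSink`, `OnTestFinished`, `OnAssemblyFinished`, `Snapshot`, and `ReportAfterFinish` are hypothetical names made up for this example, and the only assumption carried over from the thread is that the sink exposes a `Finished` wait handle.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the runner's sink. Only the Finished event
// mirrors the `sink.Finished` member mentioned in the comment above.
sealed class FakeSink : IDisposable
{
    private readonly List<string> _results = new List<string>();
    private readonly object _lock = new object();

    public ManualResetEvent Finished { get; } = new ManualResetEvent(false);

    public void OnTestFinished(string line)
    {
        lock (_lock) { _results.Add(line); }
    }

    // Signals that no more test results will arrive.
    public void OnAssemblyFinished() => Finished.Set();

    public IReadOnlyList<string> Snapshot()
    {
        lock (_lock) { return _results.ToArray(); }
    }

    public void Dispose() => Finished.Dispose();
}

static class ReportAfterFinish
{
    // Returns the results that are visible at the moment the report would be written.
    public static IReadOnlyList<string> Run(int testCount)
    {
        using (var sink = new FakeSink())
        {
            // Simulated test execution on a background thread.
            Task worker = Task.Run(() =>
            {
                try
                {
                    for (int i = 0; i < testCount; i++)
                    {
                        Thread.Sleep(10);
                        sink.OnTestFinished("PASS Test" + i);
                    }
                }
                finally
                {
                    // Always signal, so the waiter below cannot block forever.
                    sink.OnAssemblyFinished();
                }
            });

            // Block until the sink signals completion; only then take the report.
            sink.Finished.WaitOne();
            IReadOnlyList<string> report = sink.Snapshot();
            worker.Wait();
            return report;
        }
    }

    public static void Main()
    {
        IReadOnlyList<string> report = Run(5);
        Console.WriteLine("The result report covers {0} tests.", report.Count);
    }
}
```

Without the `WaitOne()` call, the snapshot could be taken while the worker is still adding results, which would be consistent with the output above where PASS/FAIL lines keep appearing after the report message.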
