ARM 32-bit progress

manu-st commented 8 years ago

I'm opening an issue to track the progress of ARM 32-bit with respect to the regression tests.

Currently I'm getting the following results on Ubuntu 14.04 running on an NVIDIA Jetson TK1. Unfortunately, these are not the results you would get from a clean checkout: the only difference is that my tree contains a fix for issue dotnet/runtime#5422 (PR dotnet/coreclr#3879). Overall, though, it should provide a good baseline to monitor our progress:

=======================
     Test Results
=======================
# Tests Discovered : 6018
# Passed           : 4313
# Failed           : 1361
# Skipped          : 344
=======================

I've attached the XML results: coreclrtests.zip
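To follow progress across the runs posted in this thread, a summary block can be turned into numbers. The following is a minimal Python sketch, assuming the summary keeps the `# Name : count` layout shown above; `parse_summary` and `pass_rate` are hypothetical helper names, not part of the CoreCLR test scripts.

```python
import re

# Matches lines such as "# Passed           : 4313"; trailing text is ignored.
SUMMARY_LINE = re.compile(
    r"^#\s*(Tests Discovered|Passed|Failed|Skipped)\s*:\s*(\d+)", re.MULTILINE
)


def parse_summary(text):
    """Return a dict mapping each summary field to its integer count."""
    return {name: int(value) for name, value in SUMMARY_LINE.findall(text)}


def pass_rate(counts):
    """Fraction of discovered tests that passed (skipped tests count against it)."""
    return counts["Passed"] / counts["Tests Discovered"]


# The counts from the summary block above.
SAMPLE = """\
# Tests Discovered : 6018
# Passed           : 4313
# Failed           : 1361
# Skipped          : 344
"""

if __name__ == "__main__":
    counts = parse_summary(SAMPLE)
    print(counts, f"{pass_rate(counts):.1%}")
```

Dividing by the discovered count rather than by passed plus failed is a choice; excluding skipped tests would give a higher figure.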

leemgs commented 8 years ago

This is unfortunately not the results you get from a clean checkout,

Even with 1,361 failures, this result is meaningful to me. We have actually been running into a lot of "NYI" functions in CoreCLR. :( It means that we have a lot of work to do to stabilize CoreCLR.

Currently I'm getting the following results on Ubuntu 14.04 running on a NVidia Jetson TK1

BTW, Do you also focus on the Ubuntu/ARM32bit without the Ubuntu/ARM64bit? http://elinux.org/Jetson_TK1

CPU: NVIDIA "4-Plus-1" 2.32GHz ARM quad-core Cortex-A15 CPU with Cortex-A15 battery-saving shadow-core
GPU: NVIDIA Kepler "GK20a" GPU with 192 SM3.2 CUDA cores (upto 326 GFLOPS)
Memory: ~ 8GiB DDR3

CC: @myungjoo, @lemmaa. They are also interested in the regression tests of CoreCLR.

manu-st commented 8 years ago

@leemgs I'm not sure I understand what you mean by "Ubuntu/ARM32bit without the Ubuntu/ARM32bit".

leemgs commented 8 years ago

@manu-silicon Sorry, that was a typo. I have fixed it.

manu-st commented 8 years ago

@leemgs My board is the 32-bit variant, so it is just Ubuntu/ARM32bit.

https://en.wikipedia.org/wiki/Tegra#Tegra_K1

manu-st commented 8 years ago

With the fix for PR dotnet/coreclr#3981 the result is:

=======================
     Test Results
=======================
# Tests Discovered : 6019
# Passed           : 4793
# Failed           : 882
# Skipped          : 344
=======================

coreclrtests.zip

prajwal-aithal commented 8 years ago

@manu-silicon I am trying to do the same (run coreclr tests on ARM environment) by following this guide https://github.com/dotnet/coreclr/blob/master/Documentation/building/unix-test-instructions.md. According to this, building tests for ARM results in an error (as building the native code using cmake in Windows does not support arm architecture yet). I can make some changes to the buildtest script to build only the C# based tests for ARM architecture.

I wanted to ask how you are building the test binaries for ARM environment (using buildtest.cmd). Have you made some local changes (to build only the C# code and ignore the CMake build) or is there some other way to do it?

manu-st commented 8 years ago

@prajwal-aithal I followed the instructions, built the tests on Windows, and copied the tests folder to Linux ARM. On ARM, I built the native code for CoreFX (just the native code, as the rest won't build). I built CoreFX on Linux x64 and copied the assemblies over. I built mscorlib on Linux x64 and copied it over.

Once I had all the paths set up properly, I launched the command to run the tests.

prajwal-aithal commented 8 years ago

@manu-silicon Oh, ok. I will try this then. Thank you :)

manu-st commented 8 years ago

Today's result with FEATURE_STUBS_AS_IL disabled:

=======================
     Test Results
=======================
# Tests Discovered : 6019
# Passed           : 4836
# Failed           : 839
# Skipped          : 344
=======================

coreclrtests.zip

jyoungyun commented 8 years ago

Target: Raspberry Pi2
CoreCLR commit id: ff26d6801b3ce0dec5918a5ad0d3ab90f9656e28 (around April 9)

=======================
     Test Results
=======================
# Tests Discovered : 7422
# Passed           : 6074
# Failed           : 1017
# Skipped          : 331
=======================

I'm curious why my discovered-test count differs from everyone else's.

CoreClr_UnitTest_Results_160412_01.zip

myungjoo commented 8 years ago

@jyoungyun Next time, please attach the result XML file as well.

manu-st commented 8 years ago

@jyoungyun The difference in the number of tests is related to how I first built the tests on Windows and then copied them over to the Linux ARM device. As @myungjoo says, having the XML file would enable us to compare the runs and see what the differences are.
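Comparing two result files could look roughly like the Python sketch below. It assumes the XML contains xUnit-style `<test name="..." result="...">` elements, which should be checked against the attached files; `load_results`, `compare_runs`, and the test names in the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def load_results(xml_text):
    """Map test name -> result, assuming <test name="..." result="..."> elements."""
    root = ET.fromstring(xml_text)
    return {t.get("name"): t.get("result") for t in root.iter("test")}


def compare_runs(a, b):
    """Return (only_in_a, only_in_b, changed) for two name -> result dicts."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(n for n in set(a) & set(b) if a[n] != b[n])
    return only_a, only_b, changed


# Hypothetical sample data, only to show the shape of the comparison.
RUN_A = """<assemblies><assembly><collection>
  <test name="t1.sh" result="Pass" />
  <test name="t2.sh" result="Fail" />
</collection></assembly></assemblies>"""

RUN_B = """<assemblies><assembly><collection>
  <test name="t2.sh" result="Pass" />
  <test name="t3.sh" result="Skip" />
</collection></assembly></assemblies>"""

if __name__ == "__main__":
    print(compare_runs(load_results(RUN_A), load_results(RUN_B)))
```

The first two lists would explain a difference in the discovered count, while the third shows tests whose outcome differs between the two machines.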

jyoungyun commented 8 years ago

@manu-silicon I attached the XML file. Thank you.

jyoungyun commented 8 years ago

Target: Raspberry Pi2
CoreCLR commit id: 31fada1 + dotnet/coreclr#4460 + dotnet/coreclr#4503

==========================
      Test Results
==========================
# Tests Discovered : 7421
# Passed           : 6547
# Failed           : 543
# Skipped          : 331
==========================

CoreClr_UnitTest_Results_160427.zip

masonwheeler commented 8 years ago

Wow, so those two fixes cleared up about half of the failing tests? Nice!

jyoungyun commented 8 years ago

@masonwheeler Yes! The dotnet/coreclr#4460 patch cut the number of failing tests roughly in half. :)

jyoungyun commented 8 years ago

Target: Raspberry Pi2
CoreCLR commit id: 18268be + dotnet/coreclr#4503 + dotnet/coreclr#4581

==========================
      Test Results
==========================
# Tests Discovered : 6018
# Passed           : 5295
# Failed           : 395
# Skipped          : 328
==========================

I also found the reason why my discovered-test count was different from everyone else's: my test directory structure was wrong. Now it's the same as the others. :)

CoreClr_UnitTest_Results_160428.zip

leemgs commented 8 years ago

@jyoungyun, it's amazing. :) BTW, is https://github.com/dotnet/coreclr/commit/18268beae931cbd4d110959ea53d785b193eceb1 related to the reduced number of failures? It seems to me that it is not related to the unit test failure count for Linux/ARM.

manu-st commented 8 years ago

@jyoungyun If possible, it would be nice to know how many tests each bug fix addresses.

manu-st commented 8 years ago

@jyoungyun Out of curiosity, how long does it take you to run all the tests? For me, on the Jetson TK1, it takes roughly half a day.

myungjoo commented 8 years ago

@manu-silicon We will soon be pushing out patches that speed up the execution of corerun on ARM. @leemgs is preparing a PR for that (a release build bugfix).

leemgs commented 8 years ago

For reference, @manu-silicon has shared the hardware specification that he used to get the unit test results. Here are the details of his development board.

http://elinux.org/Jetson_TK1 (using Quad-core Cortex-A15 CPU and 2GiB RAM).

jyoungyun commented 8 years ago

@leemgs Oh, the number (18268be) is the base commit. If someone wants to reproduce the result, the base commit info will be helpful. I think the reduction in failing test cases has more to do with the dotnet/coreclr#4460 patch.

@manu-silicon It takes a long time to run the Unix tests on the Raspberry Pi2. In my case, about 12 hours is the usual time required. But when I use a release-mode binary, it takes only 4 hours! I hope that @leemgs's patch will be ready soon.

hqueue commented 8 years ago

With the latest commit (92d709193a8ee3d8da3519811912ec0f7a552993), I got much better results. I was investigating several failures, and many of them are fixed in the latest master branch. Can anybody run the whole test suite again?

manu-st commented 8 years ago

@hqueue I can do it on Friday.

manu-st commented 8 years ago

Tried to run the test suite on Friday, but it got stuck on 2 tests that I had to manually kill:

FAILED   - JIT/Regression/CLR-x86-JIT/V2.0-Beta2/b425314/b425314/b425314.sh
               BEGIN EXECUTION
               /home/ubuntu/local/Microsoft/tests/Tests/coreoverlay/corerun b425314.exe
               ./b425314.sh: line 62: 32428 Killed                  $_DebuggerFullPath "$CORE_ROOT/corerun" b425314.exe $CLRTestExecutionArguments $Host_Args
               Expected: 100
               Actual: 137
               END EXECUTION - FAILED

FAILED   - JIT/Regression/CLR-x86-JIT/V2.0-Beta2/b426654/b426654/b426654.sh
               BEGIN EXECUTION
               /home/ubuntu/local/Microsoft/tests/Tests/coreoverlay/corerun b426654.exe
               ./b426654.sh: line 62: 10285 Killed                  $_DebuggerFullPath "$CORE_ROOT/corerun" b426654.exe $CLRTestExecutionArguments $Host_Args
               Expected: 100
               Actual: 137
               END EXECUTION - FAILED
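The `Actual: 137` in both logs is what a shell reports when a process is terminated by a signal: 128 plus the signal number, and signal 9 is SIGKILL, which matches the manual kill. A small Python sketch for decoding such statuses follows; `describe_exit_status` is a hypothetical helper.

```python
import signal


def describe_exit_status(status):
    """Interpret a shell-style exit status in the range 0-255."""
    if status > 128:
        signum = status - 128  # shells report 128 + N for death by signal N
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # This platform has no named signal with that number.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


if __name__ == "__main__":
    for status in (100, 137):
        print(status, "->", describe_exit_status(status))
```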

manu-st commented 8 years ago

The run finally completed. Results on Jetson TK1 commit 3ddd9c43821bf617621f87d8d1519229e7d3b789

=======================
     Test Results
=======================
# Tests Discovered : 6019
# Passed           : 5478
# Failed           : 234
# Skipped          : 307
=======================

coreclrtests_2016_05_06.zip

jyoungyun commented 8 years ago

The latest unit test results!

Target: Raspberry Pi2
CoreCLR commit id: 6b92cff6

==========================
      Test Results
==========================
# Tests Discovered : 6530
# Passed           : 5981
# Failed           : 215
# Skipped          : 334
==========================

CoreClr_UnitTest_Results_160510.zip

@manu-silicon When I run the Unix tests, some test cases (tc) take a very long time. And the blocked tc is changed. Have you seen the same thing?

myungjoo commented 8 years ago

Note that about 120 of the failures are due to a regression that will hopefully be fixed easily by dotnet/coreclr#4888. We may see fewer than 100 failures soon.

manu-st commented 8 years ago

@jyoungyun I'm not sure what you mean by "tc is changed". But as I said just before posting my results, I had 2 tests that were hanging, which prevented the rest of the test cases from running. I had to kill them manually for the run to complete.

jyoungyun commented 8 years ago

@manu-silicon Oh, I mean that I got stuck on different tcs when I tried to run the Unix tests. Sometimes b425314 was hanging, while other times it ran fine and a different tc got stuck instead. So I had to kill them manually every time. I wonder why the hanging tc is different every time, and I want to know whether you have seen the same situation. If the stuck tc is always the same, we can modify tests/testsFailingOutsideWindows.txt to get to completion without killing anything manually, even though that's only a temporary workaround.
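When the hanging tc changes from run to run, a per-test timeout is one way to avoid killing tests by hand. The Python sketch below is only an illustration, not how the CoreCLR test runner works: `run_with_timeout` is a hypothetical helper, and the expected exit code of 100 is taken from the `Expected: 100` lines in the logs earlier in this thread.

```python
import subprocess
import sys

EXPECTED_EXIT_CODE = 100  # from the "Expected: 100" lines in the test logs


def run_with_timeout(command, timeout_seconds):
    """Run one test command and classify it as 'passed', 'failed' or 'timeout'."""
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising this exception.
        return "timeout"
    return "passed" if completed.returncode == EXPECTED_EXIT_CODE else "failed"


if __name__ == "__main__":
    # Stand-in commands; a real run would pass the path of a test's .sh wrapper.
    quick = [sys.executable, "-c", "import sys; sys.exit(100)"]
    stuck = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_timeout(quick, 30))
    print(run_with_timeout(stuck, 2))
```

One caveat: the timeout only kills the process that was started directly. A shell wrapper that launches corerun as a child may leave that child running, so a real harness would likely need to kill the whole process group.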

jyoungyun commented 8 years ago

It's amazing. The failed tc count is finally in double digits! :+1:

Target: Raspberry Pi2
CoreCLR commit id: 9eba7d3

==========================
      Test Results
==========================
# Tests Discovered : 6530
# Passed           : 6137
# Failed           : 98
# Skipped          : 295
==========================

benpye commented 8 years ago

Wow, it's amazing to see what I started last year get this far. Sorry I couldn't bring it any further but it definitely seems like ARM could be a viable target for .NET Core soon.

myungjoo commented 8 years ago

It appears that we may even see it drop to around 40 today (or fewer than 40!).

jyoungyun commented 8 years ago

Today's result!

Target: Raspberry Pi2
CoreCLR commit id: 1c2d8a6
tc commit id: 171f7133287ebbee24d6a7f193b13a9f959e9297

==========================
      Test Results
==========================
# Tests Discovered : 6966
# Passed           : 6626
# Failed           : 49
# Skipped          : 291
==========================

janvorli commented 8 years ago

It's great to see such progress here!

sergiy-k commented 8 years ago

Very nice! Great progress! Do you have a next target once all of these Pri0 tests pass? Is it the Pri1 tests (about 3000 additional tests)?

leemgs commented 8 years ago

To All,

Here is another set of notes. I have summarized the CoreCLR unit-test results from a Release build on a Samsung ARM Chromebook. Note that the results below were produced with a Release build.

Before doing the 3 steps below

=======================
     Test Results
=======================
# CoreCLR Binary Folder: /unit-test/bin.coreclr.20160517/Product/Linux.arm.Release
# Tests Discovered : 6966
# Passed           : 6282
# Failed           : 420 ★
# Skipped          : 264
=======================
* Priority 0 (default)
* Target: Samsung ARM Chromebook
* Run Time: 89m32.452s (13h in debug-build)
* CoreCLR Commit No: May-15-2016

coreclrtests.release.failur-420.20160515.zip

We then got down to 67 failures after performing the 3 steps below. In this case, we set the priority to 666 (instead of priority 0) to increase the number of test cases.

1) Resync all components with release binaries (without the debug-build mode)
2) Add all the managed files (Linux, Unix, AnyOS)
3) UNW_ARM_UNWIND_METHOD Back: default (source: https://wiki.linaro.org/KenWerner/Sandbox/libunwind#overhead_of_the_ARM_specific_unwind-tables)

After doing the 3 steps above

=======================
     Test Results
=======================
# CoreCLR Binary Folder: /unit-test/bin.coreclr.20160517/Product/Linux.arm.Release
# Tests Discovered : 9870
# Passed           : 9458
# Failed           : 67  ★
# Skipped          : 345
=======================
* Priority 666
* Target: Samsung ARM Chromebook
* Run Time: 128m55.349s (15h in debug-build)
* CoreCLR Commit No: May-15-2016

coreclrtests.chromebook.release.failure-67.20160518.zip
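To put the run times above side by side, the `time`-style values can be converted to seconds and compared with the debug-build figures. This is a small Python sketch with hypothetical helper names; because the debug times are quoted only to the hour, the ratios are approximate (roughly 8.7x for the priority 0 run and 7.0x for the priority 666 run).

```python
import re


def time_to_seconds(value):
    """Convert a string such as '89m32.452s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def speedup(debug_hours, release_time):
    """Ratio of debug-build run time (in hours) to release-build run time."""
    return debug_hours * 3600 / time_to_seconds(release_time)


if __name__ == "__main__":
    print(f"priority 0:   {speedup(13, '89m32.452s'):.1f}x")
    print(f"priority 666: {speedup(15, '128m55.349s'):.1f}x")
```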

jyoungyun commented 8 years ago

Results from two days ago. I ran the unit tests including test priorities 0 and 1, but the failure count is pretty much the same as before. :)

Target: Raspberry Pi2
CoreCLR commit id: e78338e (Date: Tue May 17 17:19:37 2016 -0700)
tc commit id: 4941764c (Date: Mon May 16 22:46:36 2016 -0700)

==========================
      Test Results (Debug)
==========================
# Tests Discovered : 9870
# Passed           : 9423
# Failed           : 102
# Skipped          : 345
==========================

mkborg commented 8 years ago

@leemgs

2) Adding the all managed files (Linux, Unix, AnyOS)

Where did you put those managed stuff? How did you run tests using those files?

leemgs commented 8 years ago

Where did you put those managed stuff?

@mkborg, I got the managed files by building the CoreFX source on Ubuntu 14.04 x64.

How did you run tests using those files?

Before: --coreFxBinDir="../corefx.bin/AnyOS.AnyCPU.Release"

After: --coreFxBinDir="../corefx.bin/Linux.AnyCPU.Release;../corefx.bin/Unix.AnyCPU.Release;../corefx.bin/AnyOS.AnyCPU.Release"
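The value is simply the managed output directories joined with semicolons. If the list is assembled by a script, it could be built as in the Python sketch below; `join_bin_dirs` is a hypothetical helper, and the directory names are the ones from this comment.

```python
def join_bin_dirs(directories):
    """Join directories into the semicolon-separated value shown above."""
    for directory in directories:
        if ";" in directory:
            raise ValueError(f"directory must not contain ';': {directory!r}")
    return ";".join(directories)


# Order as given in the "After" line above.
COREFX_DIRS = [
    "../corefx.bin/Linux.AnyCPU.Release",
    "../corefx.bin/Unix.AnyCPU.Release",
    "../corefx.bin/AnyOS.AnyCPU.Release",
]

if __name__ == "__main__":
    print(f'--coreFxBinDir="{join_bin_dirs(COREFX_DIRS)}"')
```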

mkborg commented 8 years ago

I will try it.

leemgs commented 8 years ago

3) UNW_ARM_UNWIND_METHOD Back: default

I will try it.

@mkborg, if you can, I recommend that you also try the different ARM unwind methods (e.g., UNW_ARM_UNWIND_METHOD=1|2|...|255).

https://wiki.linaro.org/KenWerner/Sandbox/libunwind
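Trying several unwind methods could be scripted by setting the environment variable for each test run. The Python sketch below only builds the environment. The bit values are my understanding of libunwind's ARM header (DWARF = 1, frame = 2, exidx = 4) and should be verified against the libunwind version in use; `unwind_env` is a hypothetical helper.

```python
import os

# Assumed bit values; check libunwind-arm.h for the installed libunwind.
UNWIND_METHODS = {"dwarf": 0x01, "frame": 0x02, "exidx": 0x04}


def unwind_env(*methods, base_env=None):
    """Return a copy of the environment with UNW_ARM_UNWIND_METHOD set."""
    mask = 0
    for method in methods:
        mask |= UNWIND_METHODS[method]
    env = dict(os.environ if base_env is None else base_env)
    env["UNW_ARM_UNWIND_METHOD"] = str(mask)
    return env


if __name__ == "__main__":
    for combo in (("dwarf",), ("exidx",), ("dwarf", "exidx")):
        env = unwind_env(*combo, base_env={})
        print(combo, "->", env["UNW_ARM_UNWIND_METHOD"])
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when launching a test, so each run uses a different unwind method without changing the shell environment.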

myungjoo commented 8 years ago

dotnet/coreclr#5087 has been tested and resolves 5 failing ARM/Linux test cases (PR0).

khionu commented 8 years ago

Just wanted to say thanks to everyone contributing their time to this issue <3

Toxantron commented 8 years ago

Hey, I just set up CoreCLR on my ARM Chromebook with TK1. Is the patch from the documentation still needed?

leemgs commented 8 years ago

@Toxantron, you can easily set up the coreclr/corefx environment on your own ARM Chromebook by following the coreclr documentation. From my experience, I recommend that you use an available USB stick for the Ubuntu/ARM 14.04-based dotnet core setup.

jyoungyun commented 8 years ago

Here are the recent unit test results on Raspberry Pi2. The CoreCLR commit is 38a89155cb2c53a9603d2234731525abdd974014 and the TC commit is 38a89155cb2c53a9603d2234731525abdd974014, with priority 666. I added the PAL test results too.

==========================
      Test Results
==========================
# CoreCLR Bin Dir  : /home/dotnet/tmp/coreclr/bin/Product/Linux.arm.Debug
# Tests Discovered : 9853
# Passed           : 9448
# Failed           : 53
# Skipped          : 352
==========================
1493 minutes and 27 seconds taken to run CoreCLR tests.

==========================
PAL Test Results:
  Passed: 807
  Failed: 1
==========================

myungjoo commented 8 years ago

We are observing about 20 fewer failures on different ARM devices. We will also need to look at what is causing such differences.

mkborg commented 8 years ago

My recent tests have many failures:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : /coreclr-testing/20160706/odroid-2.arm-softfp/odroid-2.coreclr_46d3809.corefx_860d28c.arm-softfp/CORECLR_NATIVE/coreclr/bin/Product/Linux.arm-softfp.Debug
# Tests Discovered : 9870
# Passed           : 9124
# Failed           : 394
# Skipped          : 352
=======================

Is it a local issue on my side or an actual regression?

@jyoungyun What are your current test results?

dotnet / runtime

ARM 32-bit progress #5460