dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

x86/Linux progress #7335

Open parjong opened 7 years ago

parjong commented 7 years ago

This issue is for tracking x86/Linux progress with respect to the regression tests.

Here is the current status in an Ubuntu 14.04 Docker container (hosted on Ubuntu 16.04 x64), along with the full result:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : 
# Tests Discovered : 7027
# Passed           : 5592
# Failed           : 1156
# Skipped          : 279
=======================

The above result comes from 63607d8e657 with https://github.com/parjong/coreclr/tree/fix/x86_4byte_alignment and dotnet/coreclr#9261.

parjong commented 7 years ago

CC @seanshpark @wateret

parjong commented 7 years ago

63607d8e657 (with alignment workaround) shows the following result:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7027
# Passed           : 5892
# Failed           : 856
# Skipped          : 279
=======================
53 minutes and 59 seconds taken to run CoreCLR tests.

dotnet/coreclr#9121 seems to resolve GC and JIT related failures.

parjong commented 7 years ago

Here is the result from 2ecadf5d1ce (without any additional patch):

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7027
# Passed           : 5874
# Failed           : 874
# Skipped          : 279
=======================
65 minutes and 29 seconds taken to run CoreCLR tests.

The recently merged dotnet/coreclr#8849 eliminates the need for the alignment workaround. There is a small increase in the number of failed tests, and most of them are caused by stack smashing. An incorrect funclet prolog/epilog may be the cause of this stack smashing.
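
Comparing runs like the ones above by eye gets tedious, so here is a minimal sketch for pulling the counts out of a summary block. It assumes the summary has been saved in exactly the format shown in this thread; `summarize_results` is a hypothetical helper name and is not part of the CoreCLR test scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CoreCLR): read a "Test Results" summary
# on stdin and print the counts plus the pass rate among tests that ran.
summarize_results() {
  awk -F':' '
    /^# Passed/  { passed  = $2 + 0 }
    /^# Failed/  { failed  = $2 + 0 }
    /^# Skipped/ { skipped = $2 + 0 }
    END {
      ran = passed + failed
      if (ran == 0) { print "no tests ran"; exit 1 }
      printf "passed=%d failed=%d skipped=%d pass_rate=%.2f%%\n",
             passed, failed, skipped, 100 * passed / ran
    }'
}
```

It could be used as `summarize_results < results.txt`, where `results.txt` is an assumed file holding one summary block. Skipped tests are left out of the pass rate on purpose, since they never ran.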

janvorli commented 7 years ago

@parjong, @seanshpark, @wateret - I was trying to build and run dotnet for x86 Linux today, using the current state of the master branch, but I was unable to make it work. So I was wondering if you could share a list of steps to successfully build and gather all parts of the dotnet core the way you do it. I have tried to build the coreclr and corefx native binaries both using the cross build and a build inside of a docker container with x86 Ubuntu 14.04 (running the container on x64 Ubuntu 14.04). I have built the corefx managed assemblies on my x64 Ubuntu. When I try to run a simple hello-world-like app inside of the docker container, I get a strange assertion in the native runtime even before coreclr_initialize completes.

parjong commented 7 years ago

@janvorli It is a bit strange. I have used the same environment (docker container from docker image imported from x86/rootfs in Core CLR). Could you let me know the assert failure that you got?

parjong commented 7 years ago

Also, I am currently using a somewhat old Core FX (although I am not sure whether that is relevant).

janvorli commented 7 years ago

The assert and call stack are below. I guess it has something to do with how I've collected all the files. I have created the docker image myself from a vanilla Ubuntu 14.04 x86 image and installed all the dependencies, clang, etc. Could you please write down a step-by-step list of how to get a working environment from scratch?

Assert failure(PID 59096 [0x0000e6d8], Thread: 59096 [0xe6d8]): Consistency check failed: Illegal null pointerFAILED: ok
         FAILED: CheckPointer(pMT)
                /home/janvorli/git/coreclr/src/vm/appdomain.cpp, line: 13816
    File: /home/janvorli/git/coreclr/src/inc/check.h Line: 373
    Image: /home/janvorli/dotnet/test/corerun

#0  DBG_DebugBreak () at debugbreak.S:114
#1  0xf7899a88 in DebugBreak () at /home/janvorli/git/coreclr/src/pal/src/debug/debug.cpp:404
#2  0xf6ce977c in CHECK::Setup (this=0xffffbfa4, message=0xf79daf30 "Illegal null pointer", condition=0xf7a6ba26 "ok",
    file=0xf79daf45 "/home/janvorli/git/coreclr/src/inc/check.h", line=373) at /home/janvorli/git/coreclr/src/utilcode/check.cpp:218
#3  0xf6e3e308 in CheckPointer<MethodTable> (o=0x0, ok=NULL_NOT_OK) at /home/janvorli/git/coreclr/src/inc/check.h:373
#4  0xf7226147 in BaseDomain::LookupType (this=0xf7c9f940 <g_pSharedDomainMemory>, id=60) at /home/janvorli/git/coreclr/src/vm/appdomain.cpp:13816
#5  0xf72260f3 in BaseDomain::LookupType (this=0x8077208, id=60) at /home/janvorli/git/coreclr/src/vm/appdomain.cpp:13813
#6  0xf6f4d993 in VSD_ResolveWorker (pTransitionBlock=0xffffc324, siteAddrForRegisterIndirect=354570792, token=3979264)
    at /home/janvorli/git/coreclr/src/vm/virtualcallstub.cpp:1579
#7  0xf71f349a in ResolveWorkerAsmStub () at asmhelpers.S:1041
#8  0xffffc324 in ?? ()
#9  0xf4fd0222 in ?? ()
#10 0xf633b14c in ?? ()
#11 0xf71f31ab in CallDescrWorkerInternal () at asmhelpers.S:442
#12 0xf6f85d08 in CallDescrWorker (pCallDescrData=0xffffcba8) at /home/janvorli/git/coreclr/src/vm/callhelpers.cpp:146
#13 0xf6f85ab9 in CallDescrWorkerWithHandler (pCallDescrData=0xffffcba8, fCriticalCall=0) at /home/janvorli/git/coreclr/src/vm/callhelpers.cpp:89
#14 0xf6f87a2b in MethodDescCallSite::CallTargetWorker (this=0xffffcdd8, pArguments=0xffffcf00, pReturnValue=0xffffcc18, cbReturnValue=8)
    at /home/janvorli/git/coreclr/src/vm/callhelpers.cpp:656
#15 0xf6d56064 in MethodDescCallSite::Call_RetOBJECTREF (this=0xffffcdd8, pArguments=0xffffcf00) at /home/janvorli/git/coreclr/src/vm/callhelpers.h:436
#16 0xf7205fd4 in AppDomain::DoSetup (this=0x8077208, setupInfo=0xffffd2f8) at /home/janvorli/git/coreclr/src/vm/appdomain.cpp:5735
#17 0xf6d4a3cc in CorHost2::_CreateAppDomain (this=0x805e8d8, wszFriendlyName=0x805e910 u"unixcorerun", dwFlags=336, wszAppDomainManagerAssemblyName=0x0,
    wszAppDomainManagerTypeName=0x0, nProperties=5, pPropertyNames=0x805e938, pPropertyValues=0x805e958, pAppDomainID=0xffffd650)
    at /home/janvorli/git/coreclr/src/vm/corhost.cpp:1717
#18 0xf6d4d9ac in CorHost2::CreateAppDomainWithManager (this=0x805e8d8, wszFriendlyName=0x805e910 u"unixcorerun", dwFlags=336,
    wszAppDomainManagerAssemblyName=0x0, wszAppDomainManagerTypeName=0x0, nProperties=5, pPropertyNames=0x805e938, pPropertyValues=0x805e958,
    pAppDomainID=0xffffd650) at /home/janvorli/git/coreclr/src/vm/corhost.cpp:1890
#19 0xf6cd1623 in coreclr_initialize (exePath=0x8052014 "/home/janvorli/dotnet/test/corerun", appDomainFriendlyName=0x804e529 "unixcorerun", propertyCount=5,
    propertyKeys=0xffffd694, propertyValues=0xffffd680, hostHandle=0xffffd654, domainId=0xffffd650)
    at /home/janvorli/git/coreclr/src/dlls/mscoree/unixinterface.cpp:219
#20 0x0804c51f in ExecuteManagedAssembly (currentExeAbsolutePath=0x8052014 "/home/janvorli/dotnet/test/corerun",
    clrFilesAbsolutePath=0x8052084 "/home/janvorli/dotnet/test", managedAssemblyAbsolutePath=0x805204c "/home/janvorli/dotnet/test/nullref.exe",
    managedAssemblyArgc=0, managedAssemblyArgv=0x0) at /home/janvorli/git/coreclr/src/coreclr/hosts/unixcoreruncommon/coreruncommon.cpp:404
#21 0x0804b228 in corerun (argc=2, argv=0xffffd864) at /home/janvorli/git/coreclr/src/coreclr/hosts/unixcorerun/corerun.cpp:149
#22 0x0804b35a in main (argc=2, argv=0xffffd864) at /home/janvorli/git/coreclr/src/coreclr/hosts/unixcorerun/corerun.cpp:161

parjong commented 7 years ago

I'll first check whether master works for me.

parjong commented 7 years ago

To collect the Core FX managed DLLs, I used a somewhat old collection script of the following form (OS is Linux):

# Expects OS, SRC_DIR, PRESET and MANAGED_BIN_FILE_INTO to be set by the caller.
MANAGED_TAGS=()
MANAGED_TAGS+=("AnyOS.AnyCPU")
MANAGED_TAGS+=("Unix.AnyCPU")
MANAGED_TAGS+=("${OS}.AnyCPU")

for MANAGED_TAG in "${MANAGED_TAGS[@]}"; do
  REPO="${SRC_DIR}/${MANAGED_TAG}.${PRESET}"

  # List the names of directories that contain a DLL, skipping tests, tools, net* and aot outputs.
  for BASE in $(find "${REPO}" -iname '*.dll' \! -iwholename '*test*' \! -iwholename '*/ToolRuntime/*' \! -iwholename '*/RemoteExecutorConsoleApp/*' \! -iwholename '*/net*' \! -iwholename '*aot*' -exec dirname {} \; | uniq | xargs -I {} basename {}); do
    PDB_FILE="${REPO}/${BASE}/${BASE}.pdb"
    DLL_FILE="${REPO}/${BASE}/${BASE}.dll"

    # Copy only the DLL that is named after its directory.
    if [[ -f "${DLL_FILE}" ]]; then
      cp -t "${MANAGED_BIN_FILE_INTO}" "${DLL_FILE}"
    fi
  done
done

parjong commented 7 years ago

Here are the brief steps that I am currently using:

janvorli commented 7 years ago

@parjong thank you. These match the steps I have done. I guess I'll try to create the docker image from the rootfs as you've said you did to see if it makes any difference.

parjong commented 7 years ago

@janvorli Please let me know if there is any problem. I checked the current tip (7f3a87ae63b88327a3dc2b830d52f49a480509e0) and it works for me.

janvorli commented 7 years ago

@parjong it is weird. I have just deleted the whole bin folder in coreclr, rebuilt the sources one more time and now it works. I am sorry for wasting your time. Btw, I have not specified the "-DSKIP_LLDBPLUGIN=true" and the libsosplugin.so was also built fine.
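
Since a stale bin folder turned out to be the culprit here, a clean rebuild is a cheap first thing to try. The following is only a sketch: `clean_rebuild` is a hypothetical helper name, and the `build.sh` arguments are the ones quoted elsewhere in this thread (minus the optional cmakeargs workaround), so they may not match the current master.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CoreCLR): delete the build output of a
# coreclr checkout and rebuild it for x86 from scratch.
clean_rebuild() {
  local repo_dir="${1:-}"
  # Refuse to run unless the path looks like a checkout with a build script.
  if [[ -z "${repo_dir}" || ! -f "${repo_dir}/build.sh" ]]; then
    echo "usage: clean_rebuild <path-to-coreclr-checkout>" >&2
    return 1
  fi
  # Remove stale artifacts from earlier builds.
  rm -rf "${repo_dir}/bin"
  # Arguments taken from this thread; adjust them to your setup.
  (cd "${repo_dir}" && ./build.sh cross x86 skipnuget debug clang3.8)
}
```

For example, `clean_rebuild ~/git/coreclr`, where the path is an assumption about where the checkout lives.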

parjong commented 7 years ago

@janvorli Thank you for checking :+1:

FYI, -DSKIP_LLDBPLUGIN=true was just a workaround during bring up, but remains unchanged as lldb-plugin is not used currently.

parjong commented 7 years ago

Here is the result (XML) from b957f8c3e3b:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : 
# Tests Discovered : 7027
# Passed           : 5913
# Failed           : 833
# Skipped          : 281
=======================
parjong commented 7 years ago

dotnet/coreclr#9601 (although it is under review) seems to make huge progress.

Here is the result XML from b957f8c3e3b with dotnet/coreclr#9601:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7027
# Passed           : 6526
# Failed           : 220
# Skipped          : 281
=======================
parjong commented 7 years ago

Here is the result from 6092f90e5a0 (log and XML):

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7027
# Passed           : 6614
# Failed           : 131
# Skipped          : 282
=======================

dotnet/coreclr#9601 seems to make huge progress (more than expected)!!!

parjong commented 7 years ago

dc3626d4e69 finally achieves < 100 failures:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7027
# Passed           : 6657
# Failed           : 88
# Skipped          : 282
=======================

Here are log and XML.

janvorli commented 7 years ago

@parjong great! Thank you for the update. CC: @gkhanna79

parjong commented 7 years ago

Here is the recent result (XML) from cf7d6d92484:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7027
# Passed           : 6707
# Failed           : 51
# Skipped          : 269
=======================
BruceForstall commented 7 years ago

@parjong How's it look now?

Should we create a Linux/x86 GitHub project (https://github.com/dotnet/coreclr/projects)? It's a relatively new GitHub feature -- not sure how useful it really is.

parjong commented 7 years ago

Here is the result from 2401b6ed082 (full log):

=======================
     Test Results
=======================
# CoreCLR Bin Dir  : 
# Tests Discovered : 7061
# Passed           : 6319
# Failed           : 18
# Skipped          : 724
=======================

dotnet/coreclr#10538 resolves this failure, but is not merged in 2401b6ed082:

dotnet/coreclr#10188 addresses the following two failures:

dotnet/coreclr#10410 seems to address the following two failures:

The following failures seem to be related to incorrect stack unwinding on esp-frame (dotnet/coreclr#10025) or helper-frame (dotnet/coreclr#9272). I hope that dotnet/coreclr#10012 addresses these failures:

dotnet/coreclr#10139 is related to these two failures:

The following failure seems to be related to some GC issue, but I am not sure yet:

parjong commented 7 years ago

Here is the result from 1c2ee08a4bc:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7061
# Passed           : 6320
# Failed           : 17
# Skipped          : 724
=======================

As expected, Loader.classloader.generics.Instantiation.Recursion.genrecur.genrecur failure is gone:

Failed test:
  JIT.IL_Conformance.Old.Conformance_Base.conv_ovf_r8_i.conv_ovf_r8_i
  JIT.IL_Conformance.Old.Conformance_Base.conv_ovf_r8_i4.conv_ovf_r8_i4
  JIT.Methodical.Arrays.misc._il_relinitializearray._il_relinitializearray
  JIT.Methodical.eh.nested.nonlocalexit.throwinfinallyrecursive_20_d.throwinfinallyrecursive_20_d
  JIT.Methodical.eh.nested.nonlocalexit.throwinfinallyrecursive_20_r.throwinfinallyrecursive_20_r
  JIT.Performance.CodeQuality.Serialization.Deserialize.Deserialize
  JIT.Performance.CodeQuality.Serialization.Serialize.Serialize
  JIT.Performance.CodeQuality.Span.SpanBench.SpanBench
  JIT.Regression.CLR-x86-JIT.V1-M12-Beta2.b52578.b52578.b52578
  JIT.Regression.CLR-x86-JIT.V1-M12-Beta2.b52840.b52840.b52840
  JIT.Regression.CLR-x86-JIT.V1.1-M1-Beta1.b143840.b143840.b143840
  JIT.Regression.VS-ia64-JIT.V1.2-M01.b12390.b12390.b12390
  JIT.jit64.rtchecks.overflow.overflow01_div.overflow01_div
  JIT.jit64.rtchecks.overflow.overflow02_div.overflow02_div
  JIT.jit64.rtchecks.overflow.overflow04_div.overflow04_div
  readytorun.mainv1.mainv1
  readytorun.mainv2.mainv2
parjong commented 7 years ago

fa7293aa828 finally resolves most of the unit test failures; only 2 readytorun tests still fail:

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7068
# Passed           : 6343
# Failed           : 2
# Skipped          : 723
=======================

Several tests are excluded from the above result.

4 tests based on Windows-specific struct layout rule (#10340)

4 tests related to tailcall optimization:

1 test that takes too much time (about 4 hours?):

4 tests incompatible with remote testing:

1 test that was disabled due to hang before (now it works, but I forgot to enable it)

parjong commented 7 years ago

bece89ead08 finally shows 0 failures (although some tests are excluded):

=======================
     Test Results
=======================
# CoreCLR Bin Dir  :
# Tests Discovered : 7068
# Passed           : 6345
# Failed           : 0
# Skipped          : 723
=======================
janvorli commented 7 years ago

@parjong congratulations on the great progress! Maybe it is time to start running the Pri 1 tests too (that would add about 3000 more tests).

parjong commented 7 years ago

@janvorli We already did. Here is the result from a762db4d403 (full log):

=======================
     Test Results
=======================
# Tests Discovered : 11346
# Passed           : 10528
# Failed           : 7
# Skipped          : 811
=======================

Here is the list of failed tests:

dotnet/coreclr#10340 seems to cause the following failures:

dotnet/coreclr#10888 is expected to resolve the following failures:

GC.Stress.Framework.ReliabilityFramework.ReliabilityFramework is a fairly new failure that we need to analyze.

janvorli commented 7 years ago

@parjong Awesome, thank you! CC: @gkhanna79, @Petermarcu

swgillespie commented 7 years ago

@parjong The reliability framework is a test that was just re-enabled recently - feel free to ping me sometime with the failure message and I'd be happy to help investigate it if I can. https://github.com/dotnet/coreclr/pull/11029

parjong commented 7 years ago

@swgillespie Thank you for the comment. GC.Stress.Framework.ReliabilityFramework.ReliabilityFramework has been failing since 04/20. Both x86 and armel have the same failure. I'm not sure about armhf, as I do NOT have a result yet.

Petermarcu commented 7 years ago

@parjong Awesome progress!

parjong commented 7 years ago

@gkhanna79 This issue is just for discussion and progress tracking. Could you please change the milestone, or should I close this one?

gkhanna79 commented 7 years ago

I have changed the milestone - please continue to use this for discussion.

ryukinix commented 7 years ago

Awesome. Might this be released with .NET Core 2.0.0?

BruceForstall commented 7 years ago

@parjong @seanshpark and others: I haven't seen any Linux/x86 activity lately. Are people still working on this or using this? Should I still try to find time to review https://github.com/dotnet/coreclr/pull/10034 (Enable FEATURE_FIXED_OUT_ARGS), for example?

parjong commented 7 years ago

Yes, we are using x86/Linux CLR, but we are currently not working on dotnet/coreclr#10034.

We have tested CLR daily, and since May there have been no failed tests except those related to dotnet/coreclr#10340 (for Debug/Checked/Release).

We have also tested FX, and x86/Linux and x64/Linux are almost the same (except for some CompilerService tests related to dotnet/coreclr#10340) on the 2.0.0 branch, but we have recently seen some more failures on x86/Linux on the master branch.

TheLastRar commented 7 years ago

Are there any plans to start doing x86/Linux daily builds?

seanshpark commented 7 years ago

@TheLastRar, could https://github.com/dotnet/coreclr/pull/12897 be the one?

realityexists commented 6 years ago

So is it possible to download an x86 Linux build somewhere? And how stable/safe to use is it?

parjong commented 6 years ago

@realityexists x86/Linux CLR is not released yet, so I think that you need to build it yourself. It is also hard to say how safe it is, but we haven't encountered a blocking issue yet.

philippweidhas commented 6 years ago

@parjong to build x86 Linux, which of the GitHub projects do I have to clone? Is it the normal https://github.com/dotnet/coreclr or another project? I'm not sure where to begin. Thanks for your help.

ryukinix commented 6 years ago

I am interested in this too. A guide on how to build dotnet on Linux would be nice.

parjong commented 6 years ago

@philippweidhas @ryukinix https://github.com/dotnet/coreclr/issues/13192#issuecomment-320188913 may be helpful.

You also need to build Core FX (https://github.com/dotnet/corefx) to run a C# application. https://github.com/dotnet/coreclr/issues/9265#issuecomment-280519593 and https://github.com/dotnet/coreclr/issues/9265#issuecomment-280521257 may be helpful.

Unfortunately, dotnet is not supported, yet. You may use corerun (which this repo provides) instead.
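
As a rough illustration of the corerun route, the call stack earlier in this thread shows corerun being started from a directory that also holds the runtime files, with the managed assembly passed as its argument. A minimal sketch under that assumption follows; `run_with_corerun` is a hypothetical wrapper name, and the flat directory layout is inferred from that call stack rather than documented here.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of CoreCLR): run a managed assembly with the
# corerun host found in a directory that also holds the runtime and FX files.
run_with_corerun() {
  local clr_dir="${1:-}" app="${2:-}"
  # Drop the two leading arguments; what remains is passed to the app.
  shift 2 || { echo "usage: run_with_corerun <clr-dir> <app> [args...]" >&2; return 1; }
  if [[ ! -x "${clr_dir}/corerun" ]]; then
    echo "corerun not found or not executable in ${clr_dir}" >&2
    return 1
  fi
  if [[ ! -f "${app}" ]]; then
    echo "managed assembly not found: ${app}" >&2
    return 1
  fi
  "${clr_dir}/corerun" "${app}" "$@"
}
```

For example, `run_with_corerun ~/dotnet/test ~/dotnet/test/hello.exe`, where both paths are placeholders.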

philippweidhas commented 6 years ago

@parjong @ryukinix I could successfully build coreclr for x86, but I'm struggling to build Core FX with:

sudo apt-get install debootstrap
sudo apt-get install qemu-user-static
sudo ./cross/build-rootfs.sh x86
sudo apt-get install cmake
sudo apt-get install clang-3.8 lldb-3.8
./build.sh cross x86 skipnuget debug cmakeargs "-DSKIP_LLDBPLUGIN=true" clang3.8

Do I have to build the CoreFX project in another way?

seanshpark commented 6 years ago

but I'm struggling to build Core FX with

Could you paste what the problem (error) is? Or it would be better to add a new issue and talk there.

seanshpark commented 6 years ago

This is how I checked with the latest master as of this writing:

# build native code for the host (x64) first
./build-native.sh -debug -- clang3.8

# and then build native code for 32-bit x86
./build-native.sh -debug -buildArch=x86 -- cross clang3.8

# build managed code but not the tests
./build-managed.sh -BuildTests=false

These commands are from some time ago, and I am not sure they are still current.

ryukinix commented 6 years ago

Or it would be better to add a new issue and talk there.

I think that is better, since this issue thread is only about the development progress of x86.

philippweidhas commented 6 years ago

@parjong @seanshpark @ryukinix Hi guys, I opened a new issue for the x86 build of CoreFX in the CoreFX repository; I think that is the better place to discuss this problem.

dmitriyse commented 6 years ago

Most Linux distributions are dropping x86 support. Please consider supporting an x32 ABI runtime (only for Linux hosts). See https://en.wikipedia.org/wiki/X32_ABI. Ubuntu 16.04 already supports it (apt install libc6-x32).

Windows 10 appears to have no plans to drop its 32-bit version, but at the same time it has no plans to support the x32 ABI.

ryukinix commented 6 years ago

Most Linux distributions are dropping x86 support

This is quite true; unfortunately, most distributions (like Arch Linux) are indeed dropping x86 support. Debian will probably continue supporting it, but I'm not so sure about the others.