dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Port corehost to QNX7 #33374

Open guesshe opened 4 years ago

guesshe commented 4 years ago

Hi,

I am trying to port the entire runtime to the QNX 7 platform on x64. I am able to build coreclr, but it won't run unless I have the dotnet executable built. Any suggestions on how to build corehost for QNX?

guesshe commented 4 years ago

I am not sure I understand correctly. I did build the src/installer/corehost/ project, which contains the dotnet executable binary, and I can run it to load hello_world.dll (which failed at the same point as it does with corerun). Do you mind if we have a quick chat offline on this topic? Via Zoom or something like that?

Regards

River He

On Fri., Apr. 24, 2020, 17:26 Tomas Weinfurt, notifications@github.com wrote:

Yes, environment. I'm not quite sure what you mean by the previous post. In order to build assemblies you need a working dotnet CLI, and the C# compiler is written (mostly) in C#. corerun cannot function without System.Private.CoreLib.dll (and perhaps others), so the question is: how did you get one?

wfurt commented 4 years ago

sure, ping me with details: tweinfurt at yahoo. I don't think your test is valid. You can try the steps on Linux (or other supported platform)

guesshe commented 4 years ago

@wfurt @janvorli We are trying to debug this 0x80004005 error, and the trace output follows. It looks like it failed to load System.Private.CoreLib.dll. The trace is trimmed and formatted to be easier to read. Is System.Private.CoreLib.dll mandatory in order to run an empty main function? My hello_world app has only one line: "static void Main(string[] args) {}".

Starting
corhost.cpp - CorRuntimeHostBase::Start
ceemain.cpp - EnsureEEStarted - g_fEEShutDown==0
ceemain.cpp - EEStartup - InitializeClrNotifications - status==0000000000
ceemain.cpp - EEStartup - InitializeJITNotificationTable - status==0000000000
ceemain.cpp - EEStartup - Initialize - status==0000000000
ceemain.cpp - EEStartupHelper - start
ceemain.cpp - EEConfig::Setup - start
ceemain.cpp - EEConfig::Setup - done
ceemain.cpp - InitializeStratupFlags - done
ceemain.cpp - PAL_SetShutdownCallback - done
ceemain.cpp - InitializeLogging - done
ceemain.cpp - EnsureRtlFunctions - done
ceemain.cpp - g_pConfig->sync - done
ceemain.cpp - InitializeSpinConstants - done
ceemain.cpp - InitializeStubManagers - done
ceemain.cpp - Stubs - done
ceemain.cpp - Inits - done
rcthread.cpp - DebuggerRCTthread started m_thread!=NULL, hr==0000000000
rcthread.cpp - Thread created: hr==0000000000
rcthread.cpp - Done: hr==0000000000
ceemain.cpp - InitializeDebugger - done
ceemain.cpp - Profiling service - hr==0000000000 - done
ceemain.cpp - InitPreStubManager - done
ceemain.cpp - g_pGCHeap->Initialize - hr==0000000000 - done
ceemain.cpp - SystemDomain debugging - done
ceemain.cpp - MethodDesc::Init - start
ceemain.cpp - MethodDesc::Init - done
ceemain.cpp - SD Init - start
appdomain.cpp - Init - start
appdomain.cpp - LOG - done
appdomain.cpp - ZapDisable - done
appdomain.cpp - GetInternalSystemDirectory - hr==0x8007007a - done
appdomain.cpp - GetInternalSystemDirectory(buffer) - hr==0x8007007a - done
appdomain.cpp - LoadBaseSystemClasses - start
appdomain.cpp - LoadBaseSystemClasses - start
appdomain.cpp - ETWOnStartup - done
appdomain.cpp - OpenSystem - start
pefile.cpp - OpenSystem - start
pefile.cpp - DoOpenSystem - start
pefile.cpp - ETWOnStartup - start
pefile.cpp - ETWOnStartup - done
pefile.cpp - BindToSystem - start
appdomain.hpp - SystemDirectory is /
coreclrbindercommon.cpp - AssemblyBinder::BindToSystem - start
assemblybinder.cpp - GetAssembly - sCoreLib==/home/qnxuser/ - start
assemblybinder.cpp - AssemblyBinder::GetAssembly - start Assembly path is /
coreassemblyspec.cpp - BinderAcquirePEimage - start
coreassemblyspec.cpp - OpenImage - start
coreassemblyspec.cpp - TryOpenFile - start
peimage.cpp - TryOpenFile - m_path==/home/qnxuser/System.Private.CoreLib.dll
coreassemblyspec.cpp - TryOpenFile - done - hr==0x80070002
AssemblyBinder::BindToSystem - done - hr==0x80070002
ceemain.cpp - CATCH - done
ceemain.cpp - if !FAILED - hr==0000000000 - done
ceemain.cpp - EEStartup - EEStartupHelper - status==0x80004005
ceemain.cpp:327 - g_EEStartupStatus==0x80004005
corhost.cpp - Done - hr==0x80004005
Start: 0x80004005
coreclr_initialize failed - status: 0x80004005

janvorli commented 4 years ago

The error 0x80070002 means "File not found". Is it possible that there is some access problem to the /home/qnxuser/System.Private.CoreLib.dll?

janvorli commented 4 years ago

Btw, error codes starting with 0x8007 represent Windows error codes. The lowest 16 bits of the code contain the Windows error code. These Windows error codes are described here: https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-
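
The decoding described above can be sketched in a few lines of Python. This is only an illustration of the HRESULT bit layout (severity in the top bit, facility in bits 16 and up, code in the low 16 bits); the function name `decode_hresult` is made up for this example and is not part of the runtime.

```python
FACILITY_WIN32 = 7  # facility value behind the 0x8007 prefix


def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    hr &= 0xFFFFFFFF
    facility = (hr >> 16) & 0x1FFF
    return {
        "failed": bool(hr >> 31),          # top bit set means failure
        "facility": facility,
        "code": hr & 0xFFFF,               # lowest 16 bits
        "is_win32": facility == FACILITY_WIN32,
    }


# Values taken from the trace earlier in this thread.
for value in (0x80070002, 0x8007007A, 0x80004005):
    print(hex(value), decode_hresult(value))
```

For 0x80070002 the low 16 bits are 2 (ERROR_FILE_NOT_FOUND), and for 0x8007007a they are 122 (ERROR_INSUFFICIENT_BUFFER). 0x80004005 is not a wrapped Windows error code: its facility is 0, and it is the generic E_FAIL.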

guesshe commented 4 years ago

@janvorli I don't have this managed library built. I only have libcoreclr.so. Based on previous posts in this thread, I had the impression that I don't need managed libraries to test basic PAL functionality. The following is quoted from previous posts.

"the first thing that was done was to pass all platform abstraction layer (PAL) tests, which exercise the CRT functions used by the runtime: https://github.com/dotnet/runtime/blob/59be94b69845ecfbd5a694483c2a4853e99cc64b/docs/workflow/testing/coreclr/unix-test-instructions.md#pal-tests

and then run a simple hello world app using corerun (a basic host that complies with the runtime): https://github.com/dotnet/runtime/blob/7d67d17a9f49ad5f365467fcd3bf0d25f2b9349a/docs/workflow/building/coreclr/linux-instructions.md

iff we get this far, then run the coreclr tests, see src/coreclr/build-test.sh"

I tried a Linux version of corerun and libcoreclr.so, it doesn't give me an error looking for System.Private.CoreLib.dll. Did I misunderstand something in the instructions above?

janvorli commented 4 years ago

The part that tests the PAL is the PAL test suite that you've run before. That's the only part of the testing that doesn't run managed code. corerun is a tool to run managed applications, so it requires System.Private.CoreLib.dll and other managed assemblies (depending on what your hello world managed app needs). I assume the Linux version didn't fail because System.Private.CoreLib.dll is present.

guesshe commented 4 years ago

@janvorli I don't recall putting System.Private.CoreLib.dll in the same directory as libcoreclr.so; maybe it also searches other locations? May I use a Linux version of System.Private.CoreLib.dll to see if it works? If not, how can I build a QNX version of System.Private.CoreLib.dll?

janvorli commented 4 years ago

Yes, you can use the Linux version, it should just work (provided it is built from exactly the same state of the source tree as the libcoreclr.so that you've built for QNX and it is the same build flavor - you cannot combine Release build of libcoreclr.so with Debug or Checked build of System.Private.CoreLib.dll and vice versa).

guesshe commented 4 years ago

@janvorli Thanks! I will give it a try. By "the same state", do you mean it should come from the same commit? Or a similar one? What errors could it give if they are from different commits? I would prefer to actually build it for QNX, but it doesn't seem to support cross-compiling. I might have to upload the source code to QNX directly and run the build from there.

janvorli commented 4 years ago

I mean the same commit. There are shared data structures between libcoreclr.so and System.Private.CoreLib.dll, so any change in the layout of those structures would break things. Using commits close to each other might work, but it is not worth the time you could lose investigating the problems it may cause.

wfurt commented 4 years ago

also debug/release needs to match, right? (at least it did in the past: a release System.Private.CoreLib.dll did not work with a debug coreclr)

guesshe commented 4 years ago

@janvorli @wfurt Thanks! I will try it out and let you know the result.

janvorli commented 4 years ago

also debug/release needs to match, right?

Yes, I've mentioned that in a comment above.

guesshe commented 4 years ago

@janvorli It seems we still have an issue with the Linux version of System.Private.CoreLib.dll. Any idea what this error means? The new error is that the PE image file is not in native machine format.
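
One way to look into an error like this is to read the target machine recorded in the DLL's PE header. The sketch below is only an illustration and rests on an assumption: that the complaint is about the Machine field of the COFF header. The thread does not show the runtime's actual check. The function name `pe_machine` is made up for this example; the two constants are the standard PE values for x86 and x64.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C
IMAGE_FILE_MACHINE_AMD64 = 0x8664


def pe_machine(data: bytes) -> int:
    """Return the Machine field from the COFF header of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # The offset of the PE signature is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Machine is the first 16-bit field right after the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return machine
```

Usage, with the path from the trace earlier in this thread: `hex(pe_machine(open("/home/qnxuser/System.Private.CoreLib.dll", "rb").read()))`. A value other than the plain architecture constant would be worth mentioning when reporting the problem.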

janvorli commented 4 years ago

Can you please set the following env variables and try again? This should let the runtime load only the IL code from the System.Private.CoreLib.dll and not the already precompiled binary code that is likely causing the trouble.

COMPlus_ZapDisable=1
COMPlus_ReadyToRun=0
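
A minimal sketch of launching a host process with these two variables set, written in Python for illustration. Only the variable names and values come from the comment above; the helper names `il_only_env` and `run_il_only` and the paths in the usage note are hypothetical.

```python
import os
import subprocess

# The two switches quoted in the comment above.
IL_ONLY_SETTINGS = {
    "COMPlus_ZapDisable": "1",
    "COMPlus_ReadyToRun": "0",
}


def il_only_env(base=None):
    """Return a copy of the environment with the two switches added."""
    env = dict(os.environ if base is None else base)
    env.update(IL_ONLY_SETTINGS)
    return env


def run_il_only(command):
    """Run `command` (a list such as [host_path, app_dll]) with the switches set."""
    return subprocess.run(command, env=il_only_env(), check=False)
```

For example, `run_il_only(["./corerun", "hello_world.dll"])` would start corerun with both variables in its environment, assuming both files sit in the current directory.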

janvorli commented 4 years ago

@guesshe, it was discovered that the COMPlus_ZapDisable handling had been accidentally disabled for some time; it was fixed four days ago in #35741. I'm not sure what state of the repository you are using, but you'll likely need that fix to be able to load the System.Private.CoreLib.dll built on Linux. You can easily port that change to any state of the repository, as it just removes an #ifdef around getting the option related to that env variable.

guesshe commented 4 years ago

Thanks! I will do that and try it out.

am11 commented 4 years ago

/x86_64/usr/bin/x86_64-pc-nto-qnx7.0.0-ld: ../../../pal/src/libcoreclrpal.a(context2.S.o): relocation R_X86_64_PC32 against symbol `CONTEXT_CaptureContext' can not be used when making a shared object; recompile with -fPIC

I was also getting this error when compiling coreclr's superpmi project with illumos sysroot on Ubuntu 18.04. I was using gcc v8.4.0 and binutils v2.25.1, both built for illumos target. The fix was to upgrade binutils to v2.33.1, without code modifications in coreclr. It was due to an upstream bug in binutils's assembler (as) or archiver (ar), which was fixed around v2.29-v2.30.

karthikshanmugam commented 4 years ago

@guesshe Can you please tell me whether you got the corehost to work?