jinengandhi-intel closed this issue 3 years ago
Looks like an issue for @boryspoplawski
I went back and checked the nightly results in our local CI. No such failures were seen in our June 29 nightly, i.e. up to commit "[LibOS] Rework IDs management" (205cbe0123978b7b178c413a172b0be38658ef55), but they started with the June 30 nightly run, so most probably the cause is one of the 2 commits from Borys:
[LibOS] Send PID alongside TID in tgkill IPC message … [LibOS] Do not remove IPC connection on errors …
In our internal CI, I also saw connect01 fail once:
`
<system-err>error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
[P15230:T2:connect01] error: Sending IPC process-exit notification failed: -13
[P15230:T2:connect01] error: IPC pid release failed
[P15230:T2:connect01] error: Illegal instruction during Graphene internal execution at 0x7f87fb4c755a (IP = +0xc55a, VMID = 15230, TID = 2)
`
@jinengandhi-intel please check #2508; it should fix the issue.
`Illegal instruction during Graphene internal execution` was just due to `die_or_inf_loop` which does `ud2`. The general problem is that IPC leader does not wait for all subprocesses to finish, but #2508 should fix the problem temporarily.
@jinengandhi-intel please format your comments better, the logs in the one above are impossible to read :/
Since the log level for the message has been changed from error to warning, I don't see it with the default settings, but I do see it when I change the log_level to trace; there are no side effects, though. As this is just a workaround, I would suggest we keep the issue open: it is a legitimate error that needs to be resolved, and we don't want this temporary fix to mask it.
The actual issue is tracked in #2514
Description of the problem
For random tests, we are seeing the following errors after the tests have finished, even though the return code for the test is 0.
<system-err>error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
[P19048:T2:fcntl12] error: IPC pid release failed
[P19048:T2:fcntl12] error: Illegal instruction during Graphene internal execution at 0x7fd5702dc55a (IP = +0xc55a, VMID = 19048, TID = 2)
</system-err>
This is seen in the open-source CI as well as the Intel internal local CI, but it was missed because the errors are reported in the system-err block, which is not parsed.
https://localhost:8080/job/graphene-18.04/5903/artifact/LibOS/shim/test/ltp/ltp.xml
https://localhost:8080/job/graphene-18.04/5911/artifact/LibOS/shim/test/ltp/ltp.xml
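Since these errors only show up in the system-err block, a small check over ltp.xml could surface them. Below is a minimal sketch, not the actual CI code: it assumes ltp.xml follows the usual JUnit layout (`<testcase>` elements with optional `<failure>`, `<error>`, `<skipped>` and `<system-err>` children), and the function name `find_hidden_errors`, the marker string and the sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: this substring is a good enough marker for the crash reported above.
INTERNAL_ERROR_MARKER = "Graphene internal execution"

# Hypothetical sample in JUnit style; the real ltp.xml layout may differ.
SAMPLE = """<testsuite>
  <testcase classname="LTP" name="fcntl12">
    <system-err>[P19048:T2:fcntl12] error: IPC pid release failed
[P19048:T2:fcntl12] error: Illegal instruction during Graphene internal execution at 0x7fd5702dc55a (IP = +0xc55a, VMID = 19048, TID = 2)</system-err>
  </testcase>
  <testcase classname="LTP" name="fcntl13"/>
  <testcase classname="LTP" name="connect01">
    <failure>test failed</failure>
    <system-err>error: Illegal instruction during Graphene internal execution</system-err>
  </testcase>
</testsuite>"""


def find_hidden_errors(xml_text, marker=INTERNAL_ERROR_MARKER):
    """Return names of test cases that passed but whose system-err contains `marker`."""
    root = ET.fromstring(xml_text)
    hidden = []
    for case in root.iter("testcase"):
        # Treat a test case as passed if it has no failure/error/skipped child.
        if any(case.find(tag) is not None for tag in ("failure", "error", "skipped")):
            continue
        stderr = case.findtext("system-err") or ""
        if marker in stderr:
            hidden.append(case.get("name"))
    return hidden


if __name__ == "__main__":
    for name in find_hidden_errors(SAMPLE):
        print(f"{name}: passed, but system-err contains an internal error")
```

Matching on a fixed substring keeps the sketch simple; a real check would probably want to match any `error:` line that Graphene itself emits, while ignoring the known "insecure argv source" warning.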
Steps to reproduce
Run the fcntl12, fcntl12_64, waitpid03, sendto01 or waitpid04 LTP test in Graphene Native mode.