gramineproject / graphene

Graphene / Graphene-SGX - a library OS for Linux multi-process applications, with Intel SGX support
https://grapheneproject.io
GNU Lesser General Public License v3.0

Unable to open max numbers of open file descriptors permitted by pipe syscall #2296

Closed anjalirai-intel closed 3 years ago

anjalirai-intel commented 3 years ago

Description of the problem

Description: This test checks that pipe can open the maximum number of file descriptors permitted. It records the file descriptors open before the test run, opens pipes until EMFILE is returned, and checks that the number of pipes opened is (maxfds - 3) / 2.

TestFile: pipe07

Expected results

pipe07      0  TINFO  :  Found 3 files open
pipe07      1  TPASS  :  Opened 510 pipes

Actual results

pipe07      0  TINFO  :  Found 13 files open
pipe07      1  TFAIL  :  /root/Graphene_Master/LibOS/shim/test/ltp/src/testcases/kernel/syscalls/pipe/pipe07.c:89: Unable to open maxfds/2 pipes

dimakuv commented 3 years ago

the maximum number of file descriptors permitted

I didn't understand, how does the test know this maximum number?

anjalirai-intel commented 3 years ago

Once the maximum number of file descriptors has been opened, pipe returns EMFILE, which tells us that the per-process limit on the number of open file descriptors has been reached.
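The EMFILE behavior described above can be sketched in C. This is a minimal illustration, not the code from pipe07.c: the helper name `count_pipes_until_emfile` and the idea of temporarily lowering the soft `RLIMIT_NOFILE` (so the limit is reached quickly and predictably) are assumptions made for the example.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper, not taken from pipe07.c.
 * Temporarily lowers the soft RLIMIT_NOFILE to `soft_limit`, opens pipes
 * until pipe() fails, then closes them and restores the previous limit.
 * Returns the number of pipes opened if the failure was EMFILE, else -1. */
int count_pipes_until_emfile(int soft_limit)
{
    struct rlimit old_lim, new_lim;

    if (soft_limit <= 0 || getrlimit(RLIMIT_NOFILE, &old_lim) < 0)
        return -1;
    if ((rlim_t)soft_limit > old_lim.rlim_max)
        return -1;                      /* cannot exceed the hard limit */

    int *fds = malloc(sizeof(int) * (size_t)soft_limit);
    if (!fds)
        return -1;

    new_lim = old_lim;
    new_lim.rlim_cur = (rlim_t)soft_limit;
    if (setrlimit(RLIMIT_NOFILE, &new_lim) < 0) {
        free(fds);
        return -1;
    }

    int nfds = 0;
    int err = 0;
    for (;;) {
        int p[2];
        if (pipe(p) < 0) {              /* each pipe needs two free descriptors */
            err = errno;
            break;
        }
        if (nfds + 2 > soft_limit) {    /* defensive: should never happen */
            close(p[0]);
            close(p[1]);
            break;
        }
        fds[nfds++] = p[0];
        fds[nfds++] = p[1];
    }

    for (int i = 0; i < nfds; i++)      /* release everything we opened */
        close(fds[i]);
    free(fds);
    setrlimit(RLIMIT_NOFILE, &old_lim); /* restore the original soft limit */

    return err == EMFILE ? nfds / 2 : -1;
}
```

With a soft limit of 64 and only stdin, stdout, and stderr open, one would expect the helper to report (64 - 3) / 2 = 30 pipes, but the exact value depends on which descriptors the calling process already holds.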

vijaydhanraj commented 3 years ago

@anjalirx-intel, is this still reproducible on the latest master? I am not able to repro this issue.

./pipe07                                  
pipe07      0  TINFO  :  Found 3 files open
pipe07      1  TPASS  :  Opened 510 pipes
graphene-direct ./pipe07
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
pipe07      0  TINFO  :  Found 3 files open
pipe07      1  TPASS  :  Opened 448 pipes
graphene-sgx ./pipe07 
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
pipe07      0  TINFO  :  Found 3 files open
pipe07      1  TPASS  :  Opened 448 pipes
anjalirai-intel commented 3 years ago

Hi @vijaydhanraj, I can still see this issue. Can you please tell me which config you are using, and whether you have applied any additional settings?

Here are my config, result, and commit details:

Git commit: 4cb98219d8e302055587a8952c1102415a72ba42
[Pal] Remove unused parent pid from various structures

graphene-direct pipe07

error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
pipe07      0  TINFO  :  Found 13 files open
pipe07      1  TFAIL  :  /home/intel/Anjali/graphene/LibOS/shim/test/ltp/ltp_src/testcases/kernel/syscalls/pipe/pipe07.c:89: Unable to open maxfds/2 pipes

cat /etc/os-release

NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.1 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
vijaydhanraj commented 3 years ago

I am also on the same Graphene commit 4cb98219d8e302055587a8952c1102415a72ba42 and on Ubuntu 18.04. One difference may be that I am on the 4.15 kernel and using the Intel OOT driver.

anjalirai-intel commented 3 years ago

I have tested with the same config that you mentioned, and I am still facing this issue.

mkow commented 3 years ago

@anjalirx-intel maybe it's just your host limiting your Graphene process, not Graphene limiting the app?

vijaydhanraj commented 3 years ago

I also tried the 5.11 kernel and still cannot repro the issue.

uname -a                 
Linux sdp 5.11.0-051100rc5-generic #202101242134 SMP Mon Jan 25 02:37:18 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
graphene-direct ./pipe07                                                                     
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
pipe07      0  TINFO  :  Found 3 files open
pipe07      1  TPASS  :  Opened 448 pipes
graphene-sgx ./pipe07                                                                         
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
pipe07      0  TINFO  :  Found 3 files open
pipe07      1  TPASS  :  Opened 448 pipes
vijaydhanraj commented 3 years ago

I debugged this issue further with @anjalirx-intel, and it looks like the test case works by default for both graphene-sgx and graphene-direct. However, we found that when /proc is mounted as part of the manifest (she had this in her manifest), the test fails.

Adding the lines below to the manifest causes the failure.

diff --git a/LibOS/shim/test/ltp/manifest.template b/LibOS/shim/test/ltp/manifest.template
index 4c92d1c8..138c31b8 100644
--- a/LibOS/shim/test/ltp/manifest.template
+++ b/LibOS/shim/test/ltp/manifest.template
@@ -26,6 +26,10 @@ fs.mount.tmp.type = "chroot"
 fs.mount.tmp.path = "/tmp"
 fs.mount.tmp.uri = "file:/tmp"

+fs.mount.proc.type = "chroot"
+fs.mount.proc.path = "/proc"
+fs.mount.proc.uri = "file:/proc"
+
 sys.brk.max_size = "32M"
 sys.stack.size = "4M"

Although this might not be directly related to the test case, I will continue to look at this to understand the root cause.

anjalirai-intel commented 3 years ago

Adding Dmitrii also @dimakuv @vijaydhanraj

The reason for mounting proc separately is to enable other tests that need access to files under the proc directory, which is emulated in Graphene. For example, the kill03 test below initially fails because it cannot open the file /proc/sys/kernel/pid_max, but once proc is mounted, the test case passes.

$ graphene-direct kill03

error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
/home/intel/validation/oscarlab/LibOS/shim/test/ltp/src/lib/tst_test.c:1250: TINFO: Timeout per run is 0h 05m 00s
/home/intel/validation/oscarlab/LibOS/shim/test/ltp/src/lib/safe_file_ops.c:144: TBROK: Failed to open FILE '/proc/sys/kernel/pid_max' for reading at /home/intel/validation/oscarlab/LibOS/shim/test/ltp/src/lib/tst_pid.c:34: ENOENT (2)

Summary:
passed   0
failed   0
skipped  0
warnings 0

After manifest update:

$ graphene-direct kill03

error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
TINFO: Timeout per run is 0h 05m 00s
TPASS: kill failed as expected: EINVAL (22)
TPASS: kill failed as expected: ESRCH (3)
TPASS: kill failed as expected: ESRCH (3)

Summary:
passed   3
failed   0
skipped  0
warnings 0
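For context, the ENOENT in the first kill03 run comes from a plain file read failing. The following is a rough sketch of that kind of read, not the LTP implementation in tst_pid.c or safe_file_ops.c: the helper name `read_long_from_file` and its error convention (negative errno) are assumptions made for illustration.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: reads a single integer from `path`.
 * Returns the value on success, or a negative errno (for example -ENOENT
 * when the file does not exist) on failure. Assumes the file holds a
 * non-negative number, as /proc/sys/kernel/pid_max does on Linux. */
long read_long_from_file(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -errno;

    long value = 0;
    int matched = fscanf(f, "%ld", &value);
    fclose(f);

    return matched == 1 ? value : -EINVAL;
}
```

Calling `read_long_from_file("/proc/sys/kernel/pid_max")` returns -ENOENT whenever that path is not visible to the application, which matches the TBROK message in the first log.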

We experimented further and tried remounting /dev and /sys to check whether remounting itself causes the issue. With /sys, the test case worked fine. However, when /dev is mounted again, we see a crash and no result is reported. We verified this with the abort01 test as well.

Earlier, there was a check in src/fs/shim_fs.c that detected whether a path was being mounted again: https://github.com/oscarlab/graphene/commit/0a76c12c294374366da86b72669a32996cfe2acc

    if (dent != dentry_root && dent->state & DENTRY_VALID) {
        debug("Mount %s already exists, verify that there are no duplicate mounts in manifest\n"
              "(note that /proc and /dev are automatically mounted in Graphene).\n", mount_point);
        ret = -EEXIST;
        goto out_with_unlock;
    }

but now this check has been removed in the later commits. https://github.com/oscarlab/graphene/commit/69e5edafa58e3e98ce88991956fe8c8404f0e7f4

Was this change intentional, or do we need to restore the check for a path being mounted again?

mkow commented 3 years ago

@anjalirx-intel What you're doing is completely unsupported and a very bad idea overall, it's good that it stopped working :) I guess this test fails for you because it gets wrong numbers from procfs (because it's from the host, not from Graphene). I don't think that testing Graphene this way is meaningful.

vijaydhanraj commented 3 years ago

As @mkow pointed out, mounting host procfs results in the test getting an incorrect number of fds being open and ends up failing.
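To illustrate where the "files open" number comes from, here is a minimal sketch of counting the entries of an fd directory such as /proc/self/fd. It is not copied from pipe07.c: the helper name `count_open_fds` and the choice to pass the directory as a parameter are assumptions made for the example.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the descriptors listed in `fd_dir`
 * (typically "/proc/self/fd"), skipping ".", "..", and the descriptor
 * that opendir() itself holds while the directory is being read.
 * Returns the count, or -1 if the directory cannot be opened. */
int count_open_fds(const char *fd_dir)
{
    DIR *dir = opendir(fd_dir);
    if (!dir)
        return -1;

    int self_fd = dirfd(dir);   /* fd used to read the directory */
    int count = 0;
    struct dirent *ent;

    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        if (atoi(ent->d_name) == self_fd)
            continue;           /* do not count our own directory handle */
        count++;
    }

    closedir(dir);
    return count;
}
```

The result is only as trustworthy as the procfs behind the path. If /proc is the host's procfs, the listing describes the host-side process rather than the descriptors Graphene tracks for the application.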

Some details: The test works in the following way. It gets the maximum number of fds that a process can open using the getdtablesize() API and subtracts the fds the process already has open, which it counts by reading the /proc/self/fd directory. It then keeps opening pipes until EMFILE is returned and checks that the number of pipes opened is (max fds - fds already opened)/2. When /proc is not mounted, /proc/self/fd reports 3 open fds, but when /proc is mounted, it reports 13. Please see the logs below.

Working case:

graphene-direct ./pipe07                                                                     
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
pipe07      0  TINFO  :  **Found 3 files open**
pipe07      1  TPASS  :  Opened 448 pipes

Failure case:

graphene-direct pipe07
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
pipe07      0  TINFO  :  **Found 13 files open**
pipe07      1  TFAIL  :  /home/intel/Anjali/graphene/LibOS/shim/test/ltp/ltp_src/testcases/kernel/syscalls/pipe/pipe07.c:89: Unable to open maxfds/2 pipes

But from Graphene's perspective, the process hasn't opened 13 fds, so the test ends up creating more pipes than it expects, i.e. more than (max fds - fds already opened)/2, which results in the failure we see above.
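The arithmetic behind the mismatch can be written out as a small C sketch. The helper name `expected_pipes` is hypothetical, and the limits used in the comments (1024 natively, 900 under Graphene) are assumptions inferred from the 510 and 448 pipes in the logs, not values confirmed in this thread.

```c
/* Hypothetical helper: number of pipes the test expects to open.
 * Every pipe consumes two descriptors, hence the division by two. */
int expected_pipes(int max_fds, int already_open)
{
    return (max_fds - already_open) / 2;
}

/* Assumed limits, inferred from the logs above:
 *   expected_pipes(1024, 3)  == 510   native run, 3 fds open
 *   expected_pipes(900, 3)   == 448   Graphene, 3 fds reported
 *   expected_pipes(900, 13)  == 443   Graphene, 13 fds reported by host procfs
 */
```

Under these assumptions, the test expects 443 pipes when the host procfs reports 13 open fds, while Graphene still lets the process open 448, so the equality check fails.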

So, we can close this issue as the behavior is expected.

dimakuv commented 3 years ago

Cool analysis, thanks Vijay!