gramineproject / graphene

Graphene / Graphene-SGX - a library OS for Linux multi-process applications, with Intel SGX support
https://grapheneproject.io
GNU Lesser General Public License v3.0

Graphene: stress-ng chmod fails when the number of stressors is gradually increased #2451

Open anjalirai-intel opened 3 years ago

anjalirai-intel commented 3 years ago

Description of the problem

As the number of stressors per CPU increases, chmod calls fail with graphene-direct.

Run-1:

intel@intel-Ice-Lake-Client-Platform:~/Anjali/stressng_git/example-jobs$ graphene-direct stress-ng --chmod 1 --timeout 60s
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
error: Forwarding host environment variables to the app is enabled. Graphene will continue application execution, but this configuration must not be used in production!
stress-ng: info:  [1] dispatching hogs: 1 chmod
stress-ng: error: [1] glob on regex "/sys/devices/system/cpu/cpu0/cache/index[0-9]*" failed: 1
stress-ng: info:  [1] cache allocate: using built-in defaults as unable to determine cache details
stress-ng: info:  [1] successful run completed in 60.07s (1 min, 0.07 secs)

Run-2:

intel@intel-Ice-Lake-Client-Platform:~/Anjali/stressng_git/example-jobs$ graphene-direct stress-ng --chmod 3 --timeout 60s
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
error: Forwarding host environment variables to the app is enabled. Graphene will continue application execution, but this configuration must not be used in production!
stress-ng: info:  [1] dispatching hogs: 3 chmod
stress-ng: error: [1] glob on regex "/sys/devices/system/cpu/cpu0/cache/index[0-9]*" failed: 1
stress-ng: info:  [1] cache allocate: using built-in defaults as unable to determine cache details
stress-ng: info:  [1] successful run completed in 60.07s (1 min, 0.07 secs)

Run-3:

intel@intel-Ice-Lake-Client-Platform:~/Anjali/stressng_git/example-jobs$ graphene-direct stress-ng --chmod 4 --timeout 60s
error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
error: Forwarding host environment variables to the app is enabled. Graphene will continue application execution, but this configuration must not be used in production!
stress-ng: info:  [1] dispatching hogs: 4 chmod
stress-ng: error: [1] glob on regex "/sys/devices/system/cpu/cpu0/cache/index[0-9]*" failed: 1
stress-ng: info:  [1] cache allocate: using built-in defaults as unable to determine cache details
stress-ng: fail:  [3] stress-ng-chmod: fchmod failed, errno=2 (No such file or directory)
stress-ng: fail:  [5] stress-ng-chmod: fchmod failed, errno=2 (No such file or directory)
stress-ng: fail:  [4] stress-ng-chmod: fchmod failed, errno=2 (No such file or directory)
stress-ng: info:  [1] successful run completed in 60.08s (1 min, 0.08 secs)

Run-4:

intel@intel-Ice-Lake-Client-Platform:~/Anjali/stressng_git/example-jobs$ graphene-direct stress-ng --chmod 5 --timeout 60s

error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
error: Forwarding host environment variables to the app is enabled. Graphene will continue application execution, but this configuration must not be used in production!
stress-ng: info:  [1] dispatching hogs: 5 chmod
stress-ng: error: [1] glob on regex "/sys/devices/system/cpu/cpu0/cache/index[0-9]*" failed: 1
stress-ng: info:  [1] cache allocate: using built-in defaults as unable to determine cache details
stress-ng: fail:  [4] stress-ng-chmod: fchmod failed, errno=2 (No such file or directory)
stress-ng: info:  [1] successful run completed in 60.09s (1 min, 0.09 secs)
intel@intel-Ice-Lake-Client-Platform:~/Anjali/stressng_git/example-jobs$
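
The runs above can also be compared mechanically. The following is a minimal Python sketch, not part of stress-ng or Graphene, that counts the `fail:` lines in captured stress-ng output. The function name `count_failures` and the regular expression are assumptions inferred only from the log lines shown in this report.

```python
import re
from collections import Counter

# Matches lines such as:
#   stress-ng: fail:  [3] stress-ng-chmod: fchmod failed, errno=2 (No such file or directory)
# The pattern is inferred from the output in this report, not from stress-ng documentation.
FAIL_RE = re.compile(
    r"stress-ng: fail:\s+\[(?P<pid>\d+)\]\s+(?P<stressor>[\w-]+): "
    r"(?P<call>\w+) failed, errno=(?P<errno>\d+)"
)


def count_failures(output: str) -> Counter:
    """Count failures per (call, errno) pair in captured stress-ng output."""
    counts = Counter()
    for line in output.splitlines():
        # search() rather than match(), so leading decoration on a line is tolerated.
        match = FAIL_RE.search(line)
        if match:
            counts[(match["call"], int(match["errno"]))] += 1
    return counts
```

Feeding it the output of each run (for example, captured with a shell redirect) gives one counter per stressor count, which makes it easier to see at which count the `fchmod` failures start.
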
dimakuv commented 3 years ago

@anjalirx-intel @jinengandhi-intel Have you retried this test since we agreed to update the stress-ng version? See the discussion in https://github.com/oscarlab/graphene/issues/2419#issuecomment-878004114.

jinengandhi-intel commented 3 years ago

Dmitrii, we had agreed that issue 2419 could be related to the stress-ng version; we weren't asked to rerun the chmod test. To use the latest version of stress-ng we need to try it on Ubuntu 20.04, which we haven't been able to do yet.

dimakuv commented 3 years ago

My assumption is that the new stress-ng version may fix this issue (since it looks related to the root cause of #2419). Anyway, I'm assigning P1 for now and hope this will be trivially resolved by moving to the new stress-ng version.

anjalirai-intel commented 3 years ago

Verified on the configuration below, as requested. The issue still exists.

Error: graphene-direct stress-ng --chmod 3 --timeout 60s

error: Using insecure argv source. Graphene will continue application execution, but this configuration must not be used in production!
error: Forwarding host environment variables to the app is enabled. Graphene will continue application execution, but this configuration must not be used in production!
[P372046:T1:] error: 'libos.entrypoint' is now a Graphene path, not URI. Ignoring the 'file:' prefix.
stress-ng: info:  [1] dispatching hogs: 3 chmod
stress-ng: fail:  [3] stress-ng-chmod: fchmod failed, errno=2 (No such file or directory)
stress-ng: info:  [1] successful run completed in 60.05s (1 min, 0.05 secs)

Machine Config:
Kernel: 5.12.0-051200-generic
OS: 20.04.1 LTS (Focal Fossa)
Stress-ng: version 0.11.07 (gcc 9.3, x86_64 Linux 5.12.0-051200-generic)

mkow commented 3 years ago

stress-ng: fail: [3] stress-ng-chmod: fchmod failed, errno=2 (No such file or directory)

This looks like just a missing entry in the manifest. Could you run this at the "trace" debug level and find out which file it fails on?
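
One way to narrow a trace log down to the failing call is to filter it for chmod-family system calls that return -2 (ENOENT, matching `errno=2` in the stress-ng output). The Python sketch below assumes the log is plain text with one system call per line and that a failure is logged as `= -<errno>` after the call. The exact Graphene trace format is not shown in this thread, so the pattern, the function name `find_failed_chmod`, and the sample line in the comment are hypothetical and may need adjusting.

```python
import re
from typing import Iterable, Iterator


# Hypothetical line shape (the real trace format may differ):
#   [P3:T1:stress-ng] trace: ---- shim_fchmod(4, 0644) = -2
# Assumption: the syscall name contains "chmod" and a failure is logged as "= -<errno>".
def find_failed_chmod(lines: Iterable[str], errno_value: int = 2) -> Iterator[str]:
    """Yield log lines that mention a chmod-family call returning -errno_value."""
    pattern = re.compile(r"chmod\w*\(.*\)\s*=\s*-" + str(errno_value) + r"\b")
    for line in lines:
        if pattern.search(line):
            yield line.rstrip("\n")
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, so a large log is scanned line by line without being loaded into memory.
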

anjalirai-intel commented 3 years ago

@mkow This is not about a missing file in the manifest. As the observations in the description show, the test initially passed with 1 stressor and with 3 stressors, and only then started throwing the "No such file or directory" error.
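
Since the failure only shows up with more stressors, contention on a shared file is one thing worth isolating. According to the stress-ng manual, the chmod stressor's workers change mode bits with chmod(2) and fchmod(2) on the same file. The following Python sketch imitates only that pattern: several forked workers repeatedly call `fchmod` and `chmod` on one file. It is not the stress-ng implementation, the names `worker` and `run_workers` are made up for illustration, and whether it triggers the same error under Graphene has not been checked here.

```python
import os
import tempfile

MODES = (0o600, 0o644, 0o660, 0o664)  # arbitrary sample of mode bits


def worker(path: str, iterations: int) -> int:
    """Repeatedly change the mode of one shared file; return the failure count."""
    failures = 0
    fd = os.open(path, os.O_RDWR)
    try:
        for i in range(iterations):
            mode = MODES[i % len(MODES)]
            try:
                os.fchmod(fd, mode)   # by file descriptor
                os.chmod(path, mode)  # by path
            except OSError as err:
                failures += 1
                print(f"[{os.getpid()}] chmod/fchmod failed, errno={err.errno}", flush=True)
    finally:
        os.close(fd)
    return failures


def run_workers(count: int, iterations: int = 1000) -> int:
    """Fork `count` workers on one temporary file; return how many saw a failure."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    pids = []
    try:
        for _ in range(count):
            pid = os.fork()
            if pid == 0:
                # Child: exit status 1 signals at least one failed call.
                try:
                    code = 1 if worker(path, iterations) else 0
                except BaseException:
                    code = 1
                os._exit(code)
            pids.append(pid)
        failed = 0
        for pid in pids:
            _, status = os.waitpid(pid, 0)
            if os.WEXITSTATUS(status) != 0:
                failed += 1
        return failed
    finally:
        os.unlink(path)
```

On native Linux, `run_workers(4)` is expected to return 0, because the file's owner can change its mode concurrently from several processes without error. A nonzero result from the same script under graphene-direct would suggest the problem lies in how concurrent mode changes are handled rather than in the manifest.
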

mkow commented 3 years ago

Hmm, but still, this info would be very helpful to see why it could fail.

anjalirai-intel commented 3 years ago

Closed the issue by mistake

anjalirai-intel commented 3 years ago

Partial logs: chmod_4_stressor_trace.zip

The complete logs could not be uploaded because the file is too large, even after compression.