rstudio / rstudio-launcher-plugin-sdk

The RStudio Launcher Plugin SDK is a software development kit used to create plugins that integrate orchestration tools with the RStudio Job Launcher.

Plugin cannot properly drop privilege twice, causing fatal error #77

Open kfeinauer opened 7 months ago

kfeinauer commented 7 months ago

Currently, the SageMaker plugin cannot start properly without a code patch to the SDK. This is the error they get:

2022-06-28T16:29:02.737432Z [rstudio-sagemaker-launcher] INFO Received signal: 2
2022-06-28T16:29:02.737790Z [rstudio-sagemaker-launcher] INFO Stopping plugin...
2022-06-28T16:29:56.703204Z [rstudio-sagemaker-launcher] ERROR SystemError error 1 (Operation not permitted) [subcategory: system]; OCCURRED AT temporarilyDropPrivileges /local/home/jedambr/RStudioUpdate2022/src/Rstudio-launcher-plugin-sdk/third-party-src/sdk/src/system/PosixSystem.cpp:262
2022-06-28T16:29:56.703568Z [rstudio-sagemaker-launcher] ERROR Could not lower privilege to server user: sagemaker-user.

Possible causes:

  1. Early during startup, we check the scratch path and drop privileges from root to the server user. This is also how things worked before the new SDK release: https://github.com/rstudio/rstudio-launcher-plugin-sdk/blob/b0b55f44dbd09254ab9dc4afc781101d96bb3087/sdk/src/AbstractMain.cpp#L92-L94
  2. In the new version, we additionally need to set up the file log destination with root privileges to ensure we can initially create the file under /var/log, so we restore root: https://github.com/rstudio/rstudio-launcher-plugin-sdk/blob/b0b55f44dbd09254ab9dc4afc781101d96bb3087/sdk/src/AbstractMain.cpp#L221-L226
  3. Once the log file is set up, the new version lowers privileges from root again: https://github.com/rstudio/rstudio-launcher-plugin-sdk/blob/b0b55f44dbd09254ab9dc4afc781101d96bb3087/sdk/src/AbstractMain.cpp#L244-L250

From the SageMaker logs, it looks like the call to restoreRoot may be preventing them from calling temporarilyDropPrivileges again. In the previous version, there was one restoreRoot() -> temporarilyDropPrivileges() sequence. In the new version, there are two.
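To make the drop -> restore -> drop sequence from the three steps above concrete, here is a minimal sketch that runs it with plain POSIX calls. It is an illustration under stated assumptions, not the SDK's code: `dropEffective`, `restoreEffectiveRoot`, and `runStartupSequence` are hypothetical names, and the use of `seteuid`/`setegid` is an assumption about how a temporary drop can be done, not a description of what `temporarilyDropPrivileges` or `restoreRoot` do internally.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helpers for illustration only; NOT the SDK's implementation.

// Lowers the effective GID/UID to the target user. The real and saved IDs
// stay root, which is what makes a later restore possible.
// Returns 0 on success or an errno value on failure.
int dropEffective(uid_t uid, gid_t gid)
{
   // Change the group first: an unprivileged effective UID cannot switch
   // to an arbitrary group.
   if (::setegid(gid) != 0)
      return errno;
   if (::seteuid(uid) != 0)
      return errno;
   return 0;
}

// Raises the effective UID/GID back to root.
// Returns 0 on success or an errno value on failure.
int restoreEffectiveRoot()
{
   // Regain the root UID first so that the group change is permitted.
   if (::seteuid(0) != 0)
      return errno;
   if (::setegid(0) != 0)
      return errno;
   return 0;
}

static std::string describe(const char* step, int err)
{
   return std::string(step) + " failed: " + std::strerror(err);
}

// Runs drop -> restore -> drop and reports the first step that fails.
std::string runStartupSequence(uid_t uid, gid_t gid)
{
   if (::geteuid() != 0)
      return "skipped: not running as root";

   int err = dropEffective(uid, gid);   // step 1: before the scratch path check
   if (err != 0)
      return describe("step 1 (first drop)", err);
   assert(::geteuid() == uid);          // verify the drop took effect

   err = restoreEffectiveRoot();        // step 2: root needed for /var/log
   if (err != 0)
      return describe("step 2 (restore root)", err);

   err = dropEffective(uid, gid);       // step 3: the second drop
   if (err != 0)
      return describe("step 3 (second drop)", err);
   assert(::geteuid() == uid);

   return "ok";
}
```

Calling `runStartupSequence` as root with the server user's UID and GID inside the affected environment can help show which of the three steps that environment rejects. Started as a non-root user, it returns the "skipped" message without changing anything.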

SageMaker is working around the issue by commenting out the following code in AbstractMain.cpp: https://github.com/rstudio/rstudio-launcher-plugin-sdk/blob/6c676dd10dc7b09a0635c630454922c8f9d212a9/sdk/src/AbstractMain.cpp#L244-L250

nahara7 commented 4 months ago

So far I haven't reproduced this locally. I've tried a few configurations of user ownership to see if it would arise, but everything is working well. I did notice that the binary lives in a different user's directory than the server user's:

ERROR SystemError error 1 (Operation not permitted) [subcategory: system]; OCCURRED AT temporarilyDropPrivileges /local/home/jedambr/RStudioUpdate2022/src/Rstudio-launcher-plugin-sdk/third-party-src/sdk/src/system/PosixSystem.cpp:262

Where sagemaker is the server user.

If that directory has group permissions for the server user, all looks well.

I can still go ahead and remove the second restoreRoot call. But to really see why this is happening, we should look at the user configurations and permissions of the SageMaker environment.
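One way to capture that user configuration would be to log the process credentials right before each privilege change. The sketch below is only a suggestion: `describeCredentials` is a hypothetical helper, not part of the SDK, and it relies on `getresuid`/`getresgid`, which are Linux/BSD extensions rather than POSIX.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // getresuid/getresgid are Linux/BSD extensions
#endif
#include <sstream>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical diagnostic helper; not part of the SDK.
// Returns the real, effective, and saved user and group IDs of the
// current process as a single log-friendly line.
std::string describeCredentials()
{
   uid_t ruid = 0, euid = 0, suid = 0;
   gid_t rgid = 0, egid = 0, sgid = 0;
   if (::getresuid(&ruid, &euid, &suid) != 0 ||
       ::getresgid(&rgid, &egid, &sgid) != 0)
      return "credentials unavailable";

   std::ostringstream out;
   out << "ruid=" << ruid << " euid=" << euid << " suid=" << suid
       << " rgid=" << rgid << " egid=" << egid << " sgid=" << sgid;
   return out.str();
}
```

If the effective or saved UID is not root at the point of the second drop, that would narrow down where the root privilege was lost.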

nahara7 commented 4 months ago

Moving to backlog since this has not been reproduced across the team

kfeinauer commented 1 month ago

AWS is silent on this and we cannot reproduce it, so we'll re-address this when it comes back up.