jupyter-server / jupyter_server

The backend—i.e. core services, APIs, and REST endpoints—to Jupyter web applications.
https://jupyter-server.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Auth Log #1160

Open JosephTLucas opened 1 year ago

JosephTLucas commented 1 year ago

Redirected from this issue originally filed with JupyterLab: https://github.com/jupyterlab/jupyterlab/issues/13679

Problem

I am a member of an AI Red Team and regularly perform offensive security tests in Jupyter environments. These may be hosted in major cloud service providers, on development machines, or on developers' personal hosts. Usually, my team has gained access to the machine through some non-Jupyter-related mechanism, and our post-exploitation activities involve interacting with the target's Jupyter instance. Sometimes, overly permissive configuration decisions in Jupyter are what enable our access.

As part of my mission to improve the overall security of the ML Operations Lifecycle, this feature request is to create more artifacts that would enable detection, response, and recovery of malicious activity. The first such artifact may be an "auth log". This log would enable incident responders and threat hunters to assess historical activity on the Jupyter server.

Proposed Solution

As far as I understand it, Jupyter access may be controlled by one shared password (not user-based authentication) and/or a token string that is exchanged for a 30-day cookie (the default). Therefore, I don't believe users are necessarily "loggable" by a unique field such as username. However, I propose logging other information that may be useful for incident response such as:

- Authentication Time: When did any user present a password, token, or cookie to the server?
- Authentication Mechanism: Cookie, Password, or Token?
- Authentication Status: Was the action successful?
- IP Address: What was the remote IP address?
- User-Agent String: What was the user-agent string of the browser sending the connection request?

There may be other useful fields, but this is a sample of the kind of information I propose JupyterLab captures.
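To make the proposal concrete, here is a rough sketch of what one auth log record could look like, written with only the Python standard library. Everything named here (`AuthEvent`, `make_auth_event`, `log_auth_event`, the `jupyter_auth_sketch` logger, and the JSON field names) is hypothetical and not part of Jupyter Server; it only illustrates the five fields listed above.

```python
import json
import logging
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Hypothetical logger name; this is not an existing Jupyter Server logger.
auth_logger = logging.getLogger("jupyter_auth_sketch")


@dataclass
class AuthEvent:
    """One authentication attempt, using the fields proposed above."""
    time: str        # ISO 8601 timestamp (UTC)
    mechanism: str   # "cookie", "password", or "token"
    success: bool    # whether the attempt succeeded
    remote_ip: str   # remote address of the request
    user_agent: str  # User-Agent header sent with the request


def make_auth_event(mechanism, success, remote_ip, user_agent, now=None):
    """Build an AuthEvent, stamped with the current UTC time by default."""
    if mechanism not in ("cookie", "password", "token"):
        raise ValueError(f"unknown mechanism: {mechanism!r}")
    now = now or datetime.now(timezone.utc)
    return AuthEvent(now.isoformat(), mechanism, success, remote_ip, user_agent)


def log_auth_event(event):
    """Serialize the event as one JSON line and send it to the auth logger."""
    line = json.dumps(asdict(event), sort_keys=True)
    auth_logger.info(line)
    return line
```

One JSON object per line keeps the log easy to grep and to load into whatever tooling a responder already uses. Where such a call would hook into the server's login handling is left open here.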

I do see that there is a more complex identity model that I was not originally aware of. This may enable a better granularity of logging, but I'm not 100% clear on how it's applied in a single-user JupyterLab context.

Additional context

I'm aware that JupyterHub is the "approved solution" for multi-user interactions with the server. However, even in an ideal single-user setup, there may be a need to later review auth logs to verify that the single user has in fact been the only tenant.
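As an illustration of that review step, here is a minimal sketch of how a responder might check for unexpected tenants. It assumes a hypothetical log format of one JSON object per line with `success`, `remote_ip`, and `user_agent` fields; no such log exists in Jupyter Server today, and the function names are made up for this example.

```python
import json
from collections import Counter


def summarize_auth_log(lines):
    """Count successes per (remote_ip, user_agent) and failures per remote_ip."""
    successes = Counter()
    failures = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rec = json.loads(line)
        if rec["success"]:
            successes[(rec["remote_ip"], rec["user_agent"])] += 1
        else:
            failures[rec["remote_ip"]] += 1
    return successes, failures


def unexpected_clients(successes, known_ips):
    """Return the sorted IPs that authenticated successfully but are not expected."""
    return sorted({ip for (ip, _ua) in successes if ip not in known_ips})


# Made-up sample records (documentation-range IP address).
sample = [
    '{"success": true, "remote_ip": "127.0.0.1", "user_agent": "Firefox"}',
    '{"success": false, "remote_ip": "203.0.113.7", "user_agent": "curl"}',
    '{"success": true, "remote_ip": "203.0.113.7", "user_agent": "curl"}',
]
successes, failures = summarize_auth_log(sample)
suspicious = unexpected_clients(successes, known_ips={"127.0.0.1"})
```

With the sample above, any successful authentication from an address outside `known_ips` ends up in `suspicious`, which is the kind of signal a single-user owner would want to follow up on.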

I would be willing to contribute to this (although I have more experience in Python than TS). I also considered that this might be better as an extension than as core functionality. I prefer a "security features by default" model, but I'm open to suggestions/recommendations there as well.

It's also possible that there already is a good solution for this and I just need to take the "best practices" back to my organization. If so, just point me in the right direction!

welcome[bot] commented 1 year ago

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively. You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada: