Kotlin / kotlin-jupyter

Kotlin kernel for Jupyter/IPython
Apache License 2.0

Any Ideas for Addressing Security Concerns like System.getenv()? #429

Open phodal opened 1 year ago

phodal commented 1 year ago

Hi, I use Kotlin Jupyter in my open source project, integrated through the Jupyter API. When a user runs:

System.getenv()

they can read secrets such as API keys and tokens. For example, running this in Datalore returns:

{PATH=/opt/datalore/bin:/opt/python/envs/default/bin:/opt/datalore:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin, LOGSTASH_HOST=10.0.0.248, DISK_PORT=30081, APP_NAME=computation-agent, INSTANCE_ID=i-07e67084ee9f6dd15, LETS_PLOT_MAPTILES_KIND=vector_lets_plot, KERNEL_CLIENT_ENV=/opt/python/envs/kernel_client, MAX_HEAP_SIZE=384m, DUMP_DIR=/tmp/host/agent, EVALUATOR_MODE=container, PWD=/data/notebook_files, LOG_LEVEL=INFO, LANGUAGE=en_US:en, PYTHONPATH=/opt/datalore/python:/var/datalore/manager/.pip:/data/workspace_files, COMPUTATION_HOST=172.17.0.2, PLOTLY_RENDERER=plotly_mimetype, EVALUATOR_KERNEL_TYPE=jupyter, SQL_CELLS_API_PORT=30092, TMPDIR=/tmp, EVALUATOR_LANGUAGE=kotlin, DL_PACKAGE_MANAGER=pip, DEBIAN_FRONTEND=noninteractive, LC_ALL=en_US.UTF-8, LOGSTASH_PORT=30082, EVALUATOR_LOG_LEVEL=INFO, KOTLIN_KERNEL_SELF_CONTAINED_OUTPUTS=true, INSTANCE_TYPE=t2.medium, SHLVL=2, ANACONDA_SOURCE=/mnt/local/anaconda3, WORKBOOK_WORKING_DIR=/data/notebook_files, AGENT_SESSION_TOKEN_PATH=/data/session_token, VAR_DIR=/var/datalore, SQL_CELLS_API_HOST=10.0.0.248, AGENT_MANAGER_PORT=30090, DATALORE_USER=datalore, PYTHON_ENV=/opt/python/envs/default, LANG=en_US.UTF-8, HOST_NAME=ip-10-0-204-162, DATALOREHOME=/opt/datalore, =/opt/python/envs/kernel_client/bin/python3, KOTLIN_JUPYTER_JAVA_OPTS=, LETS_PLOT_HTML_ISOLATED_FRAME=true, DATA_ROOT=remote, DISK_HOST=disk.private.datalore.io, AGENT_RUN_TYPE=ENV, AGENT_JARS_DIR=/opt/datalore/agent, HOSTNAME=ip-10-0-204-162, AGENT_MANAGER_HOST=10.0.0.248, CHECK_ACTIVITY=true, JUPYTER_DATA_DIR=/opt/python/envs/default/share/jupyter, CONFIG_DIR=/etc/datalore, COMPUTATION_PORT=39169, HOME=/home/datalore}

I implemented a basic hook before making a request and after receiving a response. I'm now exploring ways to enhance this solution. Any suggestions?

ileasile commented 1 year ago

Hi! Yes, it's a valid concern. I think the best solution is to use a more locked-down JVM that doesn't execute calls to these sensitive system methods. Another option is a response postprocessor that inspects stream and display_data responses and removes sensitive data from them based on some heuristics.
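
To make the postprocessor idea concrete, here is a minimal sketch of the redaction step only. It assumes you already have the text of a stream or display_data response in hand. The function name `redactSecrets`, the marker list and the `[REDACTED]` placeholder are all hypothetical, not part of kotlin-jupyter or the Jupyter API.

```kotlin
// Hypothetical helper: not part of kotlin-jupyter or the Jupyter API.
// Replaces every occurrence of a sensitive environment value in kernel output.
fun redactSecrets(
    text: String,
    env: Map<String, String> = System.getenv(),
    // Heuristic (assumption): variable names containing these markers hold secrets.
    markers: List<String> = listOf("TOKEN", "SECRET", "KEY", "PASSWORD"),
    // Skip very short values so ordinary output is not mangled.
    minLength: Int = 4
): String {
    val secrets = env
        .filterKeys { name -> markers.any { name.contains(it, ignoreCase = true) } }
        .values
        .filter { it.length >= minLength }
        // Longest first, so a secret that contains another one leaves no fragment behind.
        .sortedByDescending { it.length }
    return secrets.fold(text) { acc, secret -> acc.replace(secret, "[REDACTED]") }
}

fun main() {
    // Fake environment for illustration; real use would rely on the default System.getenv().
    val fakeEnv = mapOf("API_TOKEN" to "s3cr3t-value", "PATH" to "/usr/bin")
    println(redactSecrets("token=s3cr3t-value path=/usr/bin", fakeEnv))
    // prints: token=[REDACTED] path=/usr/bin
}
```

This only filters what passes through the output channel in its literal form. A value that is encoded, split, or sent elsewhere before printing would slip through.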

ileasile commented 1 year ago

However, the last solution isn't safe enough, because you can still execute

Runtime.getRuntime().exec(...)

phodal commented 1 year ago

However, the last solution isn't safe enough, because you can still execute

Runtime.getRuntime().exec(...)

Oops, thanks for the reminder! That gives the same result.

import java.io.BufferedReader
import java.io.InputStreamReader

fun main() {
    try {
        // exec() launches an OS command, not Kotlin code, so run `printenv`
        // (a Unix-like system is assumed) to dump the environment.
        val process = Runtime.getRuntime().exec(arrayOf("printenv"))

        // use {} closes the reader even if reading fails.
        BufferedReader(InputStreamReader(process.inputStream)).use { reader ->
            reader.forEachLine { println(it) }
        }

        process.waitFor()
    } catch (e: Exception) {
        e.printStackTrace()
    }
}
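
Since both System.getenv() and a child process read the same process environment, one possible mitigation is to start the kernel process with an environment that holds no secrets at all. The sketch below assumes you control how the kernel is launched. The helper name `cleanEnvBuilder`, the allow-list and the command are hypothetical and only illustrate the idea.

```kotlin
// Hypothetical launcher helper: the name and the allow-list are illustrative assumptions.
fun cleanEnvBuilder(
    command: List<String>,
    allowed: Set<String> = setOf("PATH", "HOME", "LANG")
): ProcessBuilder {
    val builder = ProcessBuilder(command)
    // environment() starts as a copy of the parent's environment;
    // keep only the allow-listed variable names.
    builder.environment().keys.retainAll(allowed)
    return builder
}

fun main() {
    // On a Unix-like system this prints only the allow-listed variables.
    cleanEnvBuilder(listOf("printenv")).inheritIO().start().waitFor()
}
```

An allow-list is used rather than a deny-list so that a newly added secret is hidden by default. This does not restrict file or network access from the notebook, so it narrows the problem rather than closing it.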