Understanding the purpose of `files`

zekenie commented 9 months ago

I'm really excited about this project and am considering using it. I'm curious what the purpose of files is in the HTTP API; I couldn't find it documented anywhere. In every example I see, it looks something like this:

{
  ...
  files: { "": "someCode()" }
}

What else could go in files? If I wanted to include some data to be available in a sandbox, could it be in files?

TravisRoad commented 9 months ago

In internal/engine/docker.go:168, when the filename is "", it is simply replaced with the Entry value (by default main.sh, main.py, and so on). https://github.com/nalgeon/codapi/blob/3fd59120f57cdf381b86fd1dc9f9c28b91679530/internal/engine/docker.go#L168C1-L179C2

// writeFiles writes request files to the temporary directory.
func (e *Docker) writeFiles(dir string, files Files) error {
    var err error
    files.Range(func(name, content string) bool {
        if name == "" {
            name = e.cmd.Entry
        }
        path := filepath.Join(dir, name)
        err = fileio.WriteFile(path, content, 0444)
        return err == nil
    })
    return err
}

So files: { "": "someCode()" } is equivalent to files: { "main.py": "someCode()" } when the entry is main.py. And I believe you can upload as many files as you want through the files field.
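
To make that rule concrete, here is a rough Python paraphrase of what writeFiles does with the file names. It is only a sketch: resolve_names and its entry argument are hypothetical names used for illustration, and the actual logic is the Go code quoted above.

```python
def resolve_names(files: dict[str, str], entry: str) -> dict[str, str]:
    """Return a copy of files where the empty name is replaced by entry."""
    return {(name or entry): content for name, content in files.items()}


# With entry = "main.py", the short and the explicit form give the same file set.
short_form = resolve_names({"": "someCode()"}, entry="main.py")
long_form = resolve_names({"main.py": "someCode()"}, entry="main.py")
print(short_form == long_form)
```

Files with a non-empty name, such as a data file, pass through unchanged.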

nalgeon commented 9 months ago

Sure, files can contain data. Here is an example of a data.txt file that accompanies a Python script main.py:

data.txt:

one
two
three

main.py:

for line in open("data.txt"):
    print(line)

API request:

{
    "sandbox": "python",
    "command": "run",
    "files": {
        "main.py": "for line in open(\"data.txt\"): print(line)",
        "data.txt": "one\ntwo\nthree"
    }
}

API response:

{
  "id": "python_run_c152676b",
  "ok": true,
  "duration": 687,
  "stdout": "one\n\ntwo\n\nthree\n",
  "stderr": ""
}
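
If you want to try this from a script, a minimal sketch using only the Python standard library might look like the following. The server address is an assumption (it is not stated in this thread), so adjust it to wherever your Codapi server listens. build_request is a hypothetical helper, not part of Codapi.

```python
import json
import urllib.request

# Assumption: address of the Codapi server; not taken from this thread.
CODAPI_URL = "http://localhost:1313/v1/exec"


def build_request(sandbox: str, command: str, files: dict[str, str]) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the JSON body shown above."""
    body = json.dumps({"sandbox": sandbox, "command": command, "files": files})
    return urllib.request.Request(
        CODAPI_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "python",
    "run",
    {
        "main.py": 'for line in open("data.txt"): print(line)',
        "data.txt": "one\ntwo\nthree",
    },
)

# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["stdout"])
```

The blank lines in stdout come from print adding its own newline to lines that already end with one.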

zekenie commented 9 months ago

Fantastic, thanks for the explanation. Do you want any pull requests on docs stuff as I discover things?

nalgeon commented 9 months ago

The docs are in the early stages, and I'm not sure about the final structure. So it's probably best if you ask questions as they come up, and I'll answer them and eventually come up with a doc :)

zekenie commented 9 months ago

Is there any way to give a container access to volumes? (I know in many cases this would be a terrible idea.)

nalgeon commented 9 months ago

Could you please describe the use case in more detail?

zekenie commented 9 months ago

It's a little hard to explain, but I'm building a prototype of this idea. It's a chat product and a REPL. Every channel is supposed to be a directory on the filesystem. I want to be able to have chats like this one. I also think this format will be great for LLMs.

person0: Hey @ person1 what do you think we should do first?
person1: let's fetch some data from an api
person0: (js) await fetch('...')
person0: now what?
person1: let's read the sqlite db in this dir and compare each result to the data from the api, then we can make a chart with vegalite?
person0: ...

Importantly, when I execute code, it's code I write on my computer. I'm thinking that every channel (or folder) could have a Dockerfile which plugs into a codapi sandbox. The "standard lib" lets you do a set of things, but if you want a particular channel to have special behavior, you can just extend the Dockerfile. I also want the commands to be able to read from and write to a shared volume if configured to do so.

zekenie commented 9 months ago

I'm going to make the "standard" runtime have access to an empty SQLite db, Vega-Lite, HTTP, etc., but if someone wanted to create a channel with another sandbox that did something in Python, that'd be fine.

nalgeon commented 9 months ago

I'm afraid I don't get it, sorry :)

Anyway, if you want persistent volumes, you can prepare and run a persistent container with volumes using docker run with the --volume argument, or using docker compose up with a suitable configuration.

Then, on the Codapi side, you can set up a sandbox to execute commands in this persistent container. Here is an example of a sandbox configuration file (e.g. configs/commands/python.json):

{
    "run": {
        "engine": "docker",
        "entry": "main.py",
        "steps": [
            {
                "box": "python",
                "action": "exec",
                "command": ["python", "/path/to/main.py"]
            }
        ]
    }
}

Where:

python is the name of the running Docker container.
/path/to/ is the path inside the container that you've mapped to the path on the host machine using Docker volumes.
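
For completeness, starting such a persistent container could look roughly like the sketch below. Several specifics are assumptions made up for illustration: the host directory /srv/channel, the python:3.12 image, and the sleep infinity command that keeps the container alive. The container name python and the mount point /path/to match the config above.

```python
import shlex

HOST_DIR = "/srv/channel"   # hypothetical host directory to share
CONTAINER_DIR = "/path/to"  # path used in the sandbox config above

docker_run = [
    "docker", "run",
    "--detach",                                 # keep running in the background
    "--name", "python",                         # the name the "box" field refers to
    "--volume", f"{HOST_DIR}:{CONTAINER_DIR}",  # map the host dir into the container
    "python:3.12",                              # assumed image
    "sleep", "infinity",                        # keep the container alive for docker exec
]

# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(docker_run))
# To run it for real: subprocess.run(docker_run, check=True)
```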

zekenie commented 9 months ago

I see! I thought that python was an image name, but it's a container name, and codapi doesn't handle booting your containers. That makes sense.

nalgeon commented 9 months ago

Note the action = exec parameter: it makes a big difference. It means that the step will be executed using docker exec, so in this case python is the name of the existing (and running) container.

The default value for action is different (action = run). It means that the step will be executed using docker run, so in this case python would be the name of the box defined in configs/boxes.json that maps to an image.
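
Put differently, the box field is interpreted in two different ways. The sketch below is a deliberately simplified illustration and not Codapi's actual code: docker_argv, the BOXES mapping, and the codapi/python image name are made up, and a real docker run invocation would carry more options than shown here.

```python
# Hypothetical stand-in for configs/boxes.json: box name -> image.
BOXES = {"python": "codapi/python"}


def docker_argv(step: dict) -> list[str]:
    """Translate a step into a simplified docker command line."""
    action = step.get("action", "run")  # "run" is the default action
    if action == "exec":
        # box is the name of an existing, running container
        return ["docker", "exec", step["box"], *step["command"]]
    if action == "run":
        # box is looked up in the boxes config to find the image
        return ["docker", "run", BOXES[step["box"]], *step["command"]]
    raise ValueError(f"unknown action: {action}")


exec_step = {"box": "python", "action": "exec", "command": ["python", "/path/to/main.py"]}
run_step = {"box": "python", "command": ["python", "main.py"]}

print(docker_argv(exec_step))
print(docker_argv(run_step))
```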

zekenie commented 9 months ago

I see, that clarifies things, thank you!
