devcontainers / images

Repository for pre-built dev container images published under mcr.microsoft.com/devcontainers
https://containers.dev
MIT License

[Golang dev-images] The temporary file that causes shell scripts to fail #566

Open Aisuko opened 1 year ago

Aisuko commented 1 year ago

Hi, guys. Thanks for working on this project. I am a big fan of the dev-containers project and have a lot of experience with it. I hit an issue when using the sed command in the mcr.microsoft.com/devcontainers/go:0-1.20-bullseye container: sed always leaves a temporary file behind, and that causes my shell scripts to fail.

However, the same commands work fine on my local laptop (M1 Pro). Please help me figure out the reason, thanks.

Makefile target (shell commands)

## GPT4ALL
gpt4all:
    git clone --recurse-submodules $(GPT4ALL_REPO) gpt4all
    cd gpt4all && git checkout -b build $(GPT4ALL_VERSION) && git submodule update --init --recursive --depth 1
    # This is hackish, but needed as both go-llama and go-gpt4allj have their own version of ggml..
    @find ./gpt4all -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} +
    @find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} +
    @find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} +
    @find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/gpt_/gptj_/g' {} +
    @find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/gpt_/gptj_/g' {} +
    @find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/set_console_color/set_gptj_console_color/g' {} +
    @find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/set_console_color/set_gptj_console_color/g' {} +
    @find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
    @find ./gpt4all -type f -name "*.go" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
    @find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
    @find ./gpt4all -type f -name "*.txt" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
    @find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/json_/json_gptj_/g' {} +
    @find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/void replace/void json_gptj_replace/g' {} +
    @find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/::replace/::json_gptj_replace/g' {} +
    mv ./gpt4all/gpt4all-backend/llama.cpp/llama_util.h ./gpt4all/gpt4all-backend/llama.cpp/gptjllama_util.h

The devcontainer.json configuration is shown below.

// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-docker-compose
{
    "name": "Existing Docker Compose (Extend)",

    // Update the 'dockerComposeFile' list if you have more compose files or use different names.
    // The .devcontainer/docker-compose.yml file contains any overrides you need/want to make.
    "dockerComposeFile": [
        "../docker-compose.yaml",
        "docker-compose.yml"
    ],

    // The 'service' property is the name of the service for the container that VS Code should
    // use. Update this value and .devcontainer/docker-compose.yml to the real service name.
    "service": "api",

    // The optional 'workspaceFolder' property is the path VS Code should open by default when
    // connected. This is typically a file mount in .devcontainer/docker-compose.yml
    "workspaceFolder": "/workspace",

    "features": {
        "ghcr.io/devcontainers/features/go:1": {},
        "ghcr.io/azutake/devcontainer-features/go-packages-install:0": {}
    },

    // Features to add to the dev container. More info: https://containers.dev/features.
    // "features": {},

    // Use 'forwardPorts' to make a list of ports inside the container available locally.
    // "forwardPorts": [],

    // Uncomment the next line if you want start specific services in your Docker Compose config.
    // "runServices": [],

    // Uncomment the next line if you want to keep your containers running after VS Code shuts down.
    // "shutdownAction": "none",

    // Uncomment the next line to run commands after the container is created.
    "postCreateCommand": "make prepare"

    // Configure tool-specific properties.
    // "customizations": {},

    // Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.
    // "remoteUser": "devcontainer"
}

Docker-compose file

version: '3.6'
services:
  # Update this to the name of the service you want to work with in your docker-compose.yml file
  api:
    # Uncomment if you want to override the service's Dockerfile to one in the .devcontainer 
    # folder. Note that the path of the Dockerfile and context is relative to the *primary* 
    # docker-compose.yml file (the first in the devcontainer.json "dockerComposeFile"
    # array). The sample below assumes your primary file is in the root of your project.
    #
    build:
      context: .
      dockerfile: .devcontainer/Dockerfile

    volumes:
      # Update this to wherever you want VS Code to mount the folder of your project
      - .:/workspace:cached

    # Uncomment the next four lines if you will use a ptrace-based debugger like C++, Go, and Rust.
    # cap_add:
    #   - SYS_PTRACE
    # security_opt:
    #   - seccomp:unconfined

    # Overrides default command so things don't shut down after the process ends.
    command: /bin/sh -c "while sleep 1000; do :; done"

Dockerfile

ARG GO_VERSION=1.20
FROM mcr.microsoft.com/devcontainers/go:0-$GO_VERSION-bullseye
RUN apt-get update && apt-get install -y cmake

Output

vscode ➜ /workspace (master) $ make gpt4all
git clone --recurse-submodules https://github.com/go-skynet/gpt4all gpt4all
Cloning into 'gpt4all'...
remote: Enumerating objects: 3126, done.
remote: Counting objects: 100% (468/468), done.
remote: Compressing objects: 100% (88/88), done.
remote: Total 3126 (delta 418), reused 395 (delta 379), pack-reused 2658
Receiving objects: 100% (3126/3126), 9.13 MiB | 10.80 MiB/s, done.
Resolving deltas: 100% (2017/2017), done.
Submodule 'llama.cpp' (https://github.com/manyoso/llama.cpp.git) registered for path 'gpt4all-backend/llama.cpp'
Cloning into '/workspace/gpt4all/gpt4all-backend/llama.cpp'...
remote: Enumerating objects: 1977, done.        
remote: Counting objects: 100% (777/777), done.        
remote: Compressing objects: 100% (57/57), done.        
remote: Total 1977 (delta 732), reused 720 (delta 720), pack-reused 1200        
Receiving objects: 100% (1977/1977), 2.02 MiB | 8.33 MiB/s, done.
Resolving deltas: 100% (1281/1281), done.
Submodule path 'gpt4all-backend/llama.cpp': checked out '03ceb39c1e729bed4ad1dfa16638a72f1843bf0c'
cd gpt4all && git checkout -b build a330bfe26e9e35ca402e16df18973a3b162fb4db && git submodule update --init --recursive --depth 1
Switched to a new branch 'build'
# This is hackish, but needed as both go-llama and go-gpt4allj have their own version of ggml..
sed: couldn't open temporary file ./gpt4all/gpt4all-backend/llama.cpp/tests/sedCQKLDZ: Permission denied
make: *** [Makefile:46: gpt4all] Error 1

vscode ➜ /workspace (master) $ ls -la  ./gpt4all/gpt4all-backend/llama.cpp/tests/
ls: cannot access './gpt4all/gpt4all-backend/llama.cpp/tests/sedCQKLDZ': No such file or directory
total 32
drwxr-xr-x  8 vscode vscode   256 May 16 09:06 .
drwxr-xr-x 40 vscode vscode  1280 May 16 09:06 ..
-rw-r--r--  1 vscode vscode   498 May 16 09:06 CMakeLists.txt
-?????????  ? ?      ?          ?            ? sedCQKLDZ
-rw-r--r--  1 vscode vscode  1733 May 16 09:06 test-double-float.c
-rw-r--r--  1 vscode vscode  5156 May 16 09:06 test-quantize-fns.cpp
-rw-r--r--  1 vscode vscode 11553 May 16 09:06 test-quantize-perf.cpp
-rw-r--r--  1 vscode vscode  2680 May 16 09:06 test-tokenizer-0.cpp

Aisuko commented 1 year ago

It looks like -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} + is what creates the sedCQKLDZ temporary file.
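
For context, GNU sed's in-place mode creates a temporary file named sedXXXXXX next to the file being edited and then renames it over the original, which matches the sedCQKLDZ name in the error. A rough sketch of that sequence follows. The function name emulate_sed_inplace is hypothetical, and the sketch is simplified: unlike sed itself, it does not copy the original file's permissions onto the new file.

```shell
#!/bin/sh
# emulate_sed_inplace SCRIPT FILE
# Hypothetical helper that approximates the steps of `sed -i`.
emulate_sed_inplace() {
    script=$1
    file=$2
    dir=$(dirname "$file")

    # Step 1: create the temporary file in the SAME directory as the target.
    # This is the step reported as "couldn't open temporary file" in the log.
    tmp=$(mktemp "$dir/sedXXXXXX") || return 1

    # Step 2: write the edited contents to the temporary file.
    if sed -e "$script" "$file" > "$tmp"; then
        # Step 3: rename the temporary file over the original.
        mv -f "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

If step 1 fails for a directory inside /workspace but succeeds under /tmp, that would point at the workspace mount rather than at sed itself.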

Aisuko commented 1 year ago

In the container, sed with the -i'' option writes its output to a temporary file in the same folder as the file being edited, and creating that temporary file is what fails. It works after replacing -i with an explicit temporary file and mv: -exec sh -c "sed 's/ggml_/ggml_bert_/g' {} > {}.tmp && mv {}.tmp {}" \;.

However, I still do not know why it failed in the container environment.
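
A variant of this workaround, as a minimal sketch: stage the edited output under the system temp directory instead of next to the target, then write it back into the existing file. The helper name replace_in_file is hypothetical, and the sketch assumes the failure is tied to creating new files inside the mounted workspace, which has not been confirmed in this thread.

```shell
#!/bin/sh
# replace_in_file SCRIPT FILE
# Hypothetical helper: apply a sed script to FILE without using `sed -i`.
replace_in_file() {
    script=$1
    file=$2

    # Stage the result in the default temp directory ($TMPDIR or /tmp),
    # so no new file is created inside the workspace mount.
    tmp=$(mktemp) || return 1

    if sed -e "$script" "$file" > "$tmp"; then
        # Overwrite the contents of the existing file; the file itself
        # (and its permissions) stays in place, nothing is renamed.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi

    rm -f "$tmp"
    return "$status"
}

# Example use from a script (paths and pattern taken from the Makefile above):
#   find ./gpt4all -type f -name '*.c' | while IFS= read -r f; do
#       replace_in_file 's/ggml_/ggml_gptj_/g' "$f"
#   done
```

Because the target is overwritten rather than replaced by a rename, an interrupted run can leave a file partially written, so this trades atomicity for not needing to create files in the workspace folder.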