Cortys / unsafe-go-classifier

Automatic classification of "unsafe" usages in Go programs
MIT License

panic: Could not find non-cache file #1

Closed antoniozh closed 1 year ago

antoniozh commented 2 years ago

I tried to classify a line in a Go file found by go-geiger and a critical error occurred. Here is how I reproduce the error:

antoniozhu@WS0075-21:/tmp/tmps726q7oa/grpc-go$ go-geiger --show-code . | grep strings_unsafe.go
/mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/protobuf@v1.25.0/internal/strs/strings_unsafe.go:17:8: Data unsafe.Pointer
/mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/protobuf@v1.25.0/internal/strs/strings_unsafe.go:21:8: Data unsafe.Pointer
/mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/protobuf@v1.25.0/internal/strs/strings_unsafe.go:33:24: src := (*sliceHeader)(unsafe.Pointer(&b))
/mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/protobuf@v1.25.0/internal/strs/strings_unsafe.go:34:25: dst := (*stringHeader)(unsafe.Pointer(&s))
/mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/protobuf@v1.25.0/internal/strs/strings_unsafe.go:45:25: src := (*stringHeader)(unsafe.Pointer(&s))
/mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/protobuf@v1.25.0/internal/strs/strings_unsafe.go:46:24: dst := (*sliceHeader)(unsafe.Pointer(&b))

My script gave me the following docker command: docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build -v /mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/protobuf@v1.25.0/internal:/projects usgoc/pred:latest --project strs --line 17 --package strs --file strings_unsafe.go predict -m WL2GNN

After checking the arguments, I changed them as follows: docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build -v /mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org:/projects usgoc/pred:latest --project protobuf@v1.25.0 --line 17 --package strs --file strings_unsafe.go predict -m WL2GNN

I thought that --project should point to the folder containing the go.mod file. Neither command worked; both produced the non-cache file error, so I figured that I should perhaps add the relative file path to the --file argument as well:

docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build -v /mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org:/projects usgoc/pred:latest --project protobuf@v1.25.0 --line 17 --package strs --file internal/strs/strings_unsafe.go predict -m WL2GNN

But all commands resulted in the following error:

docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build -v /mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org:/projects usgoc/pred:latest --project protobuf@v1.25.0 --line 17 --package strs --file strings_unsafe.go predict -m WL2GNN
panic: Could not find non-cache file.

goroutine 1 [running]:
github.com/stg-tud/unsafe_go_study_results/scripts/data-acquisition-tool/cfg.GetCFG(0xc0001a1d48, {0x7ffcf9a61e6e, 0xc0001a1d78})
        /app/cfg/cfg.go:701 +0x878
github.com/stg-tud/unsafe_go_study_results/scripts/data-acquisition-tool/cmd.glob..func3(0xed4c20, {0xa4adf7, 0xc, 0xc})
        /app/cmd/cfg.go:23 +0xd9
github.com/spf13/cobra.(*Command).execute(0xed4c20, {0xc0001ca0c0, 0xc, 0xc})
        /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846 +0x5f8
github.com/spf13/cobra.(*Command).ExecuteC(0xed56a0)
        /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x3ad
github.com/spf13/cobra.(*Command).Execute(...)
        /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
github.com/stg-tud/unsafe_go_study_results/scripts/data-acquisition-tool/cmd.Execute()
        /app/cmd/root.go:21 +0x25
main.main()
        /app/main.go:8 +0x17
Traceback (most recent call last):
  File "src/usgoc/run_prediction.py", line 147, in <module>
    cli(obj=dict())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 38, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "src/usgoc/run_prediction.py", line 117, in predict
    cfg = get_cfg_json(**obj)
  File "src/usgoc/run_prediction.py", line 36, in get_cfg_json
    encoding="utf-8")
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/bin/data-acquisition-tool', 'cfg', '--base', '/projects', '--project', 'protobuf@v1.25.0', '--package', 'strs', '--file', 'strings_unsafe.go', '--line', '17', '--snippet', '']' returned non-zero exit status 2.
antoniozh commented 2 years ago

It seems that the data-acquisition-tool struggles to find a Go file that sits in a subfolder of the project.

antoniozhu@WS0075-21:/mnt/c/Users/antonio.zhu/Documents/repos/praktikum/unsafe-toolkit/dummy_project/dummy_package$ ls -d $PWD/*
/mnt/c/Users/antonio.zhu/Documents/repos/praktikum/unsafe-toolkit/dummy_project/dummy_package/main.go
antoniozhu@WS0075-21:/mnt/c/Users/antonio.zhu/Documents/repos/praktikum/unsafe-toolkit/dummy_project/dummy_package$ docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build -v /mnt/c/Users/antonio.zhu/Documents/repos/praktikum/unsafe-toolkit:/projects usgoc/pred:latest --project dummy_project --line 11 --package dummy_package --file main.go predict -m WL2GNN
panic: Could not find non-cache file.

goroutine 1 [running]:
github.com/stg-tud/unsafe_go_study_results/scripts/data-acquisition-tool/cfg.GetCFG(0xc000121d48, {0x7ffeb335fe72, 0xc000121d78})
        /app/cfg/cfg.go:701 +0x878
github.com/stg-tud/unsafe_go_study_results/scripts/data-acquisition-tool/cmd.glob..func3(0xed4c20, {0xa4adf7, 0xc, 0xc})
        /app/cmd/cfg.go:23 +0xd9
github.com/spf13/cobra.(*Command).execute(0xed4c20, {0xc00014a0c0, 0xc, 0xc})
        /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846 +0x5f8
github.com/spf13/cobra.(*Command).ExecuteC(0xed56a0)
        /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950 +0x3ad
github.com/spf13/cobra.(*Command).Execute(...)
        /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
github.com/stg-tud/unsafe_go_study_results/scripts/data-acquisition-tool/cmd.Execute()
        /app/cmd/root.go:21 +0x25
main.main()
        /app/main.go:8 +0x17
Traceback (most recent call last):
  File "src/usgoc/run_prediction.py", line 147, in <module>
    cli(obj=dict())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 38, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "src/usgoc/run_prediction.py", line 117, in predict
    cfg = get_cfg_json(**obj)
  File "src/usgoc/run_prediction.py", line 36, in get_cfg_json
    encoding="utf-8")
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/bin/data-acquisition-tool', 'cfg', '--base', '/projects', '--project', 'dummy_project', '--package', 'dummy_package', '--file', 'main.go', '--line', '11', '--snippet', '']' returned non-zero exit status 2.

When I move the file to the root directory of the project, it works:

antoniozhu@WS0075-21:/mnt/c/Users/antonio.zhu/Documents/repos/praktikum/unsafe-toolkit/dummy_project/dummy_package$ mv main.go ..
antoniozhu@WS0075-21:/mnt/c/Users/antonio.zhu/Documents/repos/praktikum/unsafe-toolkit/dummy_project/dummy_package$ docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build -v /mnt/c/Users/antonio.zhu/Documents/repos/praktikum/unsafe-toolkit:/projects usgoc/pred:latest --project dummy_project --line 11 --package dummy_package --file main.go predict -m WL2GNN
2022-02-10 17:52:47.688147: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py:1395: calling gather (from tensorflow.python.ops.array_ops) with validate_indices is deprecated and will be removed in a future version.
Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
2022-02-10 17:52:48.393910: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2022-02-10 17:52:48.394238: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2495995000 Hz
[{"cast-basic": 8.219590199587401e-07, "cast-bytes": 1.8294153747433484e-08, "cast-header": 9.329493195764371e-07, "cast-pointer": 6.143737323327514e-07, "cast-struct": 1.0154550182051025e-05, "definition": 0.00011404613178456202, "delegate": 0.9998230338096619, "memory-access": 2.007181137742009e-05, "pointer-arithmetic": 2.608248541946523e-05, "syscall": 3.7869078823860036e-06, "unused": 4.3619456846499816e-07}, {"atomic": 0.0034588424023240805, "efficiency": 7.757153070997447e-06, "ffi": 0.9953190684318542, "generics": 0.00041620503179728985, "hide-escape": 2.2497886675409973e-05, "layout": 0.0005000202800147235, "no-gc": 4.292558878660202e-05, "reflect": 4.836152129428228e-06, "serialization": 0.0002216835127910599, "types": 2.287283450641553e-06, "unused": 3.897362603311194e-06}]
Cortys commented 2 years ago

Hi! This issue occurs because the CFG generator retrieves the source files of a package via Go's packages.Load function. The string provided via the --package CLI argument must therefore be a valid fully qualified Go package name, i.e. the same import path that you would write in a Go import statement.

For your particular issue, the problem should therefore be fixable by using --project protobuf@v1.25.0 --package google.golang.org/protobuf/internal/strs with the volume mount -v /mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org:/projects.

In case the way project lookup works is unclear: it does not matter how you split the project path between the /projects volume mount and the --project argument, i.e. -v /a/b:/projects + --project c is completely equivalent to -v /a:/projects + --project b/c. Also note that the combined project dir (the /projects mount point concatenated with the --project argument) should typically contain a go.mod file; otherwise the fully qualified module name for the packages within the project and the version numbers of project-external package dependencies are unknown.
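The equivalence of the two ways of splitting the path can be sketched in Python. This is only an illustration: the helper names `split_project_dir`, `combined_dir` and `docker_args` are hypothetical, and the argument list simply mirrors the docker commands quoted in this thread rather than any documented interface.

```python
from pathlib import PurePosixPath


def split_project_dir(project_dir: str, depth: int = 1):
    """Split an absolute host path into (mount_source, project_arg).

    depth is the number of trailing path components that go into --project;
    everything before them becomes the source of the /projects volume mount.
    (Hypothetical helper, not part of the classifier.)
    """
    parts = PurePosixPath(project_dir).parts
    if depth < 1 or depth >= len(parts):
        raise ValueError("depth must leave at least the root for the mount")
    mount_source = PurePosixPath(*parts[:-depth])
    project_arg = PurePosixPath(*parts[-depth:])
    return str(mount_source), str(project_arg)


def combined_dir(mount_source: str, project_arg: str) -> str:
    """Host directory that the tool effectively treats as the project dir."""
    return str(PurePosixPath(mount_source) / project_arg)


def docker_args(mount_source, project_arg, package, file, line):
    """Build the argument list in the shape used by the commands in this thread."""
    return [
        "docker", "run", "--rm",
        "-v", "go_mod:/root/go/pkg/mod",
        "-v", "go_cache:/root/.cache/go-build",
        "-v", f"{mount_source}:/projects",
        "usgoc/pred:latest",
        "--project", project_arg,
        "--line", str(line),
        "--package", package,
        "--file", file,
        "predict", "-m", "WL2GNN",
    ]


# Both splits describe the same project directory /a/b/c.
shallow = split_project_dir("/a/b/c", depth=1)  # mount /a/b, --project c
deep = split_project_dir("/a/b/c", depth=2)     # mount /a,   --project b/c
```

Whichever split is used, `combined_dir(*shallow)` and `combined_dir(*deep)` name the same directory, and that directory is the one that should contain the go.mod file.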

I hope this helps. Let me know if you need any further information. :slightly_smiling_face:

antoniozh commented 2 years ago

Hi, this works! Since I generate the arguments with Python, do you know of a way to derive the fully qualified package name from the go-geiger output? Is the fully qualified package name the path of the package relative to the projects folder path?

Cortys commented 2 years ago

Great! Since I'm not very familiar with the go-geiger tool, I'm not sure what the easiest way to obtain the full package name is; maybe @gh0st42 or @akwick have an idea. 🙂

Alternatively, as you suggested, deriving the package name from the path in go/pkg/mod should probably also be possible. With this approach you would have to remove the version tags from module directories (e.g. the @v1.25.0 part).
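A minimal Python sketch of this path-based derivation follows. The function name `import_path_from_mod_cache` is hypothetical, and the sketch assumes that the file lives under a `pkg/mod` directory whose layout mirrors the import path. It also undoes the module cache's case escaping, in which an uppercase letter is stored as `!` followed by the lowercase letter.

```python
import posixpath
import re

# "@v1.25.0"-style version tags that the module cache appends to module dirs.
_VERSION_TAG = re.compile(r"@[^/]+")
# Module cache case escaping: "!b" on disk stands for "B" in the import path.
_ESCAPED_UPPER = re.compile(r"!([a-z])")


def import_path_from_mod_cache(file_path: str, marker: str = "/pkg/mod/") -> str:
    """Guess a package import path from a file path inside the Go module cache.

    Hypothetical helper: it only inspects the path and never reads go.mod,
    so it is wrong whenever the directory does not mirror the import path.
    """
    _, found, relative = file_path.partition(marker)
    if not found:
        raise ValueError(f"{file_path!r} is not below a {marker!r} directory")
    package_dir = posixpath.dirname(relative)          # drop the file name
    package_dir = _VERSION_TAG.sub("", package_dir)    # drop "@<version>"
    return _ESCAPED_UPPER.sub(lambda m: m.group(1).upper(), package_dir)


guess = import_path_from_mod_cache(
    "/mnt/c/Users/antonio.zhu/go/pkg/mod/google.golang.org/"
    "protobuf@v1.25.0/internal/strs/strings_unsafe.go"
)
```

For the go-geiger path from the original report, `guess` is `google.golang.org/protobuf/internal/strs`, which is the value suggested for `--package` above.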

antoniozh commented 2 years ago

Hi, there are some exceptions when deriving the fully qualified package name from the path. For example, when I build the following project and try to analyze its files, the package name is one that I would not find in the file system. Is there any other way to find out the package name for a Go file?

This is the generated command:

docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build usgoc/pred:latest --project crypto@v0.0.0-20220215181150-74469fa99b22 --line 18 --package github.com/drakkan/crypto/internal/subtle --file aliasing.go --base /root/go/pkg/mod/github.com/drakkan predict -m WL2GNN

This is the fixed command; I found the correct package name in a comment in the aliasing.go file.

docker run --rm -v go_mod:/root/go/pkg/mod -v go_cache:/root/.cache/go-build usgoc/pred:latest --project crypto@v0.0.0-20220215181150-74469fa99b22 --line 18 --package golang.org/x/crypto/internal/subtle --file aliasing.go --base /root/go/pkg/mod/github.com/drakkan predict -m WL2GNN

Cortys commented 2 years ago

Yes, good point. I hadn't thought about forks. The path-based approach to derive the package name of a file is not sufficient then.

If you can't get the package name from the go-geiger tool, getting the module name from the go.mod file in the --project dir is probably the easiest solution (assuming, of course, that the file with the unsafe usage always belongs to the project and is not part of some dependency of it). A simple regex matching ^module (.+) might already be sufficient.
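As a rough Python sketch of that idea: read the module line from go.mod and append the package directory relative to the project dir. The helper names `module_name` and `package_import_path` are hypothetical, and the go.mod text in the example is a made-up sample, not copied from the actual fork. The pattern is slightly stricter than `^module (.+)` so that a trailing comment on the module line is not captured.

```python
import re
from pathlib import PurePosixPath

# First whitespace-free token after "module" at the start of a line.
_MODULE_LINE = re.compile(r"^module\s+(\S+)", re.MULTILINE)


def module_name(go_mod_text: str) -> str:
    """Extract the module path from the contents of a go.mod file."""
    match = _MODULE_LINE.search(go_mod_text)
    if match is None:
        raise ValueError("no module directive found")
    return match.group(1).strip('"')  # the path may be written as a quoted string


def package_import_path(go_mod_text: str, project_dir: str, file_path: str) -> str:
    """Module path plus the file's directory relative to the project dir.

    Hypothetical helper; assumes file_path lies inside project_dir and that
    project_dir is the directory containing go.mod.
    """
    rel_dir = PurePosixPath(file_path).parent.relative_to(project_dir)
    module = module_name(go_mod_text)
    return module if str(rel_dir) == "." else f"{module}/{rel_dir.as_posix()}"


# Illustrative go.mod contents (assumed for the example).
sample_go_mod = "module golang.org/x/crypto\n\ngo 1.17\n"
project = "/root/go/pkg/mod/github.com/drakkan/crypto@v0.0.0-20220215181150-74469fa99b22"
pkg = package_import_path(sample_go_mod, project, project + "/internal/subtle/aliasing.go")
```

With a go.mod that declares `golang.org/x/crypto`, `pkg` becomes `golang.org/x/crypto/internal/subtle`, even though the files sit in a directory named after the fork.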