grafana / plugin-tools

Create Grafana plugins with ease.
https://grafana.com/developers/plugin-tools/
Apache License 2.0
Backend development environment not working on ARM64 macOS #913

Closed oshirohugo closed 16 hours ago

oshirohugo commented 1 month ago

On macOS, after running:

npx @grafana/create-plugin@latest --pluginName=test \
  --pluginDescription="test plugin" \
  --orgName=test \
  --pluginType=datasource \
  --hasBackend \
  --hasGithubWorkflows \
  --hasGithubLevitateWorkflow && \
cd test-test-datasource && \
npm i && \
npm run build && \
DEVELOPMENT=true npm run server

we get the following error:

test-test-datasource  | 2024-05-13 12:15:25,832 INFO spawned: 'delve' with pid 658
test-test-datasource  | API server listening at: [::]:2345
test-test-datasource  | 2024-05-13T12:15:25Z warning layer=rpc Listening for remote connections (connections are not authenticated nor encrypted)
test-test-datasource  | Warning: no debug info found, some functionality will be missing such as stack traces and variable evaluation.
test-test-datasource  | could not attach to pid 646: could not read debug info (decoding dwarf section info at offset 0x0: too short) and could not read go symbol table (could not read section .gopclntab)
test-test-datasource  | 2024-05-13 12:15:25,992 WARN exited: delve (exit status 1; not expected)
test-test-datasource  | 2024-05-13 12:15:26,996 INFO gave up: delve entered FATAL state, too many start retries too quickly

Grafana launches on port 3000, but the plugin isn't usable and it's not possible to connect a debugger client to delve on port 2345.

The problem seems to be related to the following container security flags:

    security_opt:
      - 'apparmor:unconfined'
      - 'seccomp:unconfined'
    cap_add:
      - SYS_PTRACE

used to allow delve to connect to the running plugin and to a remote debug client.

We used the following commands to verify that the flags were working:

docker exec -it test-test-datasource cat /proc/1/attr/current
docker exec -it test-test-datasource cat /proc/1/status | grep Seccomp
docker exec -it test-test-datasource cat /proc/1/status | grep Cap
docker exec -it test-test-datasource uname -m

The result on Linux (where everything works) is:

$ docker exec -it test-test-datasource cat /proc/1/attr/current
unconfined
$ docker exec -it test-test-datasource cat /proc/1/status | grep Seccomp
Seccomp:        0
Seccomp_filters:        0
$ docker exec -it test-test-datasource cat /proc/1/status | grep Cap
CapInh: 0000000000000000
CapPrm: 00000000a80c25fb
CapEff: 00000000a80c25fb
CapBnd: 00000000a80c25fb
CapAmb: 0000000000000000

On macOS:

➜  ~ docker exec -it test-test-datasource cat /proc/1/attr/current
cat: read error: Invalid argument
➜  ~ docker exec -it test-test-datasource cat /proc/1/status | grep Seccomp
Seccomp:    0
Seccomp_filters:    0
➜  ~ docker exec -it test-test-datasource cat /proc/1/status | grep Cap
CapInh: 0000000000000000
CapPrm: 00000000a80c25fb
CapEff: 00000000a80c25fb
CapBnd: 00000000a80c25fb
CapAmb: 0000000000000000

Therefore it seems like the file /proc/1/attr/current is missing on macOS.

Also, the command

docker inspect --format '{{ .AppArmorProfile }}' test-test-datasource

shows nothing on macOS, while it shows unconfined on Linux.

But the command

docker inspect --format '{{ .HostConfig.SecurityOpt }}' test-test-datasource

shows [apparmor:unconfined seccomp:unconfined] on both platforms.
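
The capability masks in the output above can be decoded to confirm that SYS_PTRACE was actually granted. This is a minimal sketch, assuming CAP_SYS_PTRACE is capability number 19 (its value in the Linux capability headers); the helper name has_cap is hypothetical and not part of any tool mentioned here.

```shell
#!/bin/sh
# has_cap MASK BIT: succeeds when bit BIT is set in the hex capability MASK
# (as printed on the CapEff line of /proc/<pid>/status).
has_cap() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# CAP_SYS_PTRACE is capability number 19 on Linux (assumption stated above).
if has_cap 00000000a80c25fb 19; then
  echo "SYS_PTRACE present"
else
  echo "SYS_PTRACE missing"
fi
```

Both platforms report the same CapEff value, so the capability itself does not appear to be what differs between Linux and macOS.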

jackw commented 2 weeks ago

> Therefore it seems like the file /proc/1/attr/current is missing on macOS.

There is no concept of /proc on macOS. From digging into it a bit, I'm not sure that is the issue here. From playing around with it locally and looking at bug reports about not being able to debug with delve, the issue seems to be a mismatch between amd64 and arm64. If I set GO_ARCH to arm64 and remove platform: 'linux/amd64' from docker-compose.yaml, I can successfully debug using dlv on my M2 MacBook. I guess mixing different architectures for Go and dlv probably isn't compatible. Additionally, I believe macOS uses Rosetta for amd64 support, and possibly this interferes with being able to access port 2345.
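
One way to spot this kind of mismatch is to compare the architecture the container reports with the one the plugin binary and dlv were built for. This is a minimal sketch: the helper name to_goarch and the example build architecture are assumptions for illustration, and only the common uname -m values are mapped.

```shell
#!/bin/sh
# to_goarch MACHINE: translate a `uname -m` value into the matching GOARCH name.
to_goarch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo unknown ;;
  esac
}

# Architecture the plugin binary and dlv were built for (assumed example value).
build_arch="arm64"
# Inside the container this would be: to_goarch "$(uname -m)"
container_arch="$(to_goarch aarch64)"

if [ "$build_arch" = "$container_arch" ]; then
  echo "architectures match: $build_arch"
else
  echo "mismatch: built for $build_arch, container runs $container_arch"
fi
```

With platform: 'linux/amd64' set, uname -m inside the container would be expected to report x86_64 even on Apple silicon, which is where the two values can diverge.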

docker logs

test-test-datasource  | logger=live t=2024-06-17T11:03:03.305004005Z level=debug msg="Client connected" user= client=cfa74495-546d-4a4b-8bf4-f48dcc0ef5d2
test-test-datasource  | logger=authn.service t=2024-06-17T11:03:03.316990171Z level=warn msg="Failed to authenticate request" client=auth.client.session error="user token not found"
test-test-datasource  | logger=anonymous-session-service t=2024-06-17T11:03:03.317495546Z level=debug msg="Tagging device for UI" deviceID=c06ff1653239890482a34b292e864989 device="unsupported value type" key=anon-device:c06ff1653239890482a34b292e864989
test-test-datasource  | logger=token t=2024-06-17T11:03:03.322284255Z level=debug msg=FeatureEnabled feature=accesscontrol.enforcement enabled=false licenseStatus=NotFound hasLicense=false hasValidLicense=false products=[]
test-test-datasource  | 2024-06-17 11:03:05,334 INFO spawned: 'delve' with pid 724
test-test-datasource  | API server listening at: [::]:2345
test-test-datasource  | 2024-06-17T11:03:05Z warning layer=rpc Listening for remote connections (connections are not authenticated nor encrypted)
test-test-datasource  | 2024-06-17 11:03:06,353 INFO success: delve entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
test-test-datasource  | logger=ngalert.scheduler t=2024-06-17T11:03:10.006499633Z level=debug msg="Alert rules fetched" rulesCount=0 foldersCount=0 updatedRules=0

local terminal

dlv connect localhost:2345
Type 'help' for list of commands.
(dlv) break datasource.go:100
Breakpoint 1 set at 0xd622f8 for github.com/test/test/pkg/plugin.(*Datasource).CheckHealth() /root/test-test-datasource/pkg/plugin/datasource.go:100
(dlv) continue
> github.com/test/test/pkg/plugin.(*Datasource).CheckHealth() /root/test-test-datasource/pkg/plugin/datasource.go:100 (hits goroutine(9):1 total:1) (PC: 0xd622f8)
(dlv) continue

jackw commented 1 week ago

It appears that platform: linux/amd64 was added to fix a bug with Apple silicon and Docker. Maybe a better solution to that problem would be to detect the OS in create-plugin at scaffold time and print the build command based on the OS (e.g. make build:Linux / make build:Darwin), rather than force Docker to use a specific platform?
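
The idea could look roughly like the following. This is a minimal sketch only: the function name build_target is hypothetical, the target names simply follow the build:Linux / build:Darwin pattern from the comment, and the real create-plugin scaffolder may handle this differently.

```shell
#!/bin/sh
# build_target KERNEL: pick a build target name from a `uname -s` value.
build_target() {
  case "$1" in
    Linux)  echo "build:Linux" ;;
    Darwin) echo "build:Darwin" ;;
    *)      echo "build" ;;   # fallback for other systems (assumption)
  esac
}

# Print the command the user should run on this machine.
echo "make $(build_target "$(uname -s)")"
```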

grafana-plugins-platform-bot[bot] commented 16 hours ago

:rocket: Issue was released in @grafana/create-plugin@4.14.1 :rocket: