CentaurusInfra / alnair

Intelligent platform for AI workloads
Apache License 2.0
37 stars 12 forks

intercept-lib test instruction doesn't work. #132

Open awang088 opened 2 years ago

awang088 commented 2 years ago

I followed the instructions to manually test the interpose lib, but the test doesn't run through. It fails here:

xxx@fw0016741:/home/xxx/dev/alnair/alnair-device-plugin$ sudo go run cmd/alnair-device-plugin/main.go
cmd/alnair-device-plugin/main.go:4:2: cannot find package "alnair-device-plugin/pkg/devicepluginserver" in any of:
	/usr/lib/go-1.10/src/alnair-device-plugin/pkg/devicepluginserver (from $GOROOT)
	/home/steven/go/src/alnair-device-plugin/pkg/devicepluginserver (from $GOPATH)
xxx@fw0016741:/home/xxx/dev/alnair/alnair-device-plugin$

How can I fix this?

Fizzbb commented 2 years ago

Assume you have a two-node Kubernetes cluster.

1) Install alnair-device-plugin (this way you don't need to build and launch the Alnair device plugin and vgpu-server manually):
kubectl apply -f https://raw.githubusercontent.com/CentaurusInfra/alnair/main/alnair-device-plugin/manifests/alnair-device-plugin.yaml
Verify that the alnair-device-plugin-daemonset-XXXXXX pod is installed under the kube-system namespace on the GPU node.
2) On the GPU worker node, replace /opt/alnair/libcuinterpose.so with the newly built libcuinterpose.so.
3) Launch a test pod with an alnair/vgpu-memory: 4 request (this requests 4 GB of GPU memory).
4) Bash into the pod and check the files used or generated by the intercept lib in /var/lib/alnair/workspace.
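The steps above can be sketched as a small script. This is only a rough outline, not an official test procedure: the pod name `alnair-vgpu-test`, the container name, the `sleep` command, and the placeholder image are assumptions made for illustration, and placing the `alnair/vgpu-memory` resource under `limits` follows the usual Kubernetes convention for extended resources rather than anything stated in this thread. By default the script only prints the test pod manifest; pass `--apply` to run the kubectl steps against a cluster.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values, not taken from the thread; override via the environment.
POD_NAME="${POD_NAME:-alnair-vgpu-test}"
IMAGE="${IMAGE:-your-cuda-image:tag}"   # placeholder: any image that can run a CUDA workload
PLUGIN_MANIFEST="https://raw.githubusercontent.com/CentaurusInfra/alnair/main/alnair-device-plugin/manifests/alnair-device-plugin.yaml"

# Print a test pod manifest that asks for 4 units of alnair/vgpu-memory (step 3).
render_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${POD_NAME}
spec:
  restartPolicy: Never
  containers:
  - name: cuda-test
    image: ${IMAGE}
    command: ["sleep", "3600"]
    resources:
      limits:
        alnair/vgpu-memory: 4
EOF
}

if [ "${1:-}" = "--apply" ]; then
  # Step 1: install the device plugin and confirm the daemonset pod exists.
  kubectl apply -f "$PLUGIN_MANIFEST"
  kubectl get pods -n kube-system -o wide | grep alnair-device-plugin

  # Step 2 is done by hand on the GPU worker node, for example:
  #   sudo cp libcuinterpose.so /opt/alnair/libcuinterpose.so

  # Step 3: launch the test pod and wait until it is ready.
  render_manifest | kubectl apply -f -
  kubectl wait --for=condition=Ready "pod/${POD_NAME}" --timeout=120s

  # Step 4: list the files the intercept lib uses or generates.
  kubectl exec "$POD_NAME" -- ls -l /var/lib/alnair/workspace
else
  render_manifest
fi
```

Running the script without arguments lets you review the manifest before anything touches the cluster. To explore the pod interactively instead of just listing the workspace, `kubectl exec -it <pod> -- bash` works if the chosen image ships bash.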