linode / linode-blockstorage-csi-driver

Container Storage Interface (CSI) Driver for Linode Block Storage
Apache License 2.0

[feat] E2E tests using Chainsaw #178

Closed komer3 closed 2 months ago

komer3 commented 2 months ago

General:

Pull Request Guidelines:

  1. [ ] Does your submission pass tests?
  2. [ ] Have you added tests?
  3. [ ] Are you addressing a single feature in this PR?
  4. [ ] Are your commits atomic, addressing one change per commit?
  5. [ ] Are you following the conventions of the language?
  6. [ ] Have you saved your large formatting changes for a different PR, so we can focus on your work?
  7. [ ] Have you explained your rationale for why this feature is needed?
  8. [ ] Have you linked your PR to an open issue?

PR Purpose: Improve CSI Driver Testing Infrastructure

This PR addresses two key issues with our current CSI driver testing:

  1. Lack of easily maintainable end-to-end (e2e) tests
  2. Absence of e2e tests in GitHub Actions (GHA) PR workflows

Changes implemented:

Benefits:

These changes aim to streamline our testing process and catch potential issues earlier in the development cycle.

Usage: Please see the new README for instructions on running and testing the changes locally.

rahulait commented 2 months ago

I ran the tests twice. They passed the first time and failed the second time due to timing errors.

--- FAIL: chainsaw (0.00s)
    --- PASS: chainsaw/pod-pvc-basic-filesystem (99.00s)
    --- PASS: chainsaw/pod-pvc-expand-storage (119.78s)
    --- FAIL: chainsaw/pod-pvc-expand-raw-block-storage (308.47s)
    --- FAIL: chainsaw/pod-pvc-create-ext4-filesystem (308.47s)
    --- PASS: chainsaw/statefulset-pvc (320.55s)
FAIL
Tests Summary...
- Passed  tests 3
- Failed  tests 2
- Skipped tests 0
Done with failures.
Error: some tests failed
make: *** [Makefile:120: e2e-test] Error 1
Error: error running script "e2e-test" in Devbox: exit status 2

The pods from the failed tests remained in the ContainerCreating state for 5 minutes:

Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Warning  FailedScheduling        4m31s (x2 over 4m38s)  default-scheduler        0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
  Normal   Scheduled               4m29s                  default-scheduler        Successfully assigned chainsaw-polite-mink/e2e-pod to csi-driver-cluster-6a210d0-control-plane-vznwl
  Normal   SuccessfulAttachVolume  4m13s                  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-bd4572901c6d49d6"
  Warning  FailedMapVolume         2s (x10 over 4m12s)    kubelet                  MapVolume.MapPodDevice failed for volume "pvc-bd4572901c6d49d6" : rpc error: code = Internal desc = NodePublishVolume mount of disk failed: mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t ext4 -o bind /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/publish/pvc-bd4572901c6d49d6/d46c4d50-0367-46f8-89c9-0c595eadecc0
Output: mount: can't find /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/publish/pvc-bd4572901c6d49d6/d46c4d50-0367-46f8-89c9-0c595eadecc0 in /etc/fstab

I wonder whether there is a race condition here, or whether we need to wait once the volume is attached. I have seen this happen in 2 of the 5 runs of the e2e tests that I tried.
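
If the failure really is a race between the volume attaching and the mount being attempted, one way to probe it would be to poll for the device path before mounting. The following is only a minimal sketch of that idea, not the driver's actual code: `wait_for_path` is a hypothetical helper, and the timeout and interval values are arbitrary assumptions.

```python
import os
import time


def wait_for_path(path: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Hypothetical helper for illustration only. Returns True if the path
    appeared in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Check first so that an already-present path returns immediately.
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would invoke this with the expected device path before running the mount, and treat a `False` result as "device never appeared" rather than letting the mount fail with a less obvious error.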

rahulait commented 2 months ago

Overall, the PR looks good to me. Not sure if we want to dig further into why it fails sometimes.

nesv commented 2 months ago

Before merging, please don't forget to squash out all of your fixup commits.