Failing K8s tests - Githubissues

LindseySaari commented 1 year ago

Description

In the K8s branch, during the automatic merge from master, there's some tests that fail (looks like something related to an undefined constant).

The same tests also fail in branches that are branched off of k8s - https://github.com/department-of-veterans-affairs/vets-api/pull/12639

Example:

Failures:

  1) EducationForm::Forms::VA1990e 1990e simple spool test generates the spool file correctly
     Failure/Error: windows_linebreak = EducationForm::WINDOWS_NOTEPAD_LINEBREAK

     NameError:
       uninitialized constant EducationForm::WINDOWS_NOTEPAD_LINEBREAK

Failed examples:

rspec './spec/jobs/education_form/forms/va1990e_spec.rb[1:2:1]' # EducationForm::Forms::VA1990e 1990e simple spool test generates the spool file correctly
rspec './spec/jobs/education_form/forms/va1990e_spec.rb[1:1:1]' # EducationForm::Forms::VA1990e 1990e kitchen_sink spool test generates the spool file correctly
rspec './spec/jobs/education_form/forms/va1995_spec.rb[1:2:1]' # EducationForm::Forms::VA1995 1995 kitchen_sink spool test generates the spool file correctly
rspec ./spec/jobs/education_form/forms/va1995_spec.rb:20 # EducationForm::Forms::VA1995#direct_deposit_type converts internal keys to text
rspec './spec/jobs/education_form/forms/va1995_spec.rb[1:1:1]' # EducationForm::Forms::VA1995 1995 minimal spool test generates the spool file correctly
rspec './modules/appeals_api/spec/services/appeals_api/pdf_construction/generator_spec.rb[1:1:3:2:4:1]' # AppealsApi::PdfConstruction::Generator#generate Higher Level Review v3 with max length generates the expected pdf

holdenhinkle commented 1 year ago

I added the platform-sev-s0 label because this issue is blocking code from being merged into k8s branch.

holdenhinkle commented 1 year ago

Trevor might have fixed these failing tests in this PR - https://github.com/department-of-veterans-affairs/vets-api/pull/12712#pullrequestreview-1431860325

holdenhinkle commented 1 year ago

Trevor's PR fixed a handful of the failures, but there are still other undefined constant failures.

The failures seem flakey. Some commits don't have any failures, some have the same 2 or 3 failures. Some just 1. Etc., etc..

commit/run - https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5016049494/jobs/8992415772

Failures:

  1) BankName.get_bank_name with cached bank name returns the cached name
     Failure/Error: return if routing_number.blank? || routing_number == BGS::Service::EMPTY_ROUTING_NUMBER

     NameError:
       uninitialized constant BGS::Service
     # ./app/models/bank_name.rb:16:in `get_bank_name'
     # ./spec/models/bank_name_spec.rb:21:in `block (4 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/webmock-3.18.1/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:124:in `block in run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `loop'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:37:in `block (2 levels) in setup'

  2) BankName.get_bank_name with cache miss returns the bank name and saves it in cache
     Failure/Error: return if routing_number.blank? || routing_number == BGS::Service::EMPTY_ROUTING_NUMBER

     NameError:
       uninitialized constant BGS::Service
     # ./app/models/bank_name.rb:16:in `get_bank_name'
     # ./spec/models/bank_name_spec.rb:29:in `get_bank_name'
     # ./spec/models/bank_name_spec.rb:34:in `block (5 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/vcr-6.1.0/lib/vcr/util/variable_args_block_caller.rb:9:in `call_block'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/vcr-6.1.0/lib/vcr.rb:194:in `use_cassette'
     # ./spec/models/bank_name_spec.rb:33:in `block (4 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/webmock-3.18.1/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:124:in `block in run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `loop'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:37:in `block (2 levels) in setup'

Finished in 2 minutes 25.7 seconds (files took 14.24 seconds to load)
2258 examples, 2 failures, 4 pending

Failed examples:

rspec ./spec/models/bank_name_spec.rb:20 # BankName.get_bank_name with cached bank name returns the cached name
rspec ./spec/models/bank_name_spec.rb:32 # BankName.get_bank_name with cache miss returns the bank name and saves it in cache

commit/run - https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5015784255/jobs/8991805573

Failures:

  1) AppealsApi::PdfConstruction::Generator#generate Higher Level Review v3 with max length generates the expected pdf
     Failure/Error: expect(generated_pdf).to match_pdf(expected_pdf)

Failed examples:

rspec './modules/appeals_api/spec/services/appeals_api/pdf_construction/generator_spec.rb[1:1:3:2:4:1]' # AppealsApi::PdfConstruction::Generator#generate Higher Level Review v3 with max length generates the expected pdf

commit/run - https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5015706301/jobs/8991619402

Failures:

  1) BankName.get_bank_name with cached bank name returns the cached name
     Failure/Error: return if routing_number.blank? || routing_number == BGS::Service::EMPTY_ROUTING_NUMBER

     NameError:
       uninitialized constant BGS::Service
     # ./app/models/bank_name.rb:16:in `get_bank_name'
     # ./spec/models/bank_name_spec.rb:21:in `block (4 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/webmock-3.18.1/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:124:in `block in run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `loop'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:37:in `block (2 levels) in setup'

  2) BankName.get_bank_name with cache miss returns the bank name and saves it in cache
     Failure/Error: return if routing_number.blank? || routing_number == BGS::Service::EMPTY_ROUTING_NUMBER

     NameError:
       uninitialized constant BGS::Service
     # ./app/models/bank_name.rb:16:in `get_bank_name'
     # ./spec/models/bank_name_spec.rb:29:in `get_bank_name'
     # ./spec/models/bank_name_spec.rb:34:in `block (5 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/vcr-6.1.0/lib/vcr/util/variable_args_block_caller.rb:9:in `call_block'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/vcr-6.1.0/lib/vcr.rb:194:in `use_cassette'
     # ./spec/models/bank_name_spec.rb:33:in `block (4 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/webmock-3.18.1/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:124:in `block in run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `loop'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:37:in `block (2 levels) in setup'

Finished in 2 minutes 16.7 seconds (files took 14.26 seconds to load)
2258 examples, 2 failures, 4 pending

And

Failures:

  1) AppealsApi::PdfConstruction::Generator#generate Higher Level Review v3 with max length generates the expected pdf
     Failure/Error: expect(generated_pdf).to match_pdf(expected_pdf)

commit/run - https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5015223769/jobs/8990493144

Failures:

  1) AppealsApi::PdfConstruction::Generator#generate Higher Level Review v3 with max length generates the expected pdf
     Failure/Error: expect(generated_pdf).to match_pdf(expected_pdf)

holdenhinkle commented 1 year ago

PR to fix the two bank_name_spec.rb examples - https://github.com/department-of-veterans-affairs/vets-api/pull/12726

holdenhinkle commented 1 year ago

There's definitely some flakiness happening when running tests in the k8s branch. Consider this merge to k8s - https://github.com/department-of-veterans-affairs/vets-api/commit/3fd71699794d6326e2c25dbbe80ab453f2404939

I ran the failed jobs 3 times, and each time there was a different number of failed tests:

https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5024464556/jobs/9010850812

14955 examples, 5 failures, 30 pendings

https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5024464556/jobs/9010216878

14955 examples, 7 failures, 30 pendings

https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5024464556/jobs/9011402790

14955 examples, 16 failures, 30 pendings

holdenhinkle commented 1 year ago

Analyzing the difference between the GitHub Actions code_checks.yml tests job's steps in master with the equivalent steps ins k8s - https://github.com/department-of-veterans-affairs/vets-api/compare/master...k8s#diff-9ac4e287711080732ab9e793ba196754b133057ca853bbfda5ba34fba8b9dbc9:

master:

name: Build Docker Image
        uses: docker/build-push-action@v4
        with:
          build-args: |
            sidekiq_license=${{ env.BUNDLE_ENTERPRISE__CONTRIBSYS__COM }}
            userid=${{ env.VETS_API_USER_ID }}
          context: .
          target: builder
          push: false
          load: true
          tags: vets-api
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Setup Database
        run: |
             docker-compose -f docker-compose.test.yml run vets-api bash \
             -c "CI=true RAILS_ENV=test DISABLE_BOOTSNAP=true parallel_test -n 13 -e 'bin/rails db:reset'"
      - name: Run Specs
        timeout-minutes: 20
        run: |
             docker-compose -f docker-compose.test.yml run vets-api bash \
             -c "CI=true DISABLE_BOOTSNAP=true bundle exec parallel_rspec spec/ modules/ -n 13 -o '--color --tty'"
      - name: Upload Coverage Report
        uses: actions/upload-artifact@v3
        if: always()
        with:
          name: Coverage Report
          path: coverage
      - name: Upload Test Results
        uses: actions/upload-artifact@v3
        if: always()
        with:
          name: Test Results
          path: log/*.xml
          if-no-files-found: ignore

k8s:

    - name: Build Docker Image
        uses: docker/build-push-action@v4
        with:
          build-args: |
            BUNDLE_ENTERPRISE__CONTRIBSYS__COM=${{ env.BUNDLE_ENTERPRISE__CONTRIBSYS__COM }}
            USER_ID=${{ env.USER_ID }}
          context: .
          push: false
          load: true
          tags: vets-api
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Setup Database and Run Specs
        run: |
          docker-compose run web bash \
          -c "CI=true RAILS_ENV=test DISABLE_BOOTSNAP=true; \
              bundle exec parallel_test -n 13 -e 'bin/rails db:reset' \
              && bundle exec parallel_rspec spec/ modules/ -n 8 -o '--color --tty'"
      - name: Upload Coverage Report
        uses: actions/upload-artifact@v3
        if: always()
        with:
          name: Coverage Report
          path: coverage
      - name: Upload Test Results
        uses: actions/upload-artifact@v3
        if: always()
        with:
          name: Test Results
          path: log/*.xml
          if-no-files-found: ignore

holdenhinkle commented 1 year ago

There are minor differences between the master branch and the k8s branch but perhaps they might be causing the discrepancies we're observing in our tests.

In the k8s branch, both parallel_test and db:reset are being run as part of the same command. However, they are being executed sequentially due to the use of the && operator. This is different from the master branch where these commands are being run in separate steps.

The impact of this can be significant depending on the nature of our tests. Here's why:

In the master branch:

db:reset and parallel_test are run in separate steps and potentially different containers. This ensures that each step is isolated and has its own clean environment.
When db:reset is run, the database is set up with a known state before the tests run. This is done in isolation and no tests are being run that could potentially change the state of the database.
Then, parallel_test is run in its own environment. Any state created during this step won't affect any other steps because each step runs in its own environment.

In the k8s branch:

db:reset and parallel_test are run in the same command. This means they share the same environment and any state created in one could potentially affect the other.
While db:reset is still run before parallel_test due to the use of the && operator, they are not isolated from each other because they are part of the same command.

As a result, the tests could behave differently between the two branches. If any of our tests rely on some kind of global state or have side effects that might affect other tests, this could potentially cause the tests to pass in the master branch but fail in the k8s branch.

One way to isolate the issue is to separate db:reset and parallel_test into different commands or steps on the k8s branch, similar to how they are on the master branch. This would help us understand if the issue is related to them running in the same environment or if it's something else.

holdenhinkle commented 1 year ago

I'm going to work at updating the steps for running the tests in the test job in the k8s branch when I get back.

holdenhinkle commented 1 year ago

Draft PR - https://github.com/department-of-veterans-affairs/vets-api/pull/12898

holdenhinkle commented 1 year ago

New Draft PR (I forgot to branch off of the k8s branch in the last PR) - https://github.com/department-of-veterans-affairs/vets-api/pull/12899

holdenhinkle commented 1 year ago

I broke that single step into two steps, which better matches the steps in master and follows the 'each step should do one thing' rule.

There's 1 test failure now:

Finished in 4 minutes 57.2 seconds (files took 25.14 seconds to load)
2992 examples, 1 failure, 7 pending

Failed examples:

rspec './modules/appeals_api/spec/services/appeals_api/pdf_construction/generator_spec.rb[1:1:3:2:4:1]' # AppealsApi::PdfConstruction::Generator#generate Higher Level Review v3 with max length generates the expected pdf

Randomized with seed 11386

Stopped processing SimpleCov as a previous error not related to SimpleCov has been detected
Coverage report generated for (1/8), (2/8), (3/8), (4/8), (5/8), (6/8), (7/8), (8/8) to /app/coverage. 63778 / 66370 LOC (96.09%) covered.
Tests Failed

15421 examples, 1 failure, 25 pendings

Here are some snippets from failure logs:

       Diff:

       ┌ (Key) ──────────────────────────┐
       │ ‹-› in expected, not in actual  │
       │ ‹+› in actual, not in expected  │
       │ ‹ › in both expected and actual │
       └─────────────────────────────────┘

       -                                                                            WWWWWWWWWWWWWWWWW\n
       +                                                                            WWWWWWWWWWWWWWWWW W W W W W W W W W W W\n

Stack trace:

     Shared Example Group: "shared HLR v2 and v3 generator examples" called from ./modules/appeals_api/spec/services/appeals_api/pdf_construction/generator_spec.rb:228
     # /usr/local/bundle/cache/ruby/3.2.0/gems/super_diff-0.10.0/lib/super_diff/rspec/monkey_patches.rb:43:in `handle_failure'
     # ./modules/appeals_api/spec/services/appeals_api/pdf_construction/generator_spec.rb:183:in `block (6 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/webmock-3.18.1/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:124:in `block in run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `loop'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:110:in `run'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec_ext/rspec_ext.rb:12:in `run_with_retry'
     # /usr/local/bundle/cache/ruby/3.2.0/gems/rspec-retry-0.6.2/lib/rspec/retry.rb:37:in `block (2 levels) in setup'

Finished in 4 minutes 57.2 seconds (files took 25.14 seconds to load)
2992 examples, 1 failure, 7 pending

I looked through lots of the code for generating PDFs and didn't see any differences between the branches.

Then I noticed that the pdftk package is included in Dockerfile in the master branch but not in the k8s branch. pdftk was removed from Debian and Ubuntu repositories due to dependency issues. So we can't simply add it back. However, there is an alternative to pdftk called pdftk-java, which is a reimplementation of pdftk in Java, which can be installed directly from the Debian repository. It should work as a drop-in replacement for pdftk. Trying that now - https://github.com/department-of-veterans-affairs/vets-api/pull/12899/commits/22ff8d41a8709b3f80444eb1dc8c326cff8576c1

holdenhinkle commented 1 year ago

Here's an issue I found about replacing pdftk - https://github.com/department-of-veterans-affairs/va.gov-team/issues/3032

holdenhinkle commented 1 year ago

Nope - that didn't work - https://github.com/department-of-veterans-affairs/vets-api/actions/runs/5195409462 That test is still failing.

holdenhinkle commented 1 year ago

It appears that the issue is with the placeholder (the "W"s) for the zip code field. I'll continue this tomorrow.

holdenhinkle commented 1 year ago

Thank you @ericboehs - https://github.com/department-of-veterans-affairs/vets-api/pull/13049

department-of-veterans-affairs / va.gov-team

Failing K8s tests #58410

Description