apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

Remove nondeterminism from expansion tests #33082

Closed damccorm closed 2 weeks ago

damccorm commented 2 weeks ago

These tests depend on https://github.com/apache/beam/blob/47740d07ed9e0824b4975ca4cadece74aae38302/sdks/python/apache_beam/transforms/external.py#L1007, which can lead to errors like:

    def test_classpath(self):
      with tempfile.TemporaryDirectory() as temp_dir:
        try:
          # Avoid having to prefix everything in our test strings.
          oldwd = os.getcwd()
          os.chdir(temp_dir)
          # Touch some files for globbing.
          with open('a1.jar', 'w') as _:
            pass

          service = JavaJarExpansionService(
              'main.jar', classpath=['a*.jar', 'b.jar'])
>         self.assertEqual(
              service._default_args(),
              ['{{PORT}}', '--filesToStage=main.jar,a1.jar,b.jar'])
E             AssertionError: Lists differ: ['{{P[13 chars]lesToStage=main.jar,a1.jar,b.jar', '--alsoStartLoopbackWorker'] != ['{{P[13 chars]lesToStage=main.jar,a1.jar,b.jar']
E             
E             First list contains 1 additional elements.
E             First extra element 2:
E             '--alsoStartLoopbackWorker'
E             
E             - ['{{PORT}}',
E             -  '--filesToStage=main.jar,a1.jar,b.jar',
E             ?                                        ^
E             
E             + ['{{PORT}}', '--filesToStage=main.jar,a1.jar,b.jar']
E             ? ++++++++++++                                       ^
E             
E             -  '--alsoStartLoopbackWorker']

and overall flakiness like https://github.com/apache/beam/actions/workflows/beam_PreCommit_Python_Coverage.yml?query=event%3Aschedule

This removes the nondeterminism by clearing the cache.

damccorm commented 2 weeks ago

R: @ahmedabu98 @jrmccluskey

github-actions[bot] commented 2 weeks ago

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment "assign set of reviewers".