pmeier / pytest-results-action

Summarize `pytest` test results in GitHub Actions
BSD 3-Clause "New" or "Revised" License

Feature Request: Truncate the summary to 1024k #9

Open mthrok opened 1 year ago

mthrok commented 1 year ago

Thanks for the work. We started using this in torchaudio, but we are running into an issue where GITHUB_STEP_SUMMARY exceeds 1024k. An option to truncate the summary or break it down would be nice.

Example: https://github.com/pytorch/audio/actions/runs/5149682682

https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#step-isolation-and-limits

Job summaries are isolated between steps and each step is restricted to a maximum size of 1MiB. Isolation is enforced between steps so that potentially malformed Markdown from a single step cannot break Markdown rendering for subsequent steps. If more than 1MiB of content is added for a step, then the upload for the step will fail and an error annotation will be created. Upload failures for job summaries do not affect the overall status of a step or a job. A maximum of 20 job summaries from steps are displayed per job.

pmeier commented 1 year ago

Thanks for the report. We hit the same limit in TorchVision before. It usually happens when a change breaks CI so badly that a 1MB log is not sufficient.

pmeier commented 1 year ago

Looking at the logs, you "only" have 723 failures here. However, the traceback for a single failure looks like this:

________________________ TestInfo.test_vorbis_8000_2_0 _________________________

  a = (<torchaudio_unittest.backend.dispatcher.ffmpeg.info_test.TestInfo testMethod=test_vorbis_8000_2_0>,)
  kw = {}

      @wraps(func)
      def standalone_func(*a, **kw):
  >       return func(*(a + p.args), **p.kwargs, **kw)

  ../env/lib/python3.10/site-packages/parameterized/parameterized.py:620: 
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
  torchaudio_unittest/backend/dispatcher/ffmpeg/info_test.py:175: in test_vorbis
      sox_utils.gen_audio_file(
  torchaudio_unittest/common_utils/sox_utils.py:81: in gen_audio_file
      subprocess.run(command, check=True)
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

  input = None, capture_output = False, timeout = None, check = True
  popenargs = (['sox', '-V3', '--no-dither', '-R', '--rate', '8000', ...],)
  kwargs = {}
  process = <Popen: returncode: 2 args: ['sox', '-V3', '--no-dither', '-R', '--rate', '8...>
  stdout = None, stderr = None, retcode = 2

      def run(*popenargs,
              input=None, capture_output=False, timeout=None, check=False, **kwargs):
          """Run command with arguments and return a CompletedProcess instance.

          The returned instance will have attributes args, returncode, stdout and
          stderr. By default, stdout and stderr are not captured, and those attributes
          will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them,
          or pass capture_output=True to capture both.

          If check is True and the exit code was non-zero, it raises a
          CalledProcessError. The CalledProcessError object will have the return code
          in the returncode attribute, and output & stderr attributes if those streams
          were captured.

          If timeout is given, and the process takes too long, a TimeoutExpired
          exception will be raised.

          There is an optional argument "input", allowing you to
          pass bytes or a string to the subprocess's stdin.  If you use this argument
          you may not also use the Popen constructor's "stdin" argument, as
          it will be used internally.

          By default, all communication is in bytes, and therefore any "input" should
          be bytes, and the stdout and stderr will be bytes. If in text mode, any
          "input" should be a string, and stdout and stderr will be strings decoded
          according to locale encoding, or by "encoding" if set. Text mode is
          triggered by setting any of text, encoding, errors or universal_newlines.

          The other arguments are the same as for the Popen constructor.
          """
          if input is not None:
              if kwargs.get('stdin') is not None:
                  raise ValueError('stdin and input arguments may not both be used.')
              kwargs['stdin'] = PIPE

          if capture_output:
              if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
                  raise ValueError('stdout and stderr arguments may not be used '
                                   'with capture_output.')
              kwargs['stdout'] = PIPE
              kwargs['stderr'] = PIPE

          with Popen(*popenargs, **kwargs) as process:
              try:
                  stdout, stderr = process.communicate(input, timeout=timeout)
              except TimeoutExpired as exc:
                  process.kill()
                  if _mswindows:
                      # Windows accumulates the output in a single blocking
                      # read() call run on child threads, with the timeout
                      # being done in a join() on those threads.  communicate()
                      # _after_ kill() is required to collect that and add it
                      # to the exception.
                      exc.stdout, exc.stderr = process.communicate()
                  else:
                      # POSIX _communicate already populated the output so
                      # far into the TimeoutExpired exception.
                      process.wait()
                  raise
              except:  # Including KeyboardInterrupt, communicate handled that.
                  process.kill()
                  # We don't call process.wait() as .__exit__ does that for us.
                  raise
              retcode = process.poll()
              if check and retcode:
  >               raise CalledProcessError(retcode, process.args,
                                           output=stdout, stderr=stderr)
  E               subprocess.CalledProcessError: Command '['sox', '-V3', '--no-dither', '-R', '--rate', '8000', '--null', '--channels', '2', '--compression', '0', '/tmp/tmpa7tf2_um/torchaudio_unittest.backend.dispatcher.ffmpeg.info_test.TestInfo.test_vorbis_8000_2_0/data.vorbis', 'synth', '1', 'sawtooth', '1']' returned non-zero exit status 2.

  ../env/lib/python3.10/subprocess.py:526: CalledProcessError
  ----------------------------- Captured stderr call -----------------------------
  sox -V3 --no-dither -R --rate 8000 --null --channels 2 --compression 0 /tmp/tmpa7tf2_um/torchaudio_unittest.backend.dispatcher.ffmpeg.info_test.TestInfo.test_vorbis_8000_2_0/data.vorbis synth 1 sawtooth 1
  sox:      SoX v14.4.2

  Input File     : '' (null)
  Channels       : 1
  Sample Rate    : 8000
  Precision      : 32-bit

  sox FAIL formats: no handler for file extension `vorbis'

Most of that is just noise and results from the (IMHO terrible) default traceback formatting of pytest.

You currently don't seem to use a pytest.ini configuration file. I would advise using something like the ones in TorchVision or Torch. The important line here is --tb=native, although --tb=short might also work as a middle ground.
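To put the numbers in this thread side by side, here is a rough back-of-the-envelope calculation of how much room each failure gets. It only uses the 1MiB per-step limit from the GitHub docs quoted above and the 723 failures mentioned in this comment. The helper name `bytes_per_failure` is made up for illustration, and the calculation ignores any Markdown overhead the action adds around each traceback.

```python
# Average space available per failing test before the step summary
# exceeds the limit. Both constants come from this thread.
STEP_SUMMARY_LIMIT_BYTES = 1024 * 1024  # 1MiB per step, per the GitHub docs
NUM_FAILURES = 723  # failure count reported for the torchaudio run


def bytes_per_failure(limit_bytes: int, num_failures: int) -> int:
    """Return the average number of bytes available per failure (hypothetical helper)."""
    if num_failures <= 0:
        raise ValueError("num_failures must be positive")
    return limit_bytes // num_failures


budget = bytes_per_failure(STEP_SUMMARY_LIMIT_BYTES, NUM_FAILURES)
print(budget)  # 1450
```

Roughly 1450 bytes per failure is far less than the default-format traceback shown above, which embeds the full source and docstring of subprocess.run, so a shorter traceback style is what brings the summary back under the limit.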

pmeier commented 1 year ago

I've opened pytorch/audio#3401 to showcase the shorter traceback.

pmeier commented 1 year ago

Just so we are clear, being able to truncate the logs to avoid GHA "crashing" is still a good feature request and I'll work on adding it.
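For illustration only, the requested truncation could look roughly like the sketch below. This is not the action's implementation. The function name `truncate_summary`, the notice wording, and the assumption that the 1MiB limit is measured in UTF-8 bytes are all hypothetical.

```python
LIMIT_BYTES = 1024 * 1024  # 1MiB step limit, per the GitHub docs quoted in this thread
# Hypothetical wording for the marker appended to a truncated summary.
NOTICE = "\n\n(summary truncated to fit the job summary size limit)\n"


def truncate_summary(text: str, limit_bytes: int = LIMIT_BYTES, notice: str = NOTICE) -> str:
    """Return text unchanged if it fits, else a truncated copy plus a notice.

    Sizes are measured in UTF-8 bytes (an assumption about how the limit is applied).
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= limit_bytes:
        return text
    notice_bytes = notice.encode("utf-8")
    budget = limit_bytes - len(notice_bytes)
    if budget < 0:
        raise ValueError("limit_bytes is smaller than the truncation notice")
    # errors="ignore" drops a multi-byte character that was cut in half.
    head = encoded[:budget].decode("utf-8", errors="ignore")
    return head + notice
```

A plain byte cut like this can land inside a code fence or other markup and leave it unbalanced, so a real implementation would probably drop whole test entries rather than cutting mid-traceback.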