bloomberg / memray

Memray is a memory profiler for Python
https://bloomberg.github.io/memray/
Apache License 2.0
13.17k stars 392 forks source link

snapshot? tests are faillings #654

Closed Sigmanificient closed 1 month ago

Sigmanificient commented 2 months ago

Hi, nixpkgs (nixos package repository) contributer here. I am currently working on updating the memray package, as requested by on of our user:

However, in this process, I've encountered some failling tests:

===== 13 failed, 508 passed, 8 skipped, 4 deselected, 4 warnings in 54.41s =====
Executing pytestCheckPhase
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-8.2.2, pluggy-1.5.0
rootdir: /build/source
configfile: pyproject.toml
plugins: textual-snapshot-0.4.0, syrupy-4.6.1
collected 533 items / 4 deselected / 529 selected                              

tests/integration/test_api.py .......                                    [  1%]
tests/integration/test_attach.py ssssssss                                [  2%]
tests/integration/test_extensions.py .........                           [  4%]
tests/integration/test_greenlet.py ..                                    [  4%]
tests/integration/test_ipython.py ..........                             [  6%]
tests/integration/test_native_tracking.py ...........                    [  8%]
tests/integration/test_processes.py ....                                 [  9%]
tests/integration/test_socket.py ...........                             [ 11%]
tests/integration/test_threads.py ...                                    [ 12%]
tests/integration/test_tracing.py ...........................            [ 17%]
tests/integration/test_tracking.py ..................................... [ 24%]
........................................................................ [ 37%]
........                                                                 [ 39%]
tests/test_utils.py .......                                              [ 40%]
tests/unit/test_allocation_lifetime_aggregator.py ...................    [ 44%]
tests/unit/test_attach.py .                                              [ 44%]
tests/unit/test_cli.py ................................................. [ 53%]
..........................                                               [ 58%]
tests/unit/test_flamegraph_reporter.py ................................. [ 65%]
...                                                                      [ 65%]
tests/unit/test_frame_tools.py .........................                 [ 70%]
tests/unit/test_high_water_mark_aggregator.py .......................... [ 75%]
......                                                                   [ 76%]
tests/unit/test_highwatermark_command.py ..............                  [ 79%]
tests/unit/test_reader.py .....                                          [ 79%]
tests/unit/test_stats_reporter.py ..........                             [ 81%]
tests/unit/test_summary_reporter.py .....                                [ 82%]
tests/unit/test_table_reporter.py .....                                  [ 83%]
tests/unit/test_templates.py ......                                      [ 84%]
tests/unit/test_tracker.py ..                                            [ 85%]
tests/unit/test_transform_reporter.py ...........                        [ 87%]
tests/unit/test_tree_reporter.py ........................FFFFF.F.        [ 93%]
tests/unit/test_tui_reporter.py ......................FFFF...F...FF      [100%]

=================================== FAILURES ===================================

I noticed the Update textual-snapshot files to new format commit, applied it on top of the 1.13.4 release source, but it also fails:

=========== 21 failed, 500 passed, 8 skipped, 4 deselected in 57.72s ===========
Executing pytestCheckPhase
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-8.2.2, pluggy-1.5.0
rootdir: /build/source
configfile: pyproject.toml
plugins: textual-snapshot-0.4.0, syrupy-4.6.1, cov-5.0.0
collected 533 items / 4 deselected / 529 selected                              

tests/integration/test_api.py .......                                    [  1%]
tests/integration/test_attach.py ssssssss                                [  2%]
tests/integration/test_extensions.py .........                           [  4%]
tests/integration/test_greenlet.py ..                                    [  4%]
tests/integration/test_ipython.py ..........                             [  6%]
tests/integration/test_native_tracking.py ...........                    [  8%]
tests/integration/test_processes.py ....                                 [  9%]
tests/integration/test_socket.py ...........                             [ 11%]
tests/integration/test_threads.py ...                                    [ 12%]
tests/integration/test_tracing.py ...........................            [ 17%]
tests/integration/test_tracking.py ..................................... [ 24%]
........................................................................ [ 37%]
........                                                                 [ 39%]
tests/test_utils.py .......                                              [ 40%]
tests/unit/test_allocation_lifetime_aggregator.py ...................    [ 44%]
tests/unit/test_attach.py .                                              [ 44%]
tests/unit/test_cli.py ................................................. [ 53%]
..........................                                               [ 58%]
tests/unit/test_flamegraph_reporter.py ................................. [ 65%]
...                                                                      [ 65%]
tests/unit/test_frame_tools.py .........................                 [ 70%]
tests/unit/test_high_water_mark_aggregator.py .......................... [ 75%]
......                                                                   [ 76%]
tests/unit/test_highwatermark_command.py ..............                  [ 79%]
tests/unit/test_reader.py .....                                          [ 79%]
tests/unit/test_stats_reporter.py ..........                             [ 81%]
tests/unit/test_summary_reporter.py .....                                [ 82%]
tests/unit/test_table_reporter.py .....                                  [ 83%]
tests/unit/test_templates.py ......                                      [ 84%]
tests/unit/test_tracker.py ..                                            [ 85%]
tests/unit/test_transform_reporter.py ...........                        [ 87%]
tests/unit/test_tree_reporter.py .....................FFFFFFFFFFF        [ 93%]
tests/unit/test_tui_reporter.py ......................FFFFFFFF...FF      [100%]

=================================== FAILURES ===================================

I am not sure what this means for memray usability (are those ttest critical?), so I am opening this issue to get more insight

godlygeek commented 2 months ago

These tests are fragile, and it's probably nothing to worry about, but it's hard to say anything for certain with only the output you've pasted. It's hard to even be sure that what failed is only the snapshot tests without seeing the "snapshot report summary" section of the pytest output, though we do have 21 snapshot tests and they are all in those two files, which makes it pretty likely that it is only the snapshot tests that failed.

What version of Textual are you using?

Here's what I can tell you without seeing the snapshot report:

  1. Memray 1.13.4 is believed to work properly with any version of Textual >= 0.41.0, but the snapshot tests will certainly not pass with Textual < 0.73; the snapshots frequently need updating due to changes in how Textual and pytest-textual-snapshot render the snapshots, rather than any true regression.
  2. The snapshot tests that exist in the v1.13.4 tag should pass with pytest-textual-snapshot < 1 but are expected to fail with pytest-textual-snapshot >= 1. f6e4039a80ac57847956218bea6934c88208c1f7 flips that situation: after applying that commit, the snapshot tests should pass with pytest-textual-snapshot >= 1, but fail with older versions. I see textual-snapshot-0.4.0 in your pytest output, so applying f6e4039a80ac57847956218bea6934c88208c1f7 is definitely not helpful to you.
  3. When a snapshot test fails, it pops out an HTML report that you can view to check what the differences are. Investigating that would help us tell if this is anything to worry about.
  4. If you're testing with Textual < 0.73, running pytest with --snapshot-update --snapshot-warn-unused (setting them in the PYTEST_ADDOPTS environment variable might be easiest) will cause the snapshots themselves to be ignored, but will still run the snapshot tests to make sure that they don't crash or deadlock or anything bad like that. It just skips the diff step at the end, and succeeds regardless of any differences (updating the snapshot files on disk to new versions).
Sigmanificient commented 2 months ago

Wow thanks for the detailed explanations! Using textual 0.72.0 & pytest-textual-snapshot 0.4.0, so the patch should definitively be removed.

Nixpkgs have some restriction, so locking textual and pytest-textual-snapshot might be a bit complex. From my understanding, these tests can be turned off on our side without causing too much problem

godlygeek commented 2 months ago

Yeah, if you're stuck with Textual 0.72.0, I'd just run the tests with --snapshot-update --snapshot-warn-unused and confirm that they pass that way. It won't actually be comparing the snapshots to make sure they match (since they won't), but it's good enough to prove there isn't something serious wrong, and pytest should exit with a good rcode.

godlygeek commented 2 months ago

When you run the test suite, how do you invoke it? We're wondering if there's a way to change the default behavior for you so that snapshot comparisons aren't run, while leaving them on for maintainers and CI.

Sigmanificient commented 2 months ago

We can disable test based on there name and path like so:

  disabledTests = [
    # Import issue
    "test_header_allocator"
    "test_hybrid_stack_of_allocations_inside_ceval"

    # snapshot-based tests are too fragile
    # see https://github.com/bloomberg/memray/issues/654
    "TestTUILooks"
    "test_tui_basic"
    "test_tui_pause"
    "test_tui_gradient"
    "test_merge_threads"
    "test_unmerge_threads"
  ];