swiftlang / swift-corelibs-xctest

The XCTest Project, A Swift core library for providing unit test support

Running test suites under linux with asan shows a memory leak in xctest #342

Closed hassila closed 2 years ago

hassila commented 2 years ago

With the Swift 5.6 toolchain and related artefacts (reproduced on Ubuntu), running

swift test --sanitize address | swift demangle

gives the following from XCTest:

ubuntu@swift:/home/xyzzy/swift-data-model$ swift test --sanitize address | swift demangle
Compiling plugin Swift-DocC...
Compiling plugin Swift-DocC Preview...
Building for debugging...
[1/3] Emitting module DataModelTests
[2/3] Compiling DataModelTests DataModelTests.swift
[5/8] /home/xyzzy/swift-data-model/.build/aarch64-unknown-linux-gnu/debug/swift-data-modelPackageTests.derived/runner.swift
[6/8] Wrapping AST for DataModelTests for debugging
[7/11] Compiling swift_data_modelPackageTests runner.swift
[8/11] Compiling swift_data_modelPackageTests DataModelTests.swift
[9/11] Emitting module swift_data_modelPackageTests
[12/13] Wrapping AST for swift-data-modelPackageTests for debugging
/usr/bin/ld.gold: warning: Cannot export local symbol '__asan_extra_spill_area'
[13/13] Linking swift-data-modelPackageTests.xctest
Build complete! (3.13s)
Test Suite 'All tests' started at 2022-04-27 18:31:07.420
Test Suite 'debug.xctest' started at 2022-04-27 18:31:07.423
Test Suite 'DataModelTests' started at 2022-04-27 18:31:07.423
Test Case 'DataModelTests.testThatDataModelFailsWriting' started at 2022-04-27 18:31:07.423
Test Case 'DataModelTests.testThatDataModelFailsWriting' passed (0.153 seconds)
Test Suite 'DataModelTests' passed at 2022-04-27 18:31:07.577
     Executed 1 test, with 0 failures (0 unexpected) in 0.153 (0.153) seconds
Test Suite 'debug.xctest' passed at 2022-04-27 18:31:07.577
     Executed 1 test, with 0 failures (0 unexpected) in 0.153 (0.153) seconds
Test Suite 'All tests' passed at 2022-04-27 18:31:07.577
     Executed 1 test, with 0 failures (0 unexpected) in 0.153 (0.153) seconds

=================================================================
==6874==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0xaaaad334683c in malloc /home/build-user/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3
    #1 0xffff9861db58 in operator new(unsigned long) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x9fb58)
    #2 0xffff997ce258 in swift::RefCounts<swift::RefCountBitsT<(swift::RefCountInlinedness)1> >::formWeakReference() (/usr/lib/swift/linux/libswiftCore.so+0x3ea258)
    #3 0xffff9979e938 in swift_weakAssign (/usr/lib/swift/linux/libswiftCore.so+0x3ba938)
    #4 0xffff98951168 in $s6XCTest9XCTWaiterC4wait3for7timeout12enforceOrder4file4lineAC6ResultOSayAA0A11ExpectationCG_SdSbs12StaticStringVSitF (/usr/lib/swift/linux/libXCTest.so+0x3f168)
    #5 0xffff989514a8 in $s6XCTest9XCTWaiterC4wait3for7timeout12enforceOrder4file4lineAC6ResultOSayAA0A11ExpectationCG_SdSbs12StaticStringVSitFZ (/usr/lib/swift/linux/libXCTest.so+0x3f4a8)
    #6 0xffff9893fda8 in $s6XCTest21awaitUsingExpectationyyyyYaKcKF (/usr/lib/swift/linux/libXCTest.so+0x2dda8)
    #7 0xffff9893f37c in $s6XCTest0A4CaseC10invokeTestyyF (/usr/lib/swift/linux/libXCTest.so+0x2d37c)
    #8 0xffff9893f1b0 in $s6XCTest0A4CaseC7performyyAA0A3RunCF (/usr/lib/swift/linux/libXCTest.so+0x2d1b0)
    #9 0xffff989436d4 in $s6XCTestAAC3runyyF (/usr/lib/swift/linux/libXCTest.so+0x316d4)
    #10 0xffff9894194c in $s6XCTest0A5SuiteC7performyyAA0A3RunCF (/usr/lib/swift/linux/libXCTest.so+0x2f94c)
    #11 0xffff989436d4 in $s6XCTestAAC3runyyF (/usr/lib/swift/linux/libXCTest.so+0x316d4)
    #12 0xffff9894194c in $s6XCTest0A5SuiteC7performyyAA0A3RunCF (/usr/lib/swift/linux/libXCTest.so+0x2f94c)
    #13 0xffff989436d4 in $s6XCTestAAC3runyyF (/usr/lib/swift/linux/libXCTest.so+0x316d4)
    #14 0xffff9894194c in $s6XCTest0A5SuiteC7performyyAA0A3RunCF (/usr/lib/swift/linux/libXCTest.so+0x2f94c)
    #15 0xffff989436d4 in $s6XCTestAAC3runyyF (/usr/lib/swift/linux/libXCTest.so+0x316d4)
    #16 0xffff9893dee0 in $s6XCTest7XCTMain_9arguments9observerss5NeverOSayAA0A4CaseCm04testF5Class_SaySS_yAHKctG8allTeststG_SaySSGSayAA0A11Observation_pGtF (/usr/lib/swift/linux/libXCTest.so+0x2bee0)
    #17 0xffff9893da58 in $s6XCTest7XCTMainys5NeverOSayAA0A4CaseCm04testD5Class_SaySS_yAFKctG8allTeststGF (/usr/lib/swift/linux/libXCTest.so+0x2ba58)
    #18 0xaaaad36ef024 in $s28swift_data_modelPackageTests6RunnerV4mainyyFZ /home/xyzzy/swift-data-model/.build/aarch64-unknown-linux-gnu/debug/swift-data-modelPackageTests.derived/runner.swift:10:9
    #19 0xaaaad36ef098 in $s28swift_data_modelPackageTests6RunnerV5$mainyyFZ /home/xyzzy/swift-data-model/.build/aarch64-unknown-linux-gnu/debug/swift-data-modelPackageTests.derived/runner.swift:3:1
    #20 0xaaaad36ef0b0 in main /home/xyzzy/swift-data-model/.build/aarch64-unknown-linux-gnu/debug/swift-data-modelPackageTests.derived/runner.swift
    #21 0xffff98783d4c in __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x20d4c)
    #22 0xaaaad32d3670 in _start (/home/xyzzy/swift-data-model/.build/aarch64-unknown-linux-gnu/debug/swift-data-modelPackageTests.xctest+0x91670)

SUMMARY: AddressSanitizer: 32 byte(s) leaked in 1 allocation(s).
ubuntu@swift:/home/xyzzy/swift-data-model$ 

The test is just empty (after having stripped it out piece by piece to nail down the leaker):

import XCTest

@testable import DataModel
@testable import DataModelExecutable

final class DataModelTests: XCTestCase {

    override func setUp() {
        super.setUp()
    }

    override func tearDown() {
        super.tearDown()
    }

    func testThatDataModelFailsWriting() async throws {
    }
}

On macOS, with the XCTest that ships with Xcode 13.3, I don't get any leak.

hassila commented 2 years ago

See also https://github.com/apple/swift/issues/58482 for a reproducer.

grynspan commented 2 years ago

Based on the stack trace, this looks to me like it's probably a leak in the Swift runtime. @mikeash @al45tair thoughts?

mikeash commented 2 years ago

Looks like this is a side table being leaked. I’d guess this is a false positive due to the leak sanitizer not recognizing whatever weird bit stuffing we’re doing with the pointer to the side table.
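For readers unfamiliar with side tables, here is a minimal sketch of the kind of code that makes the runtime allocate one. The `swift_weakAssign` and `formWeakReference` frames in the trace above suggest the 32-byte allocation was made when a weak reference was formed. The `Node` and `Holder` types are hypothetical and exist only for illustration; nothing here is taken from XCTest's source.

```swift
// Hypothetical types, for illustration only.
final class Node {}

final class Holder {
    // Storing into a weak property goes through the runtime's weak
    // reference entry points (the trace above shows swift_weakAssign).
    weak var target: Node?
}

let node = Node()
let holder = Holder()

// Forming the first weak reference to `node` is the point at which the
// runtime allocates a side table for it (formWeakReference in the trace).
holder.target = node

// The weak reference still resolves while `node` is strongly held.
print(holder.target === node)
```

The side table stays reachable through the object's reference-count field, but that field does not hold a plain pointer, which is the point the comment above makes.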

al45tair commented 2 years ago

Yes, LSan works by scanning memory, so it isn't going to find pointers where we've stuffed extra bits of data into them. The reason this works on macOS is that the tools there do understand Swift's internal data structures and know what to do to turn bit patterns into pointers there.
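A small sketch of why a memory-scanning tool can miss such a pointer. The encoding below (shift right by three, set the top bit) is a hypothetical one chosen for illustration and is not claimed to be the Swift runtime's actual layout; `pointsInto` is likewise a made-up stand-in for a conservative scanner's check, and a 64-bit platform is assumed.

```swift
// Allocate a 32-byte block and view its address as an integer.
let blockSize = 32
let block = UnsafeMutableRawPointer.allocate(byteCount: blockSize, alignment: 16)
let address = UInt(bitPattern: block)

// Hypothetical encoding (not the runtime's actual layout): drop the three
// low bits, which are zero because of alignment, and set a flag in the top bit.
let flagBit: UInt = 1 << (UInt.bitWidth - 1)
let stuffed = (address >> 3) | flagBit

// What a conservative scanner effectively asks of every word it reads:
// "does this value point into a known live allocation?"
func pointsInto(_ word: UInt, base: UInt, size: Int) -> Bool {
    return word >= base && word < base + UInt(size)
}

let plainFound = pointsInto(address, base: address, size: blockSize)
let stuffedFound = pointsInto(stuffed, base: address, size: blockSize)

// Code that knows the encoding can still recover the real address.
let decoded = (stuffed & ~flagBit) << 3

print(plainFound, stuffedFound, decoded == address)
block.deallocate()
```

On a typical 64-bit system the plain address is recognised, the stuffed word is not, and decoding restores the original address. A scanner that only ever sees the stuffed word would conclude the block is unreachable and report it as leaked.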

hassila commented 2 years ago

OK, thanks, but where do those differences come from? I thought the toolchains on both platforms share the LLVM sources, including the TSan support? Just trying to understand; it'd be great to have TSan/ASan working on Linux too, for Swift on the server in heterogeneous environments.

hassila commented 2 years ago

Perhaps this is then not an XCTest bug but one for the Swift project?

al45tair commented 2 years ago

Sorry, perhaps I wasn't clear. On macOS, people tend to use leaks rather than LSan to find memory leaks. leaks knows about all kinds of macOS-specific things, including the weird bit packing we do in the Swift and ObjC language runtimes. I don't know whether anyone has made any attempt to teach LSan any of that; if they have, it may just be that it's only turned on when running on macOS.

Whether that's a Swift bug per se, I'm not so sure (LSan belongs to clang). It isn't an XCTest bug though.

hassila commented 2 years ago

But LSan has been suggested as the go-to tool for Swift on Linux, e.g.

https://forums.swift.org/t/test-for-memory-leaks-in-ci/36526/5

https://github.com/swift-server/guides/blob/main/docs/memory-leaks-and-usage.md

https://github.com/apple/swift-issues/issues/6848

So it seems expected from different Apple engineers that it should work on Linux?

Cc: @gottesmm @weissi @lukasa

al45tair commented 2 years ago

It's the right tool, certainly. If it doesn't presently work for Swift, we should make it work, IMO. The thing to do is to file a bug report against either Swift or Clang (or both); the right people will see it and sort out who does what.

weissi commented 2 years ago

@hassila valgrind/heaptrack/LSan are the best tools we have on non-Darwin platforms today. Unfortunately, the leak checking isn't reliable and cannot easily be made to be reliable. The problem is what @mikeash referred to as 'bit stuffing' but even constructions like

class A {}
class B {}

enum Something {
    case a(A)
    case b(B)
}

which are absolutely ubiquitous in Swift may cause trouble. Why? Because Swift is smart enough to use the spare bits in pointers for things like enum tags. In other words, it uses 'bit stuffing' without the programmer doing anything particularly smart:

  8> MemoryLayout<Something>.size
$R0: Int = 8

These constructions already throw off tools that rely on scanning the heap for live pointers. Sure, they can also mask away the unused bits, but sometimes the unused bits being 0 is relied on to even recognise pointers.
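The REPL result above can be reproduced as a standalone sketch. `WithInt` is a hypothetical comparison type added here: its payloads are plain integers with no spare bits, so the compiler needs extra storage for the case tag. The sizes assume a 64-bit platform such as the one in this thread.

```swift
final class A {}
final class B {}

// Both payloads are class references, so the case tag can live in the
// spare bits of the pointer itself.
enum Something {
    case a(A)
    case b(B)
}

// Hypothetical comparison type: Int uses every bit pattern, so the tag
// needs storage of its own.
enum WithInt {
    case a(Int)
    case b(Int)
}

let pointerSize = MemoryLayout<UnsafeRawPointer>.size
let somethingSize = MemoryLayout<Something>.size
let withIntSize = MemoryLayout<WithInt>.size

print(pointerSize, somethingSize, withIntSize)
```

`Something` takes no more room than a bare pointer, which means the word stored in memory for one of its cases may differ from the object's real address.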

Where it most prominently fails is the manual bit stuffing that's used by String and elsewhere.

So how does leaks/heap on macOS work? leaks has much more runtime & programming language information and really understands the heap (as opposed to the tools mentioned above which see the heap just as a sea of pointers/malloc blocks that they perform some reachability analysis on -- without any information about the language/runtime). That's why it can give you the correct type information for allocated/leaked blocks and its analyses are actually really good. Sadly, we currently don't have anything comparable on Linux.

Don't get me wrong, valgrind/heaptrack/LSan still provide value on Linux but they may raise some false positives. So they'll show genuine leaks but they'll also show some allocations as leaks that aren't actually leaked (still reachable but through some 'bit stuffed' pointers).

And definitely +1 on @al45tair, we should file bugs when we see these tools misreporting. These issues can be improved; it might not necessarily be easy, but it's definitely possible.

hassila commented 2 years ago

Thanks for all the replies, but the fact that it doesn't signal a leak on macOS would seem to suggest that LSan is aware of the bit stuffing there? Perhaps it's platform specific, as suggested.

So I will file issues on Swift/Clang then as a next step, thanks.

al45tair commented 2 years ago

it seems that the fact that it doesn't signal a leak on macOS

Does it not? If so, that's interesting and might mean the fix is more straightforward.

hassila commented 2 years ago

it seems that the fact that it doesn't signal a leak on macOS

Does it not? If so, that's interesting and might mean the fix is more straightforward.

No, it doesn't. We've set up CI on both platforms and we only get the (false) positives on Linux. Granted, we don't have a huge data set, but it seems to be the case with reasonable certainty.