Closed simme closed 2 years ago
@simme Thanks for the report!
If I understand correctly, one of your sets (setA
, the one that's passed to intersect
as an argument in the crashing case) has somehow become corrupted before the intersection
call: the hash table contained an index with the value 1389, while there were only 1388 elements in the storage array.
The two leading candidates for explaining an off-by-one error like this are (1) a race condition where two threads are mutating the hash table at once, and they clobber each other's updates, or (2) a logic error in OrderedSet where a mutating method fails to correctly renumber hash table contents. OrderedSet
operations are thoroughly tested, but the code hasn't been used in production enough to rule out (2); at the same time, Swift 5 also makes it way too easy to accidentally make data races like (1).
It's difficult to tell from the report which of these two candidates is the real cause (if either). To help narrow this down, could you tell a little more about your code?
Tagged<Foo, UUID>
with UUID
coming from Foundation? (I'd like to rule out an invalid Hashable
implementation, or an item being mutated after insertion. Bad hashing can't lead to an out-of-bounds access, but knowing for certain whether hashing is implemented well will help narrow this down.)OrderedSet
variable across multiple threads if all accesses are read-only, or if all accesses are carefully synchronized.)intersection
? (Do you know which one?))If this turns out to be a data race, then running your app with ASan enabled will probably pinpoint the issue.
Otherwise defining the COLLECTIONS_INTERNAL_CHECKS
build condition may catch the bogus mutation implementation, at the cost of considerably slowing down operations. (To generate a deterministic reproducer, it may be useful to also disable hash seed randomization by setting the COLLECTIONS_DETERMINISTIC_HASHING
build condition and running the app with the SWIFT_DETERMINISTIC_HASHING
environment variable set to 1
.)
(0.0.7 has the same OrderedSet implementation as main -- there is no reason to spend effort on reproducing this there.)
Removing the 1.0 milestone. A quick audit uncovered no obvious issues within the package. If this turns out to be a package issue, then the fix can ship in a patch release as soon as we track it down!
_checkInvariants
as public
, and calling it from more places.Thanks for taking a look!
Tagged<AppModels.Product, Foundation.UUID>
. Tagged
conforms to Hashable
whenever it's RawValue
does, and the implementation is synthesized.keys
property of an OrderedDictionary
. The OrderedDictionary
or the type that wraps it is long lived within my code. It sees one big insertion of the 1300 items when they are loaded from the database. Then one single insertion that triggers the crashing code when I diff the new set with a copy of the previous version. The big insertion from the database is manually dispatched to the main queue. There are no removals. The IdentifiedArray
is initialized using this initializer which seems to call out to this initializer to create its internal OrderedDictionary
.Ran the app with Adress Sanitizer turned on (and "Detect use of stack after return"). Never done that before, but didn't see any warnings in the console or in Xcode.
Tried running with the build conditions. Not sure if I did it right though.
Sorry if this is not of much help. Let me know if there's anything else I can do to attempt to pin-point the issue. I understand it's very likely that the cause is something I'm doing or any of the layers between me and the corrupted set.
This is good information, thanks! OrderedDictionary
goes around OrderedSet
's public API in some places; this gives some more leads for me to follow.
The build settings look good to me! The issue could be nondeterministic, so it's possible it will only occur in a small fraction of runs. (Disabling deterministic hashing may help reproduce it more often, especially if you are using the same data across runs -- it could well be that the hashes just happen to align well with the deterministic seed. That said, the number of possible storage permutations in a 1.3k set is immense, and it's technically possible that the pattern that triggers the logic error (if it's that) may never happen again.)
Ok, good that it was to some use! :)
The crash happens 100% of the time with my data. So in that sense it is deterministic.
I'll see if I can find some time to setup a reproduction case with my actual data. Kinda in crunch mode for a 1.0 release right now, so a little tied up though.
The crash happens 100% of the time with my data. So in that sense it is deterministic.
Huh; that makes it extremely likely this is a logic error in the package. If there's any way you can extract a reliable reproducer, then that would likely immediately show me the fix! Meanwhile, I'll investigate more.
(If you use an xcworkspace in your app project, one relatively easy way to test unreleased package changes is to clone #107 to a local repo and to add its folder to the workspace -- Xcode will pick up the local repo instead of using the regular dependency resolution algorithm.)
I have not forgotten about this. Just a little tied up at the moment, as I mentioned.
The instance of Collections that I'm using is a dependency, to a dependency, to a dependency at the moment. If I add my own dependency on Collections in my Package.json
, will that version be used by my dependencies as well? Not if it's pinned to another version, right?
That's perfectly understandable! I haven't seen other reports of this issue, so I'm treating it as a worrying problem that could escalate into something serious, rather than as an active emergency.
(Given that you can readily reproduce this, running the 1.0.0 tag with COLLECTIONS_INTERNAL_CHECKS enabled might be enough to pinpoint the operation that triggers the corruption. (v1.0 includes some improvements to internal checking.))
The instance of Collections that I'm using is a dependency, to a dependency, to a dependency at the moment. If I add my own dependency on Collections in my
Package.json
, will that version be used by my dependencies as well? Not if it's pinned to another version, right?
With Xcode's package support, the most straightforward way I know of overriding the regular dependency resolution process is to add a local clone of the package directly to the workspace (like Utils/swift-collections.xcworkspace
does for our benchmarks). AIUI this forces all (direct and indirect) dependencies to pick up the package from the linked folder rather than through the regular process. (Though it is possible @neonichu would correct me or suggest better alternatives.)
I have been trying to reproduce this in a smaller sample project to no avail. It is still 100% reproducible in my main project, even using version 1.0.1 of Collections. I'd be willing to send you the code for my entire app through some private means and walk you through how to trigger the issue if you think that'd help!
Hi folks, I’ve been seeing a similar issue via the IdentifiedArray type that wraps an OrderedDictionary
and the crash log looks really similar as well, with 21 of 23 lines here being the same: https://github.com/pointfreeco/swift-identified-collections/issues/26#issuecomment-943193186
Poking around in the stacktrace a bit and it seems like at some point _UnsafeHashTable._find(_:in:)
tries to subscript the elements at an _offset
of 31, while the ordered dictionary in question has 31 elements. Interestingly in my case, I can only reproduce this crash with this particular set of data. This is for an app where I’m adding a message to a list, and only this one particular list produces this crash; any time I add a message here it crashes, adding a message to any other thread does not. This is with an n of ~5, so presumably there are other crashing values too; my point here is for some reason only particular bits of data cause the crash.
I’m trying to make a small reproducible test case, which is made a bit difficult by the fact that pretty much the same code does not crash in another code path; i’m adding the reply to the list after it’s accepted by the server. When I relaunch the app and attempt to sync again, it sends back an array with the same exact value, and this time the code does not crash, which has me a bit befuddled. Adding a fresh reply after this initial merge however does cause the same crash, and so I’m not really sure what’s happening here.
I realise this probably isn’t very helpful without a reproducible sample but until I can have that, please do let me know if I can help in any other way.
@simme I'd be glad to help debug this -- you can DM me on the forums or Twitter to set things up.
@HarshilShah That sounds like hash seed randomization is making the issue nondeterministic, which is very much expected, if this is an issue with the hash table implementation. Setting SWIFT_DETERMINISTIC_HASHING=1 would likely help with the reduction effort!
Ah nice, I’ll try to enable that flag and see what happens!
In a small learning project I can reproduce this or a similar crash in a 100% reliable manner. I cloned the swift-collections repo and added it all into a xcworkspace as recommended by @lorentey.
Calling state.cards._dictionary._checkInvariants()
directly after state.cards.insert(card, at: 0)
succeeds. But the next SwiftUI view update crashes. Calling po self.cards._dictionary._checkInvariants()
immediately before the crash results in a precondition failure.
OrderedCollections/OrderedSet+Invariants.swift:28: Precondition failed: Index 0 not found in hash table (element: FA9A6EA9-7E4C-40B2-A8C0-4D3ACC5660E0)
2021-11-07 16:55:16.885554+0100 SwipePreview[13530:5205563] OrderedCollections/OrderedSet+Invariants.swift:28: Precondition failed: Index 0 not found in hash table (element: FA9A6EA9-7E4C-40B2-A8C0-4D3ACC5660E0)
error: Execution was interrupted, reason: EXC_BREAKPOINT (code=1, subcode=0x18f786a00).
After the insert there's 30 items. However in the view update the inserted item is not present anymore. There's only 29 items.
This turned out to be caused by a missing uniqueness check in a couple of OrderedSet
mutation operations, leading to a violation of value semantics.
Embarrassingly, this wasn't caught by the existing tests that check value semantics, because they only checked that iteration over old snapshots still provided consistent results, not that those snapshots were fully self-consistent. D'oh!
The fix is #123; this will ship in a new release very soon.
Big shoutout to @KaiOelfke for providing me with a reproducer!
That's great! Once the path is live I'll verify in my app that the issue has been fixed. Sorry for dropping the ball on this.
It looks like the PR link did not trigger -- manually marking this as resolved.
1.0.2 will be released later today; please verify & reopen if this still triggers there.
If have two
OrderedSet
s of what are essentially UUIDs (wrapped in Tagged). Finding theintersection
fromsetA
works.But taking the two, same, sets and flipping the order causes an out of bounds crash in RandomAccessCollection+Offsets.swift#28.
The items in the sets are 1389 vs. 1388 if that could somehow matter.
Here's a stack trace if it helps:
Information
Checklist
main
branch of this package.Did not attempt to reproduce on
main
asswift-collections
is a sub-dependency on one of my dependencies. But could absolutely try if requested. Getting late where I am 😅Steps to Reproduce
I have not been able to reproduce this issue unfortunately. I'm doing this a on a few different pairs of sets, and only one of them with this specific type (
Tagged<Product, UUID>
— product being a customstruct
) causes the issue. The other sets are alsoTagged<X, UUID>
.Happy to attempt to provide more attempts to reproduce. The inner workings of all this is a bit over my head though.
Expected behavior
I expect the "call order" to not matter.
Actual behavior
Out of bounds crash.