iabudiab / HTMLKit

An Objective-C framework for your everyday HTML needs.
MIT License

Occasional Internal Consistency Error #36

Closed bcholmes closed 5 years ago

bcholmes commented 5 years ago

Every once in a while, I get a crash on an "internal consistency" error.

My app is fetching some HTML pages in a background process and pulling out some interesting stuff. A few times, I've had the app (in the simulator) crash on the following:

2019-07-13 21:30:37.026686-0400 dreamwidth[65452:90258127] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: '*** -[NSHashTable NSHashTable {
[10] <HTMLNodeIterator: 0x60000eae1680>
}
] count underflow'
*** First throw call stack:
(
    0   CoreFoundation                      0x000000010af6a6fb __exceptionPreprocess + 331
    1   libobjc.A.dylib                     0x00000001098abac5 objc_exception_throw + 48
    2   CoreFoundation                      0x000000010af6a555 +[NSException raise:format:] + 197
    3   Foundation                          0x000000010827f9d7 hashProbe + 407
    4   Foundation                          0x000000010827fe5c -[NSConcreteHashTable removeItem:] + 49
    5   myapp                               0x0000000106d01e3d -[HTMLDocument detachNodeIterator:] + 93
    6   myapp                               0x0000000106d115b7 -[HTMLNodeIterator dealloc] + 87

I suspect that it's something to do with timing: it doesn't happen all the time, but when it does, it crashes my app. I suspect that the HTML document is being deallocated when this happens.

I've tried to avoid iterating on, say, .childNodes to avoid creating node iterators, but the crash still happens every so often.

iabudiab commented 5 years ago

@bcholmes Hey there, thanks for the bug report.

It looks like the iterator is removed twice, once by being deallocated and once by the detach call.

I'll try to reproduce it and fix it accordingly. It would be really helpful if you could narrow down the steps for me, such as what exactly is happening in the app right before the crash.

bcholmes commented 5 years ago

I've had my first instance of this error in a few days, and I think I've gleaned a bit more info.

Here's a bit more from that stack trace:

(
    0   CoreFoundation                      0x000000010c69d6fb __exceptionPreprocess + 331
    1   libobjc.A.dylib                     0x000000010afd6ac5 objc_exception_throw + 48
    2   CoreFoundation                      0x000000010c69d555 +[NSException raise:format:] + 197
    3   Foundation                          0x0000000109a939d7 hashProbe + 407
    4   Foundation                          0x0000000109a93e5c -[NSConcreteHashTable removeItem:] + 49
    5   myapp                               0x0000000108ba99cd -[HTMLDocument detachNodeIterator:] + 93
    6   dreamwidth                          0x0000000108bb9147 -[HTMLNodeIterator dealloc] + 87
    7   libobjc.A.dylib                     0x000000010afe972c _ZN11objc_object17sidetable_releaseEb + 202
    8   myapp                               0x0000000108bb5deb -[HTMLNode firstElementMatchingSelector:] + 667
    9   myapp                               0x0000000108bb5a5b -[HTMLNode querySelector:] + 107

(I'm not really sure that all the stack traces have had that same call chain.)

It appears that the failure is happening during some query selectors. In the current version of the code, I parse the HTML document, do a tiny bit of querying, and then start copying some interesting data to a Core Data object, which I manipulate in an NSManagedObjectContext performBlock block. Most of my query selector calls happen in that block.

So my first guess is that the crash happens because of some race condition: the thread that parses the document and the thread that grabs data from the DOM to copy into Core Data are tripping each other up.

I'm going to try testing for a bit with a code restructure where I handle all document parsing and query selecting in the context of the Core Data perform block.
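A minimal sketch of that restructure, assuming the document is only needed long enough to extract a few values. The helper names (`DWExtractTitle`, `DWImportPage`), the `h1.title` selector, and the `Entry` entity with its `title` attribute are hypothetical and stand in for whatever the app actually uses. The point is only that the `HTMLDocument` is created, queried, and released inside the same `performBlock:` invocation, so it never crosses threads.

```objc
#import <CoreData/CoreData.h>
#import <HTMLKit/HTMLKit.h>

// Parses and queries on whichever thread calls this function.
// Returns nil when nothing matches the (hypothetical) selector.
static NSString *DWExtractTitle(NSString *html) {
    HTMLDocument *document = [HTMLDocument documentWithString:html];
    HTMLElement *heading = [document querySelector:@"h1.title"];
    return heading.textContent;
}

// Only the raw HTML string crosses into the block; the document and any
// node iterators it creates live and die on the context's queue.
static void DWImportPage(NSString *html, NSManagedObjectContext *context) {
    [context performBlock:^{
        NSString *title = DWExtractTitle(html);
        if (title == nil) {
            return;
        }

        // "Entry" and "title" are assumed names from a hypothetical model.
        NSManagedObject *entry =
            [NSEntityDescription insertNewObjectForEntityForName:@"Entry"
                                          inManagedObjectContext:context];
        [entry setValue:title forKey:@"title"];

        NSError *error = nil;
        if (![context save:&error]) {
            NSLog(@"Save failed: %@", error);
        }
    }];
}
```

Passing the immutable `NSString` into the block rather than a parsed `HTMLDocument` is the design choice that matters here: strings are safe to share across threads, while the DOM objects are kept on one queue.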

bcholmes commented 5 years ago

So it's been a bit over a week, and I haven't seen a crash since I reorganized the code. I'm inclined to think that it was a threading issue, and that it can be avoided by keeping my documents on a single thread.
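For code that does not go through Core Data, the same confinement can be sketched with a serial dispatch queue. The example below uses only Foundation and is not HTMLKit's code or its eventual fix. It assumes a plain `NSHashTable` (the collection named in the exception, and not safe for concurrent mutation) and a hypothetical helper, `DWConfinedHashTableRun`, that lets many concurrent tasks add and remove objects while routing every mutation through one queue.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: `rounds` concurrent tasks each add and then remove
// one object from a shared weak NSHashTable. Every access to the table is
// funneled through a single serial queue. Returns the final count.
static NSUInteger DWConfinedHashTableRun(NSUInteger rounds) {
    NSHashTable *table = [NSHashTable weakObjectsHashTable];
    dispatch_queue_t serial =
        dispatch_queue_create("example.confined", DISPATCH_QUEUE_SERIAL);

    // dispatch_apply runs the block concurrently and waits for all rounds.
    dispatch_apply(rounds, dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0), ^(size_t i) {
        NSObject *item = [NSObject new];
        dispatch_sync(serial, ^{ [table addObject:item]; });
        dispatch_sync(serial, ^{ [table removeObject:item]; });
    });

    // Read the count on the same queue that performs the mutations.
    __block NSUInteger count = 0;
    dispatch_sync(serial, ^{ count = table.count; });
    return count;
}
```

Because the table is only ever touched from the serial queue, the add and remove calls cannot interleave, and every object added is removed again before the function returns.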

iabudiab commented 5 years ago

I'm almost 100% sure it's a threading issue. I still can't reproduce it consistently enough to pinpoint the exact location and combination of conditions that lead to this bug.

Will keep you posted.

iabudiab commented 5 years ago

@bcholmes The last commit should fix the issue. However, I can't be 100% sure, since I could not reproduce this consistently. I would really appreciate it if you could test this with your previous code that crashed and report back 😉

If the crash still happens, please reopen this.

bcholmes commented 5 years ago

Cool. I'll let you know if I see anything.