facebook / flipper

A desktop debugging platform for mobile developers.
https://fbflipper.com/
MIT License
13.33k stars 953 forks source link

iOS network observer has thoroughly broken swizzling #1674

Closed LeoNatan closed 3 years ago

LeoNatan commented 3 years ago

šŸ› Bug Report

The iOS network observer has thoroughly broken swizzling; when it calls the original delegate method, it sends the incorrect selector, causing errors

Screen Shot 2020-11-11 at 22 25 23

_cmd here is incorrect. It should be URLSession:dataTask:didReceiveResponse:completionHandler:, but as you see, it's something else. This is broken behavior, and breaks code that use the well known pattern [proxyObj respondsToSelector:_cmd].

To Reproduce

Put a breakpoint in any NSURLSession delegate method. Print _cmd.

Environment

iOS 13.5 and iOS 14.2

LeoNatan commented 3 years ago

This is most likely happening in other places, where similar careless swizzling behavior has been used.

mweststrate commented 3 years ago

@LeoNatan we are not use all iOS versions internally, so when the APIs change we might need to add a rule more to the swizzling. Would you mind creating a PR with what the correct swizzling behavior should be?

LeoNatan commented 3 years ago

I will take a look to see if I can come up with a PR. Thanks

arnauddorgans commented 3 years ago

Any updates? :)

LeoNatan commented 3 years ago

Sorry, I have not had the chance to take a look.

LeoNatan commented 3 years ago

I tried looking today, but unfortunately the system is too broken and too (needlessly) complex. I will not be able to provide a fix. The entire swizzling method here is broken. It even breaks calls to _methodDescription once FLEXNetworkObserver is initialized. Please note that this is blocking Detox and Detox Instruments, and I will just swizzle out injectIntoAllNSURLConnectionDelegateClasses as a workaround, meaning users will be unable to use FLEXNetworkObserver with Detox.

LeoNatan commented 3 years ago

You can easily reproduce by running

react-native init Bla

Then after calling InitializeFlipper, add the following line:

NSLog(@"%@", [self valueForKey:@"_methodDescription"]);

You will see that this crashes the system now. If you comment out the InitializeFlipper call, it works as expected.

LeoNatan commented 3 years ago
Screen Shot 2020-12-20 at 13 15 55
mweststrate commented 3 years ago

@LeoNatan I just tried with our reference project (https://github.com/facebook/flipper/tree/master/react-native/ReactNativeFlipperExample), and network interception seems to work fine for me on iOS 14.2 + React Native 0.63.4. You might want to make sure you are using the latest FlipperSDK by using use_flipper!({ 'Flipper' => '0.69.0' }) in your Podfile, rather than just use_flipper!.

Edit: sorry, I think I have misinterpreted the original issue, the problem is not that it isn't working, but that it isn't working correctly when other interceptors are set up? In that case a PR would be much appreciated indeed :)

LeoNatan commented 3 years ago

The problem is incorrect swizzle logic. It's breaking the Objective C runtime, as well as other swizzle attempts. Unfortunately, I don't have time to investigate it. It's too complex, and too broken, sorry.

mweststrate commented 3 years ago

Ok thanks for the update. In that case closing for now as we don't have an actionable / runnable problem to fix for now. cc @priteshrnandgaonkar in case he has further thoughts on this.

LeoNatan commented 3 years ago

I gave you an easy way to reproduce the runtime issue. I gave you a description of how it breaks swizzling in the original report. Iā€™m sorry, but doing bad swizzling, asking for me to fix it and then just closing the issue is bad product management, especially if you push this product to all RN users. For Detox, which the main RN repo uses for tests, we practically had to disable your broken product so to be able to function. But you closed the issue for ā€œno actionable problemā€. šŸ‘

CC @hramos

mweststrate commented 3 years ago

Hi @LeoNatan, a screenshot generally doesn't suffice as reproducible problem. Maybe it is broken badly or maybe not (sorry, I have like zero experience in iOS development), but as far as I can see our core functionality is working. The setup might be wrong indeed (I can't judge that), but the urgency of fixing this is unclear to me so far. So I could leave it open and have it dangling for ever in a "we probably should look into this", but my take on it is that closing it is a clearer signal it is probably not going to be fixed anywhere since this is the only report so far.

@priteshrnandgaonkar curious about your 2 cents on this

LeoNatan commented 3 years ago

I have provided all the details that an iOS engineer needs to debug the issues, not just a screenshot. But the screenshots are enough to show the problem also. I assume you have iOS engineers that work on the product? They are the ones that should judge the provided information, with all due respect. If something is missing, Iā€™ll be happy to provide.

As to urgency, we recently released Detox 18, which had to provide custom code to disable your product. Is that acceptable to you? Detox is the most widely used end to end testing framework for React Native software.

mweststrate commented 3 years ago

@LeoNatan our resident iOS engineer is on PTO so that was I was momentarily looking into it, since this was open for a while, so sorry for that :) For reference the signals on which I closed this were:

  1. As far as I can see there is in this thread little information with which can verify the correctness of any potential fix (I might miss something obvious, but I'd expect something like a minimal project with Detox installed and some instructions, so that we can verify both Flipper and Detox are working correctly)
  2. "Unfortunately, I don't have time to investigate it. " Sounded to me like this wasn't blocking you at the moment and you weren't vested in having a solution, hence I concluded the low prio for now :).
  3. I missed that this affects all Detox projects, not just yours. Sorry about that!

I leave it open for now until someone knowledgable can look into it and thanks again for reporting.

d4vidi commented 3 years ago

Hey @mweststrate! As a fellow Detox maintainer, I have to stick to Leo's approach that while there's no immediate issue with flipper itself, fixing integration issues with other products is as important. All the more so, when we're discussing Detox. Be advised, Detox itself is used as the de-facto regression tool for RN releases -- it could create a real jam-in-wheels case IMO, if not fixed. For now, we've applied an ugly workaround that disables flipper; It's a matter of time till that backfires.

LeoNatan commented 3 years ago

I cannot provide insights on how to fix the problem if I haven't been able to dive in your code. I took a quick look, twice on two different occasions, but it's just too complex for me to have a quick solution. Again, I don't believe asking an issue opener to provide the solution, and if not, closing the issue, is a good product management.

Also, I'm not a React Native developer. I develop Detox for such developers. I have no idea how important flipper is to RN developers. For me, disabling flipper was the obvious solution for Detox, for now. Again I ask, is this an acceptable solution for you?

priteshrnandgaonkar commented 3 years ago

@LeoNatan , Just curious why are you depending on _cmd? According to apple docs its a hidden variable. Not sure if it is meant to be used. Can't you use [superclass instanceRespondsToSelctor: @selector(...)] ?

I checked the swizzling code and we have followed this https://nshipster.com/method-swizzling/ . Apparently using method_exchangeImplementations changes the underlying selector of the method.

Seems like method_setImplementation will be better for swizzling. https://blog.newrelic.com/engineering/right-way-to-swizzle/#:~:text=Swizzling%20is%20the%20act%20of,with%20another%2C%20usually%20at%20runtime.

priteshrnandgaonkar commented 3 years ago

I am working on changing the Swizzling logic to method_setImplementation. This should solve the problem

LeoNatan commented 3 years ago

The _cmd issue is just a small part, that I worked around by using the explicit selector of each method. But then I noticed further issues, and when trying to debug those, noticed that your swizzling code breaks the runtime, meaning incorrect method information has been supplied.

The way we swizzle is take the user's provided delegate object, create a new class pair in runtime, add some methods to this new class pair and finally replace the object's class using object_setClass. This is what's called "ISA swizzling".

https://github.com/wix/DetoxSync/blob/9721ffe8fc33c5608a423b0360ec8ef6de8cd908/DetoxSync/DetoxSync/Spies/NSURLSession%2BDTXSpy.m#L87

However, I noticed that if your network swizzles are enabled, some delegate methods would not be called by NSURLSession internals. I then looked at your code, and noticed that you iterate over all classes in the runtime and add a bunch of methods to them in runtime. I was unable to debug further, because this process is enough to break the ObjC runtime (see https://github.com/facebook/flipper/issues/1674#issuecomment-748593536). At this point, I just gave up and disabled your network injector.

Your entire approach to swizzling in this project is broken, in addition to technically not swizzling correctly. Instead of injecting methods to all classes in the runtime, you should use ISA swizzling to specifically target only the user's delegate class, and then, you correctly need to determine if the user's object would respond to each injected delegate method or not; if yes, call the super implementation, otherwise trigger the default behavior.

LeoNatan commented 3 years ago

Swizzling is a difficult task. See our swizzling helper for a correct implementation: https://github.com/wix/DTXObjectiveCHelpers/blob/master/DTXSwizzlingHelper.h https://github.com/wix/DTXObjectiveCHelpers/blob/04e1130c9c361d74091e18be4b8b66850c391462/DTXSwizzlingHelper.h#L38 https://github.com/wix/DTXObjectiveCHelpers/blob/04e1130c9c361d74091e18be4b8b66850c391462/DTXSwizzlingHelper.h#L99

Our approach is based on https://github.com/rentzsch/jrswizzle

priteshrnandgaonkar commented 3 years ago

noticed that your swizzling code breaks the runtime, meaning incorrect method information has been supplied.

How can I repro this ? I reproduced your initial bug with just our sample objc app and put the delegate method and logged _cmd, and I was able to see the different current selector, which was expected as we kept the Method struct as it is and swapped its underlying implementation. But since its hidden variable one should not depend on it.

Our network swizzling logic is taken from https://github.com/FLEXTool/FLEX .

some delegate methods would not be called by NSURLSession internals. I then looked at your code,

Which are those methods ? Any repro steps for this will be helpful. I can fix our swizzle logic for those methods.

and noticed that you iterate over all classes in the runtime and add a bunch of methods to them in runtime

I think you are talking about this. We have a list of selector methods and check at runtime if any classes implement those methods, if so we swizzle the implementation of those classes.

LeoNatan commented 3 years ago

You can reproduce the broken runtime with the instructions in this comment: https://github.com/facebook/flipper/issues/1674#issuecomment-748593741

_cmd is not a hidden variable, itā€™s a documented feature of the language. Just like self. Changing that variable is breaking the language, just like it would be for changing the self variable.

The entire swizzling strategy is broken. For example, if a class is constructed in runtime, your strategy would fail to catch those classes.

You can reproduce the delegate call missing by linking https://github.com/wix/DetoxSync in an example project, where flipper is enabled. Make sure to disable this code: https://github.com/wix/DetoxSync/blob/9721ffe8fc33c5608a423b0360ec8ef6de8cd908/DetoxSync/DetoxSync/ReactNativeSupport/DTXReactNativeSupport.m#L191 Then perform a fetch request to any URL that returns data. You will see that the data returned is zero bytes, because the NSURLSession delegate method for data was not called. This is due to incorrect assumptions in your swizzling implementations.

priteshrnandgaonkar commented 3 years ago

Regarding

NSLog(@"%@", [self valueForKey:@"_methodDescription"]);

I checked on non-react-native project with flipper and it worked and didn't crash. My hunch is that DetoxAppDelegateProxy might be mysteriously doing something. As in your comment I see DetoxAppDelegateProxy in the callstack.

LeoNatan commented 3 years ago

In my example, I printed the delegate because that was set as the NSURLSession delegate. You can use a different object, and the result will be similar (for example, the RN network delegate).

LeoNatan commented 3 years ago

DetoxAppDelegateProxy is a proxy object that does nothing suspicious. It listens to app lifecycle events. Without your swizzling, the method description call works.

priteshrnandgaonkar commented 3 years ago

_cmd is not a hidden variable,

See this apple doc. It is clearly written that The _cmd variable is a hidden argument passed to every method that is the current selector

LeoNatan commented 3 years ago

So is self. That doesnā€™t mean itā€™s private. See https://developer.apple.com/documentation/objectivec/1456712-objc_msgsend

op The selector of the method that handles the message.

LeoNatan commented 3 years ago

You even see it used in the Apple documentation. What more could you want?

priteshrnandgaonkar commented 3 years ago

But self is explicitly mentioned as an argument. Apple has specifically written _cmd as an hidden argument.

LeoNatan commented 3 years ago

I donā€™t understand what this argument is meant to accomplish? Your swizzle breaks the language contract, and you are arguing that 8 shouldnā€™t use a language feature that is well documented in Appleā€™s own documentation.

LeoNatan commented 3 years ago

hidden argument passed to every method that is the current selector

That alone shows you have broken it.

priteshrnandgaonkar commented 3 years ago

I am arguing about what contract is broken ?

https://github.com/FLEXTool/FLEX/issues/480#issuecomment-723713779

hidden argument passed to every method that is the current selector

Current selector shows the selector which calls the implementation. In the case of swizzling we call back the implementation from swizzled selector so the current selector will be the swizzled one, which is correct. The contract of swizzling is that it should call back the implementation. bytw even detox uses method_exchangeimplementation which will cause the very same problem if some one logs cmd after your proxy.

One shouldn't rely on _cmd, thats what I am arguing. Prefer @selector over _cmd.

priteshrnandgaonkar commented 3 years ago

bytw I logged NSLog(@"RN %@", [self valueForKey:@"_methodDescription"]); after InitializeFlipper(application); after fresh installation of RN project and it worked without crash.

priteshrnandgaonkar commented 3 years ago

@LeoNatan

To further work on the issue regarding the missing delegate calls, I would like to have the exact repro steps. Like what commands I have to type in.

LeoNatan commented 3 years ago

Let me build a reproduction project and I'll post here.

LeoNatan commented 3 years ago

Here we go:

  1. Crashing on _methodDescription: Crashing.zipā€”there is nothing here added from Detox, so no swizzling from Detox, no DetoxAppDelegateProxy;
  2. Delegate method not called: DelegateNotCalled.zipā€”here, I have linked DetoxSync (part of Detox), and with flipper enabled, the URLSession:dataTask:didReceiveData: method is not called

With the InitializeFlipper(application); line, this is logged:

2021-01-05 07:15:27.014251+0200 DelegateBreakage[39921:3202499] ā€¼ļø Got response
2021-01-05 07:15:27.014763+0200 DelegateBreakage[39921:3202496] ā€¼ļø Complete with error: (null)

Without:

2021-01-05 07:13:56.945532+0200 DelegateBreakage[39841:3200873] ā€¼ļø Got response
2021-01-05 07:13:56.945716+0200 DelegateBreakage[39841:3200873] ā€¼ļø Got data
2021-01-05 07:13:56.945973+0200 DelegateBreakage[39841:3200863] ā€¼ļø Complete with error: (null)
priteshrnandgaonkar commented 3 years ago

@LeoNatan

I am working on what is causing the crash when we call NSLog(@"%@", [self valueForKey:@"_methodDescription"]); on the class with swizzled methods. The crashing.zip was helpful with regards to reproducing the issue.

I haven't checked DelegateNotCalled.zip yet, as I have already tested URLSession:dataTask:didReceiveData on the pure objc project without RN with Flipper enabled and it works. Anyways I will see if there is any issue from our side, but I hardly feel there might be.

LeoNatan commented 3 years ago

But the entire issue is that youā€™re swizzle breaks when DetoxSync is linked. If you donā€™t even bother looking at the issue, why did I work on reproducing it?

priteshrnandgaonkar commented 3 years ago

I will be looking at it, I never said I won't look at it. My comment was my initial view. bytw I am able to fix the crash issue.

priteshrnandgaonkar commented 3 years ago

But the entire issue is that youā€™re swizzle breaks when DetoxSync is linked

Not sure abt this, I have to look through the repro.

priteshrnandgaonkar commented 3 years ago

@LeoNatan Here is the PR which fixes the crash. I am now looking at your missing delegate repro.

LeoNatan commented 3 years ago

So it was as expected, incorrect types. Thanks

priteshrnandgaonkar commented 3 years ago

@LeoNatan, I looked at your repro for the other issue. What happens is that DetoxSync swizzles the methods of the AppDelegate first, Flipper swizzle logic does get called for URLSession:dataTask:didReceiveData: and it does try to call the original implementation which for flipper turns out to be from your swizzled class __detox_sync_URLSessionDelegateProxy, But there is no implementation for URLSession:dataTask:didReceiveData: in __detox_sync_URLSessionDelegateProxy class and thats where it gets lost. The reason why other methods are called is because there is an implementation of those method in that proxy class.

If I remove detoxsync and normally hit an API then I do get the callback. I do not know how the detoxsync is setup and its initialization code in RN and in DelegateNotCalled.zip.

priteshrnandgaonkar commented 3 years ago

I would like to know how URLSession:dataTask:didReceiveData is swizzled in DetoxSync. When I put a breakpoint with flipper disabled I am taken to here, which is a dispatch block, I would like to know how the dispatch block gets scheduled and who calls that c function.

LeoNatan commented 3 years ago

That's the point, it's not swizzled by DetoxSync. These are the only two methods that are swizzled: https://github.com/wix/DetoxSync/blob/9721ffe8fc33c5608a423b0360ec8ef6de8cd908/DetoxSync/DetoxSync/Spies/NSURLSession%2BDTXSpy.m#L17

We don't even swizzle methods, we ISA swizzle the instance. We create a subclass of the user's delegate class, add these two methods that are linked, and we replace the class of the object with the new subclass. That's a completely legitimate way of working with the runtime. It's done correctly with the runtime methods. Somehow, the flipper swizzles break this.

priteshrnandgaonkar commented 3 years ago

We create a subclass of the user's delegate class, add these two methods that are linked, and we replace the class of the object with the new subclass

But then how does the other method's(URLSession:dataTask:didReceiveData) gets called ? If you just add a subclass then the methods like URLSession:dataTask:didReceiveData should directly be called without detox_sync getting involved, but that doesn't happen, as I end up at the breakpoint here

I tried printing NSStringFromClass([cls superclass]) when cls is __detox_sync_URLSessionDelegateProxy and it printed NSObject. I expected it to be the subclass of AppDelegate as you said in the above comments.

LeoNatan commented 3 years ago

The URL session system calls it, and like with any inheritance, the method is implemented by the object. Thatā€™s what is broken in flipper, it doesnā€™t correctly forward the call to the delegate object from the swizzle.

I didnā€™t understand what you are trying to check with the second comment. Can you please elaborate? Thanks

priteshrnandgaonkar commented 3 years ago

Flipper tries to call the implementation on a class with type __detox_sync_URLSessionDelegateProxy, but it doesn't have all the method implementations. And as you said you just add this class as a subclass to the delegate class, I expected its superclass to be an AppDelegate as in your repro Appdelegate is the delegate for NSURLSession

priteshrnandgaonkar commented 3 years ago

It will be better if you can abstract out the minimal code from detoxsync to repro this, as it looks quite complex and I don't know when and where the initialization of detoxsync happens.

I don't seem to triage to any issues in the flipper side regarding the missing delegate calls, as Flipper does try to call the delegate but the dynamically added detox class doesn't seem to have the implementation.