airsdk / Adobe-Runtime-Support

Report, track and discuss issues in Adobe AIR. Monitored by Adobe - and HARMAN - and maintained by the AIR community.

iOS ANE: Arguments passed mixed up on iPad iOS 15+ only(?) #2079

Open rainbowcreatures opened 1 year ago

rainbowcreatures commented 1 year ago

Hello there, I'm the developer of an ANE (a video recording & encoding extension) which has been stable for years at this point, on iOS in particular. However, one of my clients reported that their app started crashing, but only on iOS 15 and above, and only on iPad.

This has been really weird to me, as from my past experience whatever worked on iPhone, worked on iPads and vice versa as I believe the iOS should be identical(?).

I don't have the physical devices, but I said I'll try at least with simulators. And to my surprise, I was able to reproduce a crash - when I tested with iPhone 8 / iOS 15 simulator, my test app worked flawlessly. When I tested with iPad mini (6th generation) / iOS 15 simulator, I got a crash.

This is what I discovered: The crash was caused by AVAssetWriter getting an empty video URL. This was totally unexpected, as I'm constructing the string in the AS3 portion of the ANE and so I know something should always get passed to the native portion of the ANE.

Upon closer inspection I noticed that other args passed to the ANE function (via ctx.call) were mixed up. Never saw anything like that...to me, it almost looked like some sort of buffer overflow. I made sure that the string I've been passing has the same length on both the iPad and iPhone simulator, and that is the case (length reported is 246).

[Screenshots: native log output on the iPhone simulator and on the iPad simulator]

As you can see in the screenshots, on the iPhone simulator everything is passed fine. The exact same app on the iPad simulator gets a zero-length string instead (I checked and NSLogged the string length, just to make sure). The rest of the args are mixed up: for example, recordToWav says 44100, which should have been the sampleRate. Now, I'm not 100% sure this is the same issue my client is experiencing, but it's pretty suspicious.

Any idea why the args could get mixed up like that, and why just on iPad on iOS 15+?

marchbold commented 1 year ago

I'd suggest checking how you read the FREObject representing your string, might be incorrectly converting it to an NSString?

rainbowcreatures commented 1 year ago

Here's how:

image

It will fail at the point of NSLog so I'm not including the rest. But just on one platform. On iPhone simulator it will say "Length of the filename string: 246", but on iPad mini simulator "Length of the filename string: 0". I don't think there's anything wrong on my end in this function?

Also please note that the other args are mixed up (ie. it's not just this one arg being an issue). Almost feels things were handed over "wrong" into argv[]. When I check the string on AS3 side before handing it over, the length is correct though (246), on both iPhone and iPad mini sims.

So it appears something goes wrong between sending the string from AS3 and receiving it in native code. Or am I reading this wrong?

marchbold commented 1 year ago

What's it returning? Normally you should also check whether the FREGet succeeds.

eg:

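uint32_t length;
const uint8_t *value;
// Only touch the buffer if the conversion actually succeeded
if (FRE_OK == FREGetObjectAsUTF8(object, &length, &value))
{
    // value now holds the UTF-8 bytes of the AS3 string
    NSString* objectString = [NSString stringWithUTF8String:(const char*)value];
}
else
{
    // handle error
}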

Seems odd that the order is changing, how are you calling it in your as3?

ajwfrost commented 1 year ago

Yes I was going to ask what the return value there would be... https://help.adobe.com/en_US/air/extensions/WSb464b1207c184b14-62b8e11f12937b86be4-7fe6.html

But this sounds like a very strange one .. are the other values okay do you know, i.e. 'argc' is what you would expect? I had been wondering if there's some padding going on with the function arguments which implies a change in ABI and something that's not being handled by the linker.. I'm assuming you're building your application on a mac though (which means we'd be using the Apple linker directly) and as you say, seems odd for this to be an issue only on one device/OS combination. Presumably the device is a 64-bit one so I would have expected the 'argc' argument to take up 8 bytes even though it's a 32-bit value, but all of this stuff should just be processed by the normal Apple build tools.

Can I also check: do you see anything like this in any other native calls? e.g. if you create a new app that uses the ANE and just makes this one call, does it still happen? And if you create a new ANE with just one similar function exposed, does the error still happen? (and if it's got to that point and you can send us the sample code/ane, we can see if we can reproduce it and work out what's happening..)

thanks

rainbowcreatures commented 1 year ago

Hey guys, thanks for your feedback!

I'll start by checking the return value, will share what it says.

Yeah I'm compiling on Mac, not Windows, though it's a remote virtual machine, but hopefully doesn't play a role.

Also @ajwfrost I'll check out argc. I haven't had time to do any additional digging yet, so I don't have additional info from any of the tests you're suggesting (which all sound good). I was trying to see if this rings a bell over here first, and whether I was maybe missing something really obvious.

Otherwise I was planning to go along that route you're suggesting, ie creating a minimal test case (empty ANE/ empty app), see if I'm lucky there, if not then proceeding the opposite way, ie. start stripping down things from the app & ANE until I hit the point when it starts working. Then hopefully, something rings a bell, but if not then I'd definitely appreciate more help and will send the ane / code.

rainbowcreatures commented 1 year ago

Ok, here are the results of the FREResult and argc, hopefully I'm logging those right. So argc is 1 less on iPad mini simulator for some reason, and the result of the FREGetObjectAsUTF8 seems to be FRE_OK. The expected number of args is 17, so iPhone has that correct.

image image image

Does this help with anything? If not, I'll proceed with more tests later on.
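
For anyone landing here with a similar mismatch: a guard on argc at the top of the native function would flag the 16-vs-17 discrepancy before any argv[] slot is read. Below is a minimal sketch in plain C, not the code from this ANE. The FREObject typedef is a stand-in so the snippet compiles without the AIR SDK, and fw_check_argc and fw_init_native are hypothetical names; only the expected count of 17 comes from the thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the opaque FREObject handle from FlashRuntimeExtensions.h,
 * declared here only so the sketch compiles without the AIR SDK. */
typedef void *FREObject;

/* The AS3 side is expected to pass 17 arguments (figure taken from the thread). */
#define FW_INIT_EXPECTED_ARGC 17u

/* Hypothetical helper: returns 1 if argc matches, otherwise logs and returns 0. */
static int fw_check_argc(const char *fn, uint32_t argc, uint32_t expected)
{
    if (argc != expected) {
        fprintf(stderr, "[FlashyWrappers] %s: expected %u args, got %u\n",
                fn, (unsigned)expected, (unsigned)argc);
        return 0;
    }
    return 1;
}

/* Hypothetical native init function in the usual FREFunction shape. */
FREObject fw_init_native(void *ctx, void *functionData,
                         uint32_t argc, FREObject argv[])
{
    (void)ctx; (void)functionData; (void)argv;
    if (!fw_check_argc("fw_init_native", argc, FW_INIT_EXPECTED_ARGC)) {
        return NULL; /* bail out before touching any argv[] slot */
    }
    /* ... safe to read argv[0] .. argv[16] from here on ... */
    return NULL;
}
```

With a check like this, a call arriving from the wrong AS3 code path shows up as a one-line log entry instead of as scrambled argument values further down.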

ajwfrost commented 1 year ago

Thanks .. I think that's just really weird! I might see if one of the guys here can put together a test case and try this on the simulator devices.

rainbowcreatures commented 1 year ago

Cool, thanks! Yeah it is a bit mysterious. I hope I'm not causing this by whatever bug buried somewhere for years, brought to surface by some recent AIR / iOS changes...that would be really skillful of me. Based on what I saw so far, it seemed to point to something inside AIR vs iOS, as something seemingly nukes the args on the way from AS3 to native code...though I might still be triggering the initial conditions for that to happen. I'm not doing much before init in the native code, so at least it should be easy to pinpoint if that's the case. I'll try doing tests towards that direction.

Just FYI... I've got a log from the client's real iPad right now, and it confirms it's the same issue as the one I'm getting on the simulator, so it's not limited to the simulator (but can be reproduced there, which is nice).

image

marchbold commented 1 year ago

I would be interested to see your as3 call too? Just we aren't seeing this behaviour in our extensions so wondering what the difference is, or, worse, if I've just missed it during testing :)

rainbowcreatures commented 1 year ago

Well it's good to know you aren't seeing it! I'm totally not ruling out that it's me triggering some weird chain reaction of issues culminating in this :) Though it will still be interesting to know what it was. Here's the part of the code leading up to the call:

if (platform == 'IOS' || platform == 'MAC') {
    var appDir:File = File.applicationStorageDirectory;
    var fileString:String = appDir.nativePath;
    appDir.preventBackup = true;
    videoFilePath = fileString + "/video.mp4";
    mergedFilePath = fileString + "/" + mobileFilename;
    videoFile = appDir.resolvePath(videoFilePath);
    mergedFile = appDir.resolvePath(mergedFilePath);
    if (videoFile.exists) videoFile.deleteFile();
    if (mergedFile.exists) mergedFile.deleteFile();
    _ctx.call('fw_ffmpeg_init', videoFilePath, videoWidth, videoHeight, 1, video_fps, iOS_nativeQuality, bitrate, keyframe_freq, buffer_freq, stage_fps, 0, audio_sample_rate, audioChannels, audio_bit_rate, int(realtime), int(audio), audioCapture);
}

Here are the data types being passed

iOS_nativeQuality:int;
videoWidth:Number;
videoHeight:Number;
video_fps:Number;
buffer_freq:Number;
keyframe_freq:Number;
bitrate:int;
stage_fps:Number;
audio_sample_rate:int;
audio_bit_rate:int;
realtime:Boolean;
audio:Boolean;
audioCapture:int;

Honestly I do not recall why I've been mixing Numbers and ints, it has been years since I wrote this initial code.

marchbold commented 1 year ago

How are you determining the platform? Be aware that you may be getting iPadOS for some of the newer iPads ;)

marchbold commented 1 year ago

What happens if it isn't iOS there? Does it fall through to a different call with different params?

rainbowcreatures commented 1 year ago

Lol I discovered what you wrote simultaneously... I've started passing just 1 arg right now, and it's still showing I'm passing 16. There are different init calls based on platform. So it's 100% calling the init for a different platform, which explains all the weirdness.

rainbowcreatures commented 1 year ago

Ok, guys, it's working now, case closed. Thanks for all of your help!

image

I've been using Capabilities.os.indexOf, looking for the "iPhone" string, for years, then setting my own platform string. As I haven't been in touch with AIR for a couple of years, it didn't occur to me this might not be enough for those newer iPads you mention. So in the end it was what seemed most likely: an outside change, though a different one than I expected ;)
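
For reference, the fix boils down to matching more than one substring when classifying the OS string. Here is a rough sketch of that logic, written in C only to keep it self-contained; in the app itself the same checks would run in AS3 against Capabilities.os. The exact strings AIR reports are not quoted in this thread, so the substrings "iPhone", "iPad", "iPod" and "Mac" are assumptions, and fw_platform and fw_classify_os are hypothetical names.

```c
#include <string.h>

/* Hypothetical platform buckets mirroring the 'IOS' / 'MAC' strings in the AS3 code. */
typedef enum {
    FW_PLATFORM_IOS,
    FW_PLATFORM_MAC,
    FW_PLATFORM_OTHER
} fw_platform;

/* Classify an OS description string by substring.
 * Assumption: iPad builds report something containing "iPad" (e.g. "iPadOS ...")
 * rather than "iPhone", which is why a single "iPhone" check is not enough. */
static fw_platform fw_classify_os(const char *os)
{
    if (os == NULL) {
        return FW_PLATFORM_OTHER;
    }
    if (strstr(os, "iPhone") || strstr(os, "iPad") || strstr(os, "iPod")) {
        return FW_PLATFORM_IOS;
    }
    if (strstr(os, "Mac")) {
        return FW_PLATFORM_MAC;
    }
    return FW_PLATFORM_OTHER;
}
```

Checking the iOS family before "Mac" is deliberate, so that an unexpected string mentioning both still ends up on the iOS init path.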

marchbold commented 1 year ago

Great to hear! That os name change has caught a few people recently. :)

rainbowcreatures commented 1 year ago

Glad I'm not alone :) Btw. @marchbold you've been doing a great job with your ANEs, I believe I purchased a couple back in the day 👍

ajwfrost commented 1 year ago

Thanks guys - glad to hear that this was relatively simple in the end! Not 100% sure where we get the capabilities string from but below is what I see on the simulators with a Capabilities dump...

image

Looks like the "manufacturer" string is what we'd control based on our build settings in case you need something common across them.

cheers

marchbold commented 1 year ago

I think it just caught everyone by surprise as iOS has been common across iPhones and iPads (even iPods) for so long. But Apple's recent push to distinguish iPads as "computers" has led to this change. Mind you, by recent I mean it has been like this for a year or so ;)

rainbowcreatures commented 1 year ago

Hey guys, sorry to be resurrecting this, I thought I was done but apparently I'm not. Having some health issues so it took me some time to get back to this. While I'm sure I fixed the above issue, after I built the ANE for the device for my client, I got reports that it is crashing on all iOS devices (not just iOS 15) :|

It is working fine on the simulator for me. Again, never experienced issues when building this in ancient xCode for ancient iOS versions, so this is new.

They sent me device logs, and I can see the extension starting to log (so it's not totally broken). It logs the init string with the correct args this time (proving that part was fixed). But after that, it logs absolutely nothing other than what's leading to the crash.

 Aug 29 17:28:28 iPad Creato[2022] <Notice>: [FlashyWrappers] Filename: /var/mobile/Containers/Data/Application/FC8732BA-01AE-41BF-8391-68D7ADBE250A/Library/Application Support/com.tz.touchzing.creato.status.video.maker.photo.story/Local Store/video.mp4, w: 960 h: 1280 fast 1 fps: 24 quality: 2 bitrate: 5000000 keyframe_interval: 24 buffer_length: 0 stage_fps: -1 recordToWAV: 0 sampleRate: 44100 channels: 2 audioBitrate: 65536 realtime: 0 audio: 1 audioCapture: 0
Aug 29 17:28:28 iPad kernel[0] <Notice>: Creato[2022] Corpse allowed 1 of 5
Aug 29 17:28:28 iPad backboardd(QuartzCore)[65] <Error>: ImageQueueCollectable client message err=0x10000003
...
...
 Aug 29 17:28:28 iPad ReportCrash(CoreAnalytics)[231] <Notice>: Sending event: com.apple.stability.crash {"bundleID":"com.tz.touchzing.creato.status.video.maker.photo.story","bundleVersion":"1.2.0","exceptionCodes":"0x0000000000000001, 0x0000000000000001(\134n    1,\134n    1\134n)EXC_BAD_ACCESSSIGSEGVKERN_INVALID_ADDRESS at 0x0000000000000001","incidentID":"10191B41-5B13-4EF6-9EFF-266BCE14E144","logwritten":1,"process":"Creato","terminationReasonExceptionCode":"0xb","terminationReasonNamespace":"SIGNAL"}

For reference, here is a screenshot of the part of the code where you can see the first log (which is logged fine, underscored green), and then the next log I expect to see, but which never comes (underscored red). So the crash probably occurs between those two log lines. 90% of the code there is IMO harmless assignments, so I suppose (though it's not 100% sure) the crash would be around the views stuff.

image

I have been googling the platform macro as well to make sure something didn't change (again). I only figured out that I'm probably supposed to be using TARGET_OS_IOS, but I don't think that would lead to issues. This code is there to switch between OS X / iOS code.

Another theory is some frameworks were packaged wrong for distribution (because on simulator there are no issues), and as soon as I touch them, it crashes. So before I investigate any further, I'm hoping you might know about a change within the last couple of years which might cause something to be aware of when packaging an ANE in the later xCode. Or maybe the EXC_ error rings a bell pointing me in the right direction.

Thanks!
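
On the platform macro question: the usual way to switch between iOS and macOS code at compile time is through the TARGET_OS_* macros from TargetConditionals.h. A small sketch follows, assuming a reasonably recent Apple SDK where TARGET_OS_IOS and TARGET_OS_OSX are both defined. fw_build_target is a hypothetical name, and the __APPLE__ guard is only there so the snippet also compiles on other toolchains.

```c
/* TargetConditionals.h only exists in Apple SDKs, so include it conditionally. */
#if defined(__APPLE__)
#include <TargetConditionals.h>
#endif

/* Hypothetical helper reporting which branch this translation unit was built for.
 * Undefined macros evaluate to 0 inside #if, so non-Apple builds fall through to "other". */
static const char *fw_build_target(void)
{
#if defined(__APPLE__) && TARGET_OS_IOS
    return "ios";      /* iOS device or iOS simulator build */
#elif defined(__APPLE__) && TARGET_OS_OSX
    return "macos";    /* macOS build */
#else
    return "other";
#endif
}
```

Logging this value once at init makes it easy to confirm from a device log which branch actually got compiled into the ANE.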

ajwfrost commented 1 year ago

I think you'd be right about it being something up with the views .. you're using some deprecated things there e.g. keyWindow..

This property holds the UIWindow object in the windows array that is most recently sent the makeKeyAndVisible message.

(although, a quick check of the AIR code and it does look like we're calling makeKeyAndVisible - or at least, that code is there, not sure whether it's called...)

But if you're able to reproduce it, can you at least split all those calls into single accesses and null-check each time? Sorry but nothing really springs to mind in terms of frameworks etc. Perhaps it's just worth ensuring your ANE is built with the same iPhoneOS SDK version as we used for the AIR SDK?

thanks
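
To illustrate the "single accesses with a null check each time" suggestion, here is a rough C sketch of the pattern. The structs are hypothetical stand-ins for the application, keyWindow, rootViewController and view chain, not UIKit types; in the ANE each step would be a separate Objective-C message send followed by a nil check and an NSLog.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the application -> keyWindow -> rootViewController -> view chain. */
typedef struct fw_view { int width; int height; } fw_view;
typedef struct fw_controller { fw_view *view; } fw_controller;
typedef struct fw_window { fw_controller *rootViewController; } fw_window;
typedef struct fw_app { fw_window *keyWindow; } fw_app;

/* Walk the chain one link at a time, logging which link is missing. */
static fw_view *fw_find_root_view(const fw_app *app)
{
    if (app == NULL) {
        fprintf(stderr, "[FlashyWrappers] application is NULL\n");
        return NULL;
    }
    fw_window *window = app->keyWindow;
    if (window == NULL) {
        fprintf(stderr, "[FlashyWrappers] keyWindow is NULL\n");
        return NULL;
    }
    fw_controller *controller = window->rootViewController;
    if (controller == NULL) {
        fprintf(stderr, "[FlashyWrappers] rootViewController is NULL\n");
        return NULL;
    }
    fw_view *view = controller->view;
    if (view == NULL) {
        fprintf(stderr, "[FlashyWrappers] view is NULL\n");
        return NULL;
    }
    return view;
}
```

The payoff is in the logs: the last message printed names the exact link that was missing, which matters when the only feedback loop is a client mailing device logs back.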

marchbold commented 1 year ago

How early are you calling this? Is it possible that the rootView is nil, i.e. before AIR has finished initialising the view? That would potentially result in a crash like this.

rainbowcreatures commented 1 year ago

Thanks again for your ideas guys.

I'm unable to repro as I don't have any physical iOS devices at the moment. I can start the process with my client remotely, where I give them a debug version to log some stuff, they send the logs back etc. So yeah I guess that's the only way forward now.

It works perfectly fine on my iOS simulator. Also, when they try my older ANE (built with old XCode / AIR etc. a couple of years back) it does work fine even on new iPhones. Though to be fair I'm not sure which AIR version they used to build it all together.

The init is called pretty late, usually you call it when you want to start recording a video from your app. Meaning, the app had to be initialized at that point, and usually it's tied to some button tap.

I haven't changed this part of the code for like 6 years and never had issues, so it's a bit puzzling (again).

rainbowcreatures commented 1 year ago

Otherwise I believe I'm using iPhone OS SDK 15, but I'll double check.

ajwfrost commented 1 year ago

If there's a simple app that can be used to test this, then I can build and run it on an iPhone here .. I've got iOS16 now :-(

Also, when they try my older ANE (built with old XCode / AIR etc. a couple of years back) it does work fine even on new iPhones.

So if we can find out whether they're using the same AIR SDK to build with this old ANE that works, as with the new ANE that crashes, it would help. I wondered whether there was an impact from the splash screen code i.e. the fact we're now using a launch storyboard, may cause changes in this area perhaps...? but if the same app and AIR runtime/SDK are behaving differently with the old vs new ANE libraries, then it's more likely to be some weird compatibility thing.

thanks

marchbold commented 1 year ago

All I can say is that we use [[[UIApplication sharedApplication] keyWindow] rootViewController] and the view property on that view controller regularly so I wouldn't expect that was the issue.

I wonder if it's actually the log statement crashing? ie. the conversion from UIView to %@?

EXC_BAD_ACCESS generally means you have some invalid memory reference. so like calling a method on a nil reference or something like that. Could also have to do with ARC?

rainbowcreatures commented 1 year ago

Hey guys, sorry for the delay again. Had some string of health issues, I hope I'm fine for now!

@marchbold That thought crossed my mind as well, after seeing the recent logs from my client. I've been in the process of sending an ANE with more detailed logs... I saw the first batch and the actual crash doesn't happen in the area of the view assignments. I'm still trying to find out what gets assigned there exactly (I did a lousy job of adding the logs while not feeling well, so I forgot to log rootView and logged just rootViewController). I sent another ANE to them with even more detailed logs today.

That conversion to %@ also caught my eye - I don't remember at all why I logged that and if it's the proper way to do it. I had little to no Obj-C programming knowledge when developing the ANE, I literally just googled what I thought I needed at the time so it's highly likely I could have done something irregular.

Now that I know it doesn't crash in the view assignments, yet I don't get that "Root view" log, it occurred to me that the log might be the actual point of the crash. The question now, of course, is whether it's crashing because the value I'm trying to log is wrong (which would be the bigger issue), and why this worked fine before.

Right now I'm waiting for another log batch - if I'm still unsure I might take you on that offer @ajwfrost , thanks :) Will try to create a simple test case and make my client confirm it's also crashing. Will make sure it's not crashing on iPad simulator too. Then I might send you that test. Condolences re iOS 16, I haven't been in that ecosystem for some time.

marchbold commented 1 year ago

%@ works for most objects but I'd be cautious with a view object and I prefer to be a little clearer about what's being logged than an arbitrary object. Definitely if it is nil it would crash there. Why it worked previously could just be that the internal conversion of that object to %@ may have changed or they may have decided it was incorrect and thrown an error.

Let us know how you go with your logging. And I hope your health improves.

rainbowcreatures commented 1 year ago

@marchbold Thanks, it's coming from my minimal knowledge of Obj-C and I just wanted to get things done at that time. No doubt many things I'm doing there are not super proper. The secondary question I'd have is, why is it (potentially) null all of a sudden? My health is fluctuating, right now it's better again, thanks.

So the latest is, I will share the project here as I'm still not really sure what is causing those crashes. After I've added some more logs it is crashing around the line where I've been actually trying to log *subviews assignment.

It's true that I did not try a variant where I'd erase all of the logs. But what really doesn't help is that I don't have a physical iOS device and it's just working perfectly fine on iOS simulator. My client is getting slightly tired of this (no wonder) so their last reply took them about 10 days. I'm sure it's all my fault but it would be really great if you could check it out @ajwfrost (since you generously offered, but of course anyone else can). Here's a zip archive:

https://www.dropbox.com/s/reg8fun9vatfboh/fw.zip?dl=0

There's bin/2.6/examples/exampleCamera/ which I've been using to test as a sort of minimal example, and which is confirmed to be crashing on my client's iOS devices (all of them), while the older xCode / AIR builds didn't crash (except on iPad and we know why now). It contains a .fla file which I haven't opened for quite a long time but which hopefully should work. It also contains a command line script (for OS X) to build & launch the example. Because of its age, there were some minor issues but it works now for the iOS simulator. I couldn't check the real device iOS branch.

I also included xCode project files for the ANE(Apple/xCode) plus the command line script I'm using to build the ANE (buildane_Apple).

In xCode/ANEGeneric/FWVideoEncoder.mm, line 649 onwards (you know this method from screenshots) is where the crash occurs.

The way I build the ane:

_buildane_Apple full <iOS/iOSsimulator>

You might need to modify some obvious variables, like the AIR SDK path inside the script. This builds the ANE into iOS/ane/ or iOS_simulator/ane. Then I copy this into bin/2.6/lib/AIR/ (a step I could have automated for sure) so the example uses the latest. And then I proceed with launching the exampleCamera.

But hey, maybe you won't need any of that as it will become clear what I messed up just by opening the xCode project :) Also FYI I know the platform.xml has outdated sdkVersion and ios_version_min. I tried to set both to 15, rebuild the ane and send it to my client, but it didn't help. Right now I'm hoping that you might see what's wrong. If not, I might need to resort to downgrading xCode, AIR etc. to the old versions I used last time (pre iOS 15 SDK) and rebuild. Though that doesn't seem like a long term solution. This SDK is otherwise pretty much dead, I'm not selling or maintaining it (last sale was like last year). Just trying to support those couple of clients who bought it last year and are still stuck with it.

If you need to clarify anything please let me know, thanks!

ajwfrost commented 1 year ago

Hi @rainbowcreatures .. thanks for that. I downloaded the zip file and managed to build it in Xcode. A few minor updates were needed to get it to build on the latest Xcode version, mostly just around the fact it's arm64 only. Then the example application compiled and ran fine.. it went past that trace (I updated it to ensure it was my build that was being executed!) and everything seemed to work okay. Lots of frames being added at time 0.0000 though, don't know if that's just a trace thing because in my phone's Photos app I can see the snippets of video that I took (with the "webcam capture demo" borders above/below the actual video).

So this all looks good to me.... built with latest Xcode and macOS versions, iOS 16 SDK and running on iPhone 11 with 16.0.2.

Curious... I can send you the ANE build with this if you think it would help?

thanks

rainbowcreatures commented 1 year ago

Hey @ajwfrost thanks for spending your time on that! Sounds amazing. Please definitely do send me the ANE, I'll forward it to my client and see what their results are.

I'm not sure about those frames. It's odd that it works fine for you of course, without any changes...other than the minor updates, but I guess as you said all related to the fact that you had to update to the latest xCode.

rainbowcreatures commented 1 year ago

Hello @ajwfrost, just pinging again, not sure if you noticed my previous message. Thanks!