artsy / eigen

The Art World in Your Pocket or Your Trendy Tech Company's Tote, Artsy's mobile app.
MIT License
3.54k stars 573 forks source link

App launch time increased #586

Closed alloy closed 8 years ago

alloy commented 9 years ago

The LaunchImage is shown way longer than before. I wonder if this could be DYLD loading all the frameworks.

alloy commented 9 years ago

Ouch, yeah, 4.3 sec to load all images :sob:

total time: 4.6 seconds (100.0%)
total images loaded:  290 (235 from dyld shared cache)
total segments mapped: 162, into 2414 pages with 312 pages pre-fetched
total images loading time: 4.3 seconds (93.8%)
total dtrace DOF registration time: 0.14 milliseconds (0.0%)
total rebase fixups:  91,464
total rebase fixups time: 55.97 milliseconds (1.2%)
total binding fixups: 261,574
total binding fixups time: 111.74 milliseconds (2.3%)
total weak binding fixups time: 0.33 milliseconds (0.0%)
total bindings lazily fixed up: 0 of 0
total initializer time: 118.93 milliseconds (2.5%)
    libSystem.B.dylib : 76.47 milliseconds (1.6%)
libBacktraceRecording.dylib : 0.57 milliseconds (0.0%)
       libc++.1.dylib : 0.05 milliseconds (0.0%)
      libobjc.A.dylib : 0.03 milliseconds (0.0%)
       CoreFoundation : 0.87 milliseconds (0.0%)
               vImage : 0.01 milliseconds (0.0%)
     libGLImage.dylib : 0.14 milliseconds (0.0%)
libFosl_dynamic.dylib : 0.02 milliseconds (0.0%)
            CoreImage : 0.01 milliseconds (0.0%)
        CoreTelephony : 0.01 milliseconds (0.0%)
     HockeySDK_Source : 0.56 milliseconds (0.0%)
                Artsy : 30.45 milliseconds (0.6%)
total symbol trie searches:    87386
total symbol table binary searches:    0
total images defining weak symbols:  15
total images using weak symbols:  46
alloy commented 9 years ago

Looks like we’ll have to enable prebinding :sob: The images should have a vmaddr other than 0 and a unique one for each.

For more info see https://developer.apple.com/library/ios/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/FrameworkBinding.html#//apple_ref/doc/uid/20002256-106894

otool -l /Users/eloy/Library/Developer/Xcode/DerivedData/Artsy-dvyztngmkwpjoxercvsdsplrzfzz/Build/Products/Debug-iphoneos/AFNetworking.framework/AFNetworking
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 872
  segname __TEXT
   vmaddr 0x0000000000000000
   vmsize 0x0000000000030000
  fileoff 0
 filesize 196608
  maxprot 0x00000005
 initprot 0x00000005
   nsects 10
    flags 0x0
otool -l /Users/eloy/Library/Developer/Xcode/DerivedData/Artsy-dvyztngmkwpjoxercvsdsplrzfzz/Build/Products/Debug-iphoneos/AFHTTPRequestOperationLogger.framework/AFHTTPRequestOperationLogger 
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 792
  segname __TEXT
   vmaddr 0x0000000000000000
   vmsize 0x0000000000008000
  fileoff 0
 filesize 32768
  maxprot 0x00000005
 initprot 0x00000005
   nsects 9
    flags 0x0
[SNIP]
alloy commented 9 years ago

There’s also the option of lazy loading the frameworks, i.e. on demand, but that’s not automatable.

alloy commented 9 years ago

This prints out the list of vmaddrs of the frameworks in the app after letting DYLD figure it out without prebinding, which could be used to assign the -seg1addr flags to the corresponding framework’s OTHER_LDFLAGS:

#import <mach-o/dyld.h>
#import <inttypes.h>
static void
PrintImageLoadAddresses(void)
{
    const char *frameworks_path = [[[[NSBundle mainBundle] bundlePath] stringByAppendingPathComponent:@"Frameworks"] UTF8String];
    unsigned long frameworks_path_length = strlen(frameworks_path);
    for (uint32_t i = 0; i < _dyld_image_count(); i++) {
        const char *image_path = _dyld_get_image_name(i);
        if (strncmp(image_path, frameworks_path, frameworks_path_length) == 0) {
            NSString *frameworkName = [[NSString stringWithUTF8String:image_path] lastPathComponent];
            intptr_t slide = _dyld_get_image_vmaddr_slide(i);
            printf("0x%" PRIxPTR ": %s\n", slide, [frameworkName UTF8String]);
        }
    }
}
    0x100474000: AFHTTPRequestOperationLogger
    0x100494000: AFNetworking
    0x1004f4000: AFOAuth1Client
    0x100524000: ALPValidator
    0x100560000: ARASCIISwizzle
    0x100588000: ARAnalytics
    0x1005c4000: ARCollectionViewMasonryLayout
    0x1005f0000: ARGenericTableViewController
    0x100610000: ARTiledImageView
    0x100630000: Analytics
    0x100690000: Artsy_UIButtons
    0x1006b0000: Artsy_UIColors
    0x1006d8000: Artsy_UIFonts
    0x1006f8000: Artsy_UILabels
    0x100718000: Bolts
    0x10076c000: CocoaLumberjack
    0x1007d0000: DHCShakeNotifier
    0x1007f8000: DRKonamiCode
    0x100828000: EDColor
    0x100858000: FBSDKCoreKit
    0x100944000: FBSDKLoginKit
    0x100998000: FLKAutoLayout
    0x1009b0000: FODFormKit
    0x100a10000: FXBlurView
    0x100a34000: HockeySDK_Source
    0x100b9c000: ISO8601DateFormatter
    0x100bcc000: InterAppCommunication
    0x100bf0000: JLRoutes
    0x100c10000: JSBadgeView
    0x100c34000: JSDecoupledAppDelegate
    0x100c5c000: KSDeferred
    0x100c74000: Keys
    0x100c88000: MMMarkdown
    0x100cbc000: Mantle
    0x100d08000: MultiDelegate
    0x100d20000: NAMapKit
    0x100d60000: ORKeyboardReactingApplication
    0x100d90000: ORStackView
    0x100db4000: ObjectiveSugar
    0x100de4000: RSSwizzle
    0x100e0c000: ReactiveCocoa
    0x100f10000: SDWebImage
    0x100f7c000: TPDWeakProxy
    0x100f8c000: TRVSDictionaryWithCaseInsensitivity
    0x100fb8000: TSMiniWebBrowser_dblock
    0x100fdc000: UIAlertView_Blocks
    0x101008000: UICKeyChainStore
    0x101028000: UIView_BooleanAnimations
    0x101048000: VCRURLConnection
    0x101084000: iRate
    0x1010c0000: libextobjc
alloy commented 9 years ago

I set the pre bindings with the following Podfile post install hook, but either I’m doing something wrong or prebinding makes no difference in load time :-/

post_install do |installer|
  vmaddrs = <<-EOS
    0x100474000: AFHTTPRequestOperationLogger
    0x100494000: AFNetworking
    0x1004f4000: AFOAuth1Client
    0x100524000: ALPValidator
    0x100560000: ARASCIISwizzle
    0x100588000: ARAnalytics
    0x1005c4000: ARCollectionViewMasonryLayout
    0x1005f0000: ARGenericTableViewController
    0x100610000: ARTiledImageView
    0x100630000: Analytics
    0x100690000: Artsy_UIButtons
    0x1006b0000: Artsy_UIColors
    0x1006d8000: Artsy_UIFonts
    0x1006f8000: Artsy_UILabels
    0x100718000: Bolts
    0x10076c000: CocoaLumberjack
    0x1007d0000: DHCShakeNotifier
    0x1007f8000: DRKonamiCode
    0x100828000: EDColor
    0x100858000: FBSDKCoreKit
    0x100944000: FBSDKLoginKit
    0x100998000: FLKAutoLayout
    0x1009b0000: FODFormKit
    0x100a10000: FXBlurView
    0x100a34000: HockeySDK_Source
    0x100b9c000: ISO8601DateFormatter
    0x100bcc000: InterAppCommunication
    0x100bf0000: JLRoutes
    0x100c10000: JSBadgeView
    0x100c34000: JSDecoupledAppDelegate
    0x100c5c000: KSDeferred
    0x100c74000: Keys
    0x100c88000: MMMarkdown
    0x100cbc000: Mantle
    0x100d08000: MultiDelegate
    0x100d20000: NAMapKit
    0x100d60000: ORKeyboardReactingApplication
    0x100d90000: ORStackView
    0x100db4000: ObjectiveSugar
    0x100de4000: RSSwizzle
    0x100e0c000: ReactiveCocoa
    0x100f10000: SDWebImage
    0x100f7c000: TPDWeakProxy
    0x100f8c000: TRVSDictionaryWithCaseInsensitivity
    0x100fb8000: TSMiniWebBrowser_dblock
    0x100fdc000: UIAlertView_Blocks
    0x101008000: UICKeyChainStore
    0x101028000: UIView_BooleanAnimations
    0x101048000: VCRURLConnection
    0x101084000: iRate
    0x1010c0000: libextobjc
  EOS

  vmaddrs = Hash[*vmaddrs.split("\n").flat_map { |line| line.match(/(0x\w+): (.+)/)[1..2].reverse }]

  installer.pods_project.targets.each do |target|
    path = target.product_reference.path
    if File.extname(path) == '.framework'
      name = File.basename(path, '.framework')
      if vmaddr = vmaddrs[name]
        target.build_configurations.each do |config|
          config.build_settings['OTHER_LDFLAGS'] = "$(inherited) -seg1addr #{vmaddr}"
        end
      else
        puts "[!] No vmaddr found for: #{name}"
      end
    end
  end
end
total time: 4.4 seconds (100.0%)
total images loaded:  290 (235 from dyld shared cache)
total segments mapped: 162, into 2414 pages with 312 pages pre-fetched
total images loading time: 4.2 seconds (94.8%)
total dtrace DOF registration time: 0.14 milliseconds (0.0%)
total rebase fixups:  91,470
total rebase fixups time: 29.95 milliseconds (0.6%)
total binding fixups: 261,576
total binding fixups time: 94.93 milliseconds (2.1%)
total weak binding fixups time: 0.33 milliseconds (0.0%)
total bindings lazily fixed up: 0 of 0
total initializer time: 107.66 milliseconds (2.3%)
    libSystem.B.dylib : 72.95 milliseconds (1.6%)
libBacktraceRecording.dylib : 0.58 milliseconds (0.0%)
       libc++.1.dylib : 0.05 milliseconds (0.0%)
      libobjc.A.dylib : 0.03 milliseconds (0.0%)
       CoreFoundation : 0.83 milliseconds (0.0%)
               vImage : 0.01 milliseconds (0.0%)
     libGLImage.dylib : 0.14 milliseconds (0.0%)
libFosl_dynamic.dylib : 0.02 milliseconds (0.0%)
            CoreImage : 0.01 milliseconds (0.0%)
        CoreTelephony : 0.01 milliseconds (0.0%)
     HockeySDK_Source : 0.54 milliseconds (0.0%)
                Artsy : 19.80 milliseconds (0.4%)
total symbol trie searches:    87387
total symbol table binary searches:    0
total images defining weak symbols:  15
total images using weak symbols:  46
alloy commented 9 years ago

The timing info can be seen by setting both the DYLD_PRINT_APIS and the DYLD_PRINT_STATISTICS env variables. You’d expect just the latter, but it only seems to get printed when combined with the former. The timing info will be in the middle of all the API usage.

https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/dyld.1.html

OliverLetterer commented 9 years ago

@alloy I'm just curious because I recently reevaluated CocoaPods dynamic frameworks vs. statically linking all of them in one shared dynamic framework. Sry if this has already been answered or discussed anywhere else. Is this tested with or without an attached debugger? My Podfile looks like this

source "git@github.com:Sparrow-Labs/Specs.git"
source "https://github.com/CocoaPods/Specs.git"

use_frameworks!

def framework_dependencies
    pod "AFNetworking", "2.5.3"
    pod "MBProgressHUD", "~> 0.9.1"
    pod "SGKeychain", "~> 1.1.0"
    pod "KissXML", "~> 5.0.0.2"
    pod "CCVFly", "~> 0.3.3"
    pod "SWFSemanticVersion", "~> 0.3.1"

    pod "SHAlertViewBlocks", "~> 1.2.0"
    pod "SHBarButtonItemBlocks", "~> 1.2.0"
    pod "SHControlBlocks", "~> 1.2.0"
    pod "SHControlBlocks", "~> 1.2.0"
    pod "SHKeyValueObserverBlocks", "~> 2.0.0"

    pod "CloudBridge", "~> 1.2.0"
    pod "CBRRESTConnection", "~> 1.2.0"
    pod "CBRRelationshipResolver", "~> 1.0.0"

    pod "CTOpenSSLWrapper", "~> 1.3.0"
    pod "Flow", "~> 1.6.1"
    pod "SPLBadgeView", "~> 1.0.0"
    pod "SLObjectiveCRuntimeAdditions", "~> 1.1.0"
    pod "SLScrollViewKeyboardSupport", "~> 1.1.0"
    pod "SPLSpeechBubblePopoverController", "~> 2.1.3"
    pod "SPLTextFieldChain", "~> 1.0.0"
    pod "SPLTransparentDropShadowView", "~> 1.0.0"
    pod "SPLFoundation", "~> 1.4.2"
    pod "SPLMobilePrinting", "~> 2.4.4"
    pod "SPLOptionsViewController", "~> 1.1.2"
    pod "SPLStickyHeaderTableViewController", "~> 0.1.3"
    pod "SPLFormular", "~> 0.8.1"

    pod "MultipeerCommunication", "~> 0.4.0"
    pod "MCSwipeTableViewCell", "~> 2.1.2"
end

target "CocoaPodsDemo" do
    framework_dependencies

    pod "pop", "~> 1.0.0", :inhibit_warnings => true
    pod "Shimmer", "~> 1.0.2"
    pod "CDZQRScanningViewController", "~> 1.0.7.3"

    pod "UITextField+SPLAutoCompletion", "~> 1.0.0"
    pod "UIAlertControllerConvenienceAPI", "~> 0.2.0"

    pod "SPLPing", "~> 1.1.1"
    pod "SPLWindow", "~> 1.3.1"
    pod "SPLInlayLabel", "~> 1.1.0"
    pod "SPLMobilePayment", "~> 1.2.0"
    pod "SPLRoundImageCache", "~> 1.0.0"
    pod "SPLPagesViewController", "~> 1.0.0"
    pod "SPLTiltedTabsViewController", "~> 1.0.0"
    pod "SPLZoomingViewControllerTransition", "~> 1.1.2"
    pod "SPLBlurredBackgroundViewController", "~> 1.0.0"
    pod "SPLTableViewRowReorderGestureRecognizer", "~> 1.0.0"
end

target "DynamicFramework" do
    framework_dependencies
end

which results in 52 CocoaPods targets. Launch time on my end with a basic demo app with debugger is ~10s. Without debugger <1s which is reasonably fast enough for my use case.

orta commented 9 years ago

I noticed this initially on a device beta which doesn't have a debugger attached, though I think it's definitely worth a look.

johnno1962 commented 9 years ago

@alloy, got the "total images loading time” down from 2.5 seconds to 0.9 seconds by statically linking all frameworks’ object flles into the app. Add the following as a "run script" before the app is linked:

ls -t $OBJROOT/Pods.build/$CONFIGURATION$EFFECTIVE_PLATFORM_NAME/Pods-*.build/Objects-normal/$CURRENT_ARCH/*.o | egrep -v '/(_?FBSDK|BF|BIT)|Dummy' >$OBJROOT/Pods.build/$CONFIGURATION$EFFECTIVE_PLATFORM_NAME.filelist

and replace all -framework arguments in the .xcconfig’s OTHER_LDFLAGS with -filelist $OBJROOT/Pods.build/$CONFIGURATION$EFFECTIVE_PLATFORM_NAME.filelist

This leaves one pod "Artsy+OSSUIFonts” which uses [NSBundle bundleForClass:self] to locate resources in the pod which can be added to the main project now that ARFontLoader is in the main bundle.

alloy commented 9 years ago

@OliverLetterer I think that might actually be it. It’s fast without a debugger, so possibly @orta was running into another issue and it appeared to me to be this. Not sure, will have to take another look tomorrow when I’m sharper.

alloy commented 9 years ago

@johnno1962 That’s an impressive command! But I wasn’t looking for static linking, we can do that simply by omitting use_frameworks! from the Podfile, the point was more about finding a solution for dynamic linking.

johnno1962 commented 9 years ago

Oops, shows how much I know... Works for Swift micro-frameworks though so could be handy for inlining frameworks in future.

alloy commented 9 years ago

@johnno1962 Good point! And thanks for taking the time anyways, there can never be enough knowledge sharing ;)

alloy commented 8 years ago

Just an update, the performance has been getting better over the betas, and without a debugger attached the impact is only really noticeable the first time an app is launched. Still something that I’d like to improve, but not a blocker for us anymore.

stepanhruda commented 8 years ago

We're currently investigating this as well.

If the system runs low on memory, it can purge the dynamic frameworks out of the shared cache, so linking can potentially take a long time more often than just the first time the app is launched.

@alloy Did you have any possible approaches in mind I could look into, other than prebinding?

stepanhruda commented 8 years ago

I'm looking into lazy loading, but so far all documentation seems to say that symbols are loaded lazily by default unless you use -bind_at_load when generating the library. Another possible cause could using -load_all in the executable, but we don't do either.

orta commented 8 years ago

Could CocoaPods be sneaking in a load_all?

neonichu commented 8 years ago

CP itself doesn't do it, but there are tons of Pods which specify -all_load

alloy commented 8 years ago

@stepanhruda Nope, I would probably go down the same avenue you are currently going. The point about the shared cache being purged is a very good one and might have been a thing why I noticed it less at some point, i.e. purely because I had more free memory. Please keep me posted, as this probably means I’d have to look at it soon again as well.

stepanhruda commented 8 years ago

I did run a git grep load_all git grep all_load, but didn't see it actually being passed to the compiler anywhere (although documentation for some Pods mentions it). I wonder if -ObjC could be causing this, although there's no way we could drop that since it loads Objective-C categories.

I tried using DYLD_PRINT_BINDINGS, which could tell me which symbols are actually bound lazily, but that takes forever since it literally prints every single symbol the executable is linked against.

I'm going to do debug dyld to see what the time breakdown for loading all library images looks like.

stepanhruda commented 8 years ago

Prebinding seems to be abandoned since OS X 10.4, so I'm not going to spend time on it, it seems that it did not work for @alloy above anyway.

orta commented 8 years ago

Same FWIW:

~/d/i/a/eigen (master) ⏛  grep -r all_load .
./Pods/Expecta/README.md:6. Add `-ObjC` and `-all_load` to the **Other Linker Flags** build setting for the test target in your Xcode project.
./Pods/Specta/README.md:6. Add `-ObjC` and `-all_load` to the "Other Linker Flags" build setting for the test target in your Xcode project.
alloy commented 8 years ago

@stepanhruda It’s probably a good idea to first verify that lazy loading is actually on by default, on iOS with Objective-C class+categories containing libs, by creating a clean slate project and manually adding a few frameworks.

(So no CP.)

stepanhruda commented 8 years ago

Also, you guys have this tagged iOS 9, but I'm currently compiling against iOS 8 running on an iOS 8 device and seeing long linking times, so I don't think it's necessarily related.

alloy commented 8 years ago

@stepanhruda That’s true, it was more of a conflation with our frameworks+iOS 9 sprint. I’ll remove it to avoid further confusion.

stepanhruda commented 8 years ago

I wonder if dyld isn't in fact what's taking that long? Here's a console output:

Wed Sep 23 10:44:25 EDT 2015
_dyld_register_func_for_add_image(0x399acda1)
Wed Sep 23 10:44:25 EDT 2015
_dyld_register_func_for_add_image(0x39939c95)
_dyld_get_image_slide(0x2c000)
_dyld_register_func_for_remove_image(0x39939ce1)
_dyld_register_func_for_remove_image(0x39392dd9)
dyld_register_image_state_change_handler(40, 1, 0x39392c51)
dyld_image_path_containing_address(0x24c1000)
Wed Sep 23 10:44:26 EDT 2015
dlopen(/usr/lib/system/introspection/libdispatch.dylib, 0x00000010)
  dlopen(/usr/lib/system/introspection/libdispatch.dylib) ==> 0x1fec93a4

# Omitted, a ton of other images linked here

Wed Sep 23 10:44:29 EDT 2015
dlopen(/System/Library/Frameworks/UIKit.framework/UIKit, 0x00000010)
  dlopen(/System/Library/Frameworks/UIKit.framework/UIKit) ==> 0x1fec7224
dlsym(0x1fec7224, UIApplicationDidEnterBackgroundNotification)
dlsym(0x1fec7224, UIApplicationWillEnterForegroundNotification)
total time: 20.2 seconds (100.0%)
total images loaded:  299 (231 from dyld shared cache)
total segments mapped: 201, into 6675 pages with 532 pages pre-fetched
total images loading time: 14.6 seconds (71.9%)

Although the reported time is 20.2 seconds, the wall clock (provided by shell's date) seems to have only advanced by 4 seconds since the first dyld function call.

stepanhruda commented 8 years ago

Diving into dyld to see what total images loading time actually measures:

uint64_t t0 = mach_absolute_time();
this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths);
context.notifyBatch(dyld_image_state_dependents_mapped);

// we only do the loading step for preflights
if ( preflightOnly )
    return;

uint64_t t1 = mach_absolute_time();

// [...]

fgTotalLoadLibrariesTime += t1 - t0;

// [...]

printTime("total images loading time", fgTotalLoadLibrariesTime, totalTime);
stepanhruda commented 8 years ago

I think the slowness is happening inside ImageLoader::recursiveLoadLibraries, but I can't seem to hit any symbols to profile this method further :cry:

Right now I'm thinking the library images get loaded with the image of the executable, before Xcode attaches lldb. Perhaps I'd need to jailbreak the device and do even deeper hackery to profile this.

orta commented 8 years ago

Some notes from @samdmarshall

I'd file a new radar on it -- dyld will have to allocate the same amount of memory for paging regardless of frameworks or you linking it all statically

so it seems like a problem with the page allocation not being performant

on iOS i don't know if you can without jailbreaking

you can look at the imp of dyld on opensource.apple.com

the only additional difference would be that you need to open more file descriptors with frameworks that if you were to statically link

so your limitations are file IO speed to open then map the file to a memory page (and allocation of said page) happening repeatedly rather than once

which might be a problem with dyld in general

so the system resources: file IO to access the file, allocate the page, load it to the page. then the software time to map the binary, and repeat the process of resolving dependencies and then doing symbol mapping for that in the global symbol table

orta commented 8 years ago

And just to ensure all context is in one place, @marcelofabri filed a DTS on it this here - https://github.com/artsy/eidolon/issues/491#issuecomment-136402233

alloy commented 8 years ago

@stepanhruda Can you possibly paste a brain dump of how you arrived at the idea of where the problem occurs?

I’m gonna try to make some time next week to dive into this issue again as well.

stepanhruda commented 8 years ago

My line of thought right now:

  1. Based on DYLD_PRINT_STATISTICS, we know that total images loading time is what hangs for so long. In everyone's provided outputs, it covers 70-90% of dyld's reported running time.
  2. Looking through dyld's source, total images loading time simply prints the value of fgTotalLoadLibrariesTime.
  3. fgTotalLoadLibrariesTime measures only a handful of synchronous calls, by far the most suspicious being ImageLoader::recursiveLoadLibraries.
  4. In ImageLoader::recursiveLoadLibraries every library is loaded through context.loadLibrary, which I believe is a function pointer to load (my C++ skills are nonexistent), which looks up the frameworks in @rpath.
  5. Going back, ImageLoader::recursiveLoadLibraries does what the name suggests – it traverses the whole dependency graph and loads all library images. I'm not able to profile which part is slow, because all of this happens before lldb gets attached, but I have a couple of ideas what is slow:
    • if the traversal is generally efficient, disk access to a couple of dozen library images could be slow enough, as @orta & @samdmarshall suggests above
    • traversing the dependency graph can at worst be an O(N^2) operation – if it happens that there is an inefficiency (unnecessary disk access) or e.g. if checking whether an image is already loaded is expensive, the cost quickly grows out of proportions

I also successfully copied symbols into the app executable as @johnno1962 suggested above, which cut the load time down by 400%. But I think he mentioned this didn't work for him in production builds in the end?

I gotta say I'm slightly out of ideas for how one could successfully ship an executable with tons of microframeworks.

samdmarshall commented 8 years ago

I believe the images are checked if loaded by string comparison on the install path of the library. I would want to investigate if the repeated page creation was the expensive part, or if the libraries being loaded were not aligned with page size and this incurring a higher cost.

As mentioned to alloy and orta earlier, I would want to investigate if this is a platform problem (page allocation and perf) or a dyld problem (symbol and dependency resolution) before trying to dive into the incomplete source code provided by Apple further.

johnno1962 commented 8 years ago

As a suggestion, I did toy with the idea of using static linking along with reording the classes of a project to be the order in which they are used (determined by producing an old style profile file by swizzling +alloc). Flash drives are random access anyway so this wouldn’t be a big gain unless you have an app with facebook scale but paging would at least be more orderly. Question is how big is a class relative to page size. If Facebook is 13000 classes in 100 meg you can work it out...

stepanhruda commented 8 years ago

@johnno1962 I'm not sure I understand. I did succeed at moving the symbols into the final executable as you suggest in Accelerator. I thought this didn't work for you because in the end dyld decided to load the library images anyway. Is there another issue?

Other than that, I did see a massive drop in image load time after I copied all the symbols over before building the final binary.

johnno1962 commented 8 years ago

It’s Good news for Accelerator you did manage to observe a drop in load times statically linking! A couple of people observed in https://github.com/artsy/eidolon/issues/491#issuecomment-136402233 that this was likely related to the debugger loading symbols more than anything and less of an issue outside the debugger. My comment above was making a further suggestion to reorder the classes in the project according to when they are used but I doubt this would make any difference unless you have a very big executable with very small classes.

brianmichel commented 8 years ago

Would love to help if anyone needs someone to investigate anything. We're experiencing similar issues with our app.

stepanhruda commented 8 years ago

I am going to experiment with copying all symbols into the app executable today and report back.

alloy commented 8 years ago

@stepanhruda Btw, for better profiling, I was going to look at it by creating an empty application, embed all of your real frameworks but only link them dynamically at runtime. Then one should hopefully be able to profile the full loading experience.

stepanhruda commented 8 years ago

Good news (hopefully, fingers crossed).

I managed to manually deploy a proof-of-concept enterprise build, using the hack @johnno1962 suggests in https://github.com/johnno1962/Accelerator, and it seems to resolve the immediate issue, although it for sure shouldn't be a long-term solution. Here is my next course of action:

  1. Integrate the hack into our automatic build pipeline, so we can test it in beta builds (of a private repo).
  2. If beta builds perform well, submit to iTunes Connect to verify apps doing this don't get rejected.
  3. Abstract the solution so other developers facing the same issue get unblocked. Right now my plan is to ship a cocoapods plugin.
  4. Investigate manual dyld image loading after launch as @alloy suggested above. File a radar.
stepanhruda commented 8 years ago

Still waiting on my workaround to be approved or rejected by TestFlight review.

In the meantime I created an example project that makes the issue dead simple to reproduce and wrote a (hopefully) comprehensible summary: https://github.com/stepanhruda/dyld-image-loading-performance

Any feedback is appreciated, will be filing a radar soon.

orta commented 8 years ago

http://opensource.apple.com/release/ios-90/ <- new source for dyld might be in here

stepanhruda commented 8 years ago

I was investigating the source through https://github.com/opensource-apple/dyld

(You can't run & debug dyld that iOS uses without jailbreak though.)

samdmarshall commented 8 years ago

The github isn't reliably up to date with the latest sources. (also, dyld only gets published as part of OS X packages)

fluidsonic commented 8 years ago

Our app with 62 dynamic frameworks (73 binaries when including libSwift* and the app itself) also suddenly takes 3-4 seconds to load on iOS 8 and 9. When looking for the cause we found this issue.

Here our observations:

So this looks a lot like an issue with Code Signing :(

alloy commented 8 years ago

@fluidsonic Oh, nice sleuthing! Are you going to do something like request a DTS by any chance?

fluidsonic commented 8 years ago

@alloy what is a DTS? :)

I filed an additional radar 22948371.

samdmarshall commented 8 years ago

This has to do with system integrity and verification of everything being run. You won't find that there is a way around this for complex projects. A thought I had the other day was to see if this problem persists if you are using tree based dependencies rather than a flat list. +20 libraries is a lot for the app to link directly, so if that was changed to be linked to reduce the number of dependencies per node of the overall graph if it would make this a bit better.

alloy commented 8 years ago

@fluidsonic It’s ‘Developer Technical Support’, which is something you only get to do so many times a year, so it’s nice to coordinate and not waste any :) https://developer.apple.com/support/technical/

Having said that, I just remembered that @marcelofabri has requested one, maybe he could follow-up on that one, as the answer he got did not cover any of this: https://github.com/artsy/eidolon/issues/491#issuecomment-136402233

fluidsonic commented 8 years ago

@alloy ah, sure! Never needed them the past 6 years but this is a good case to request a ticket. I also think it makes sense to wait for a follow-up of @marcelofabri's ticket first.