promotedai / ios-metrics-sdk

iOS client library for Promoted.ai metrics tracking.
MIT License

Support MobileDiagnostics and more detailed on-device diagnostics #140

Closed yunapotamus closed 3 years ago

yunapotamus commented 3 years ago

In order to support diagnostics, several large changes were made:

  1. Dropped support for Swift 5.2/Xcode 11.
  2. Added a Deque data type for use as a cache/buffer.
  3. Expanded Xray/OSLog support to include many more details and fine-grained control of logging activity.
  4. Created TabularLogFormatter to format data as text tables in log output.
  5. Added support for logging mobile diagnostics.

These changes are discussed below.

Dropping Swift 5.2/Xcode 11

The consequence of this is that clients will no longer be able to use Xcode 11 to build our library. No current users of the library are on Xcode 11, and we don't anticipate the need to support this version of Xcode in the future. Additionally, this does not affect our ability to target older versions of iOS, and we remain compatible with iOS versions as far back as iOS 10 (four versions behind as of this writing).

Some factors that went into this decision:

Deque

Having hit yet another need for a bounded buffer, I introduced a Deque class, since Swift 5.3 does not offer a built-in one. It's currently backed by an Array, which has O(n) complexity for popFront operations, but we could easily swap in a different backing store if we need better performance. (In practice, this performance degradation doesn't show up until the buffer exceeds ~50 elements, and our buffers are much smaller.)

image

(Image credit to Karoy Lorentey of the Swift.org blog)

Some alternatives considered, and why we didn't choose them:

  1. Use Swift Collections Deque. While it's a robust implementation, Swift Collections doesn't support CocoaPods, which many of our clients use.
  2. Use another Deque implementation from the internet. It's hard to find an implementation that supports both Swift PM and CocoaPods. There are also security concerns with using third-party code.
  3. Stick with built-in Array type. This would require us to duplicate the bounding logic in many places, and it's too easy for users to get this wrong. You can't subclass Swift's Array because it's a struct and not a class.
  4. Use Objective-C's NSCache or a similar data structure. This would only work with Objective-C-compatible objects, and not Swift structs or primitive types, which means that it can't hold Swift Protobufs (they are structs).
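
To make the idea of an Array-backed bounded buffer concrete, here is a minimal sketch. It is not the library's Deque: the name `BoundedDeque`, its method names, and the policy of evicting the oldest element when the bound is exceeded are all assumptions made for illustration.

```swift
/// Hypothetical sketch of an Array-backed bounded deque.
/// Names and eviction policy are assumptions, not the SDK's API.
final class BoundedDeque<Element> {
  private var storage: [Element] = []
  let maximumSize: Int

  init(maximumSize: Int) {
    precondition(maximumSize > 0, "maximumSize must be positive")
    self.maximumSize = maximumSize
  }

  var count: Int { storage.count }
  var elements: [Element] { storage }

  /// Appends to the back. If the bound is exceeded, the oldest
  /// elements are dropped from the front (assumed policy).
  func pushBack(_ element: Element) {
    storage.append(element)
    if storage.count > maximumSize {
      // O(n) on an Array, which is acceptable for small buffers.
      storage.removeFirst(storage.count - maximumSize)
    }
  }

  /// Removes and returns the front element, or nil if empty. O(n).
  @discardableResult
  func popFront() -> Element? {
    storage.isEmpty ? nil : storage.removeFirst()
  }

  /// Removes and returns the back element, or nil if empty. O(1).
  @discardableResult
  func popBack() -> Element? {
    storage.popLast()
  }
}

// Usage: the bound is enforced in one place instead of at every call site.
let recentIDs = BoundedDeque<String>(maximumSize: 3)
for id in ["a", "b", "c", "d"] {
  recentIDs.pushBack(id)
}
// recentIDs.elements is now ["b", "c", "d"]
```

Keeping the bounding logic inside the type is the point of alternative 3 above: callers cannot forget to trim the buffer.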

Expanded Xray/OSLog control

As stated in #130, we expanded the control over Xray and OSLog to allow users to select how much data is logged. This is done with enums:

public enum XrayLevel: Int, Comparable {
  // Don't gather any Xray stats data at all.
  case none = 0
  // Gather overall counts for the session for each batch.
  // e.g. batches: 40, batches sent successfully: 39, errors: 1
  case batchSummaries = 1
  // Gather stats and logged messages for each call made
  // to the metrics library.
  case callDetails = 2
  // Gather stats and logged messages for each call made
  // to the metrics library, as well as stack traces where the
  // calls were made.
  case callDetailsAndStackTraces = 3
}

public enum OSLogLevel: Int, Comparable {
  /// No logging for anything.
  case none = 0
  /// Logging only for errors.
  case error = 1
  /// Logging for errors and warnings.
  case warning = 2
  /// Logging for info messages (and above).
  case info = 3
  /// Logging for debug messages (and above).
  case debug = 4
}

For Xray, the motivation was that the mobile diagnostics feature does not need all call site data to be stored. Mobile diagnostics depends only on Xray's batch summaries.

For OSLog, we had promised features to inspect more of the data that we're sending out. The support for logging levels allowed us to output very detailed information about the data we're sending without overwhelming users who need less fine-grained logging. An example of this detailed information is below.

TabularLogFormatter

When logLevel >= .debug and xrayLevel >= .callDetails, we will log verbose details of the operations and messages contained in each batch. This offers complete transparency of our logging activity as well as a powerful debugging tool.
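
The `>=` checks above rely on the level enums being Comparable. Because both enums carry Int raw values, one simple way to provide the ordering is to compare raw values. The sketch below assumes that approach; the `<` implementations and the `shouldLogBatchDetails` helper are illustrative and may differ from what the library actually does.

```swift
public enum XrayLevel: Int, Comparable {
  case none = 0
  case batchSummaries = 1
  case callDetails = 2
  case callDetailsAndStackTraces = 3

  // Assumed implementation: order levels by raw value.
  public static func < (lhs: XrayLevel, rhs: XrayLevel) -> Bool {
    lhs.rawValue < rhs.rawValue
  }
}

public enum OSLogLevel: Int, Comparable {
  case none = 0
  case error = 1
  case warning = 2
  case info = 3
  case debug = 4

  // Assumed implementation: order levels by raw value.
  public static func < (lhs: OSLogLevel, rhs: OSLogLevel) -> Bool {
    lhs.rawValue < rhs.rawValue
  }
}

/// Hypothetical helper mirroring the condition described above:
/// verbose batch tables are logged only at or above both thresholds.
func shouldLogBatchDetails(osLogLevel: OSLogLevel, xrayLevel: XrayLevel) -> Bool {
  osLogLevel >= .debug && xrayLevel >= .callDetails
}
```

With this ordering, `.callDetailsAndStackTraces` also satisfies `xrayLevel >= .callDetails`, so each level includes everything the lower levels provide.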

Example of batch operation details:

Operations in Batch 1 (Millis: 145, Calls: 4, Message Count: 4, Message Bytes: 761)
 Operation                 |     Millis |  Msg Count |  Msg Bytes | Summary                        
---------------------------------------------------------------------------------------------------
 logAction(name:type:cont… |         13 |          1 |         72 | Act(Load User, 1469)           
 logView(trackerState:pro… |         27 |          1 |        115 | View(Home, A140)               
 startSessionAndLogUser(u… |          6 |          1 |          9 | User                           
 collectionViewDidChangeV… |          7 |          1 |        151 | Imp(AF4D)

Example of batch message details:

Messages in Batch 1
 Type       | Name                      | LogUserID                            | ViewID                               | ImpressionID                         | ActionID                             
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Action     | Load User                 | CF13E0AF-CABA-4F3B-9E02-F357B8574FF0 |                                      |                                      | 16B5CFAF-5B0D-493D-A264-3B67DA51E1FF 
 View       | Home                      | CF13E0AF-CABA-4F3B-9E02-F357B8574FF0 | 2B6BB054-B386-43E2-B3E8-24703E03D646 | -                                    | -                                    
 User       | -                         | -                                    | -                                    | -                                    | -                                    
 Impression | -                         | CF13E0AF-CABA-4F3B-9E02-F357B8574FF0 | 2B6BB054-B386-43E2-B3E8-24703E03D646 | F2E9EE0B-7D61-4E30-934D-4676516780D9 | -
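
For readers curious how fixed-width tables like the ones above can be produced, here is a rough sketch. It is not TabularLogFormatter's actual API: the `formatTable` function, its parameters, and the choice to left-align every column are assumptions. It pads short cells and truncates long ones with an ellipsis, similar to the truncated operation names in the first example.

```swift
/// Hypothetical helper, not the SDK's TabularLogFormatter.
/// Renders rows as a text table with fixed column widths.
func formatTable(headers: [String], widths: [Int], rows: [[String]]) -> String {
  // Pads or truncates a cell so it is exactly `width` characters wide.
  func fit(_ text: String, _ width: Int) -> String {
    if text.count > width {
      return String(text.prefix(max(width - 1, 0))) + "…"
    }
    return text + String(repeating: " ", count: width - text.count)
  }

  // Joins one row of cells with column separators.
  func line(_ cells: [String]) -> String {
    zip(cells, widths).map { fit($0, $1) }.joined(separator: " | ")
  }

  let header = line(headers)
  let separator = String(repeating: "-", count: header.count)
  return ([header, separator] + rows.map(line)).joined(separator: "\n")
}

// Usage with made-up column widths.
let table = formatTable(
  headers: ["Type", "Name"],
  widths: [6, 4],
  rows: [["Action", "Load User"], ["View", "Home"]]
)
print(table)
```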

Mobile diagnostics

The library now fills out the MobileDiagnostics proto on LogRequest when asked to do so. In service of this feature:

  1. MetricsLogger has a new area of functionality dedicated to diagnostics. This includes filling out the aforementioned protos and tracking ancestor ID history.
  2. Xray now records the number of batches attempted.
  3. The Promoted library version is now loaded from a file at Sources/PromotedCore/Resources/Version.txt so that it's available at runtime.

Example of logged diagnostics:

timing {
  client_log_timestamp: 1624674131321
}
device_identifier: "11FF76F7-3812-4F58-B4D2-DC31016E21ED"
client_version: "6.0.0 build 2135"
promoted_library_version: "0.4.5"
batches_attempted: 2
batches_sent_successfully: 1
batches_with_errors: 1
error_history {
  ios_errors {
    code: 500
    domain: "ai.promoted"
    description: "Error Domain=ai.promoted Code=500 \"(null)\" UserInfo={description=foo}"
    batch_number: 1
  }
  total_errors: 1
}
ancestor_id_history {
  ancestor_id_history {
    ancestor_id: "CF13E0AF-CABA-4F3B-9E02-F357B8574FF0"
    user_event {
      timing {
        client_log_timestamp: 1624674111965
      }
    }
    batch_number: 1
  }
  session_id_history {
    ancestor_id: "08A93F8B-70C8-4149-9E71-7CEB7787835F"
    batch_number: 1
  }
  view_id_history {
    ancestor_id: "310A8F0C-182C-4D72-B324-9BEC6628903C"
    view_event {
      timing {
        client_log_timestamp: 1624674111678
      }
      view_id: "310A8F0C-182C-4D72-B324-9BEC6628903C"
      name: "Home"
      device {
        device_type: MOBILE
        brand: "Apple"
        manufacturer: "Apple"
        identifier: "x86_64"
        os_version: "14.5"
        screen {
          size {
            width: 828
            height: 1792
          }
          scale: 2.0
        }
      }
      view_type: APP_SCREEN
      app_screen_view {
      }
      locale {
        language_code: "en"
        region_code: "US"
      }
    }
    batch_number: 1
  }
  view_id_history {
    ancestor_id: "DA861842-52AD-4E54-A53D-ADC658D7EED3"
    view_event {
      timing {
        client_log_timestamp: 1624674121342
      }
      view_id: "DA861842-52AD-4E54-A53D-ADC658D7EED3"
      session_id: "08A93F8B-70C8-4149-9E71-7CEB7787835F"
      name: "Store"
      device {
        device_type: MOBILE
        brand: "Apple"
        manufacturer: "Apple"
        identifier: "x86_64"
        os_version: "14.5"
        screen {
          size {
            width: 828
            height: 1792
          }
          scale: 2.0
        }
      }
      view_type: APP_SCREEN
      app_screen_view {
      }
      locale {
        language_code: "en"
        region_code: "US"
      }
    }
    batch_number: 2
  }
}

Testing

In addition to tests for this new functionality, I introduced ModuleTests, which will catch any circular dependencies in the library. As our dependency graph grows more complex, this serves as a sanity check.
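
How ModuleTests detects circular dependencies is not described here. As an illustration of the general idea only, a cycle in a dependency graph can be found with a depth-first search that flags any edge leading back to a module still being visited. The `hasCycle` function and the module names in the usage example are hypothetical.

```swift
/// Hypothetical helper: returns true if the dependency graph
/// (module name -> names of modules it depends on) contains a cycle.
func hasCycle(in dependencies: [String: [String]]) -> Bool {
  enum State { case visiting, done }
  var states: [String: State] = [:]

  // Depth-first search; reaching a module that is still being
  // visited means we followed a back edge, i.e. a cycle.
  func visit(_ module: String) -> Bool {
    switch states[module] {
    case .some(.visiting): return true
    case .some(.done): return false
    case .none: break
    }
    states[module] = .visiting
    for dependency in dependencies[module] ?? [] {
      if visit(dependency) { return true }
    }
    states[module] = .done
    return false
  }

  for module in dependencies.keys {
    if visit(module) { return true }
  }
  return false
}

// Usage with made-up module names.
let acyclic = ["App": ["Metrics"], "Metrics": ["Core"], "Core": []]
let cyclic = ["App": ["Metrics"], "Metrics": ["Core"], "Core": ["Metrics"]]
print(hasCycle(in: acyclic), hasCycle(in: cyclic))
```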