cirruslabs / tart

macOS and Linux VMs on Apple Silicon to use in CI and other automations
https://tart.run
Other
3.65k stars 102 forks source link

Progress during IPSW download #767

Closed torarnv closed 3 months ago

torarnv commented 3 months ago

Hi,

I noticed that tart create --from-ipsw <url> didn't report progress for the download, and naively implemented:

diff --cc Sources/tart/Fetcher.swift
index 461ad59,461ad59..ab8a8ee
--- a/Sources/tart/Fetcher.swift
+++ b/Sources/tart/Fetcher.swift
@@@ -29,18 -29,18 +29,26 @@@ class Fetcher 
    private static func fetchViaFile(_ request: URLRequest) async throws -> (AsyncThrowingChannel<Data, Error>, HTTPURLResponse) {
      let dataCh = AsyncThrowingChannel<Data, Error>()

--    let (fileURL, response) = try await urlSession.download(for: request)
++    let (asyncBytes, response) = try await urlSession.bytes(for: request)

--    // Acquire a handle to the downloaded file and then remove it.
--    //
--    // This keeps a working reference to that file, yet we don't
--    // have to deal with the cleanup any more.
--    let mappedFile = try Data(contentsOf: fileURL, options: [.alwaysMapped])
--    try FileManager.default.removeItem(at: fileURL)
++    let chunkSize = 64 * 1024 * 1024

      Task {
--      for chunk in (0 ..< mappedFile.count).chunks(ofCount: 64 * 1024 * 1024) {
--        await dataCh.send(mappedFile.subdata(in: chunk))
++      let expectedLength = response.expectedContentLength
++
++      var chunk = Data()
++      if expectedLength > 0 {
++          chunk.reserveCapacity(min(chunkSize, Int(expectedLength)))
++      } else {
++          chunk.reserveCapacity(chunkSize)
++      }
++
++      for try await byte in asyncBytes {
++        chunk.append(byte)
++        if chunk.count >= chunkSize {
++          await dataCh.send(chunk)
++          chunk.removeAll(keepingCapacity: true)
++        }
        }

        dataCh.finish()

But looking at git blame for this logic there seems to have been quite some variants of the download code (#260, #262, #282, #306), including one that used bytes.

@edigaryev @fiedukow Could you shed some light on the history of Fetcher.fetchViaFile and why it is needed? It seems to have been architected in https://github.com/cirruslabs/tart/pull/280#issuecomment-1300105314

For IPSWs, do we gain anything by downloading to a file first (and waiting for it to complete the download), and then delivering that file in chunks to retrieveIPSW that writes it to another file (and computes a digest)?

Should retrieveIPSW perhaps go via fetchViaMemory instead, and fetchViaMemory thought to use bytes instead of the blocking data call?