SwiftPackageIndex / SwiftPackageIndex-Server

The Swift Package Index is the place to find Swift packages!
https://swiftpackageindex.com
Apache License 2.0
547 stars 43 forks source link

Properly fix doc generation issue #3137

Closed finestructure closed 2 months ago

finestructure commented 3 months ago

Follow-up to https://github.com/SwiftPackageIndex/SwiftPackageIndex-Server/issues/3134

We fixed it by rolling back to DocUploader 1.6.3 but that likely re-introduced issues with swift-metrics (#3069) this change meant to address. Figure out how to use DocUploader 1.7.3 (or a modified version) in builder for zipping that doesn't create empty archives.

finestructure commented 3 months ago

FWIW, recently swift-metrics doc builds and uploads have succeeded:

CleanShot 2024-06-27 at 10 42 47@2x

but the doc upload record is still stuck in the uploading state:

2024-06-22 02:04:00.255569+00 | https://github.com/apple/swift-metrics.git | 578 | 4 | default_branch | main | uploading |   |   | https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/7161559981 | a43bfe21-d63d-4069-8a19-da7ac00aaa92 | macos-spm | {"major": 5, "minor": 10, "patch": 0} | 0d9af2fc-20ea-4a11-979d-6ef68bf9d888

CleanShot 2024-06-27 at 10 44 41@2x

One caveat: while 2.5.0 clearly must have uploaded, the same can't necessarily be concluded for main.

finestructure commented 3 months ago

Ok, main docs are actually a 404:

CleanShot 2024-06-27 at 10 50 23@2x

because there's no doc archive in S3:

CleanShot 2024-06-27 at 10 49 28@2x

finestructure commented 3 months ago

What's truly bizarre is that 2.5.0 is the exact same git commit (e0165b53d49b413dd987526b641e05e246782685) as the tag 2.5.0. One works fine, the other doesn't?

finestructure commented 3 months ago

My fix using os level unzip in case the library zip fails doesn't work. While the image we use to build the uploader has unzip, the actual runtime where it's being executed doesn't.

finestructure commented 3 months ago

Current status:

There isn't really a good choice here. Options are

  1. try and figure out why marmelroy/Zip can't roundtrip the metrics archive - I'll give this a timeboxed try to see if I can figure out what exactly it is in the archive that trips this up
  2. use marmelroy/Zip for zipping and weichsel/ZIPFoundation for unzipping - meh, but by far the easiest option
  3. use weichsel/ZIPFoundation and deal with the recursive directory zipping ourselves - doable but I don't love having to mess with lots of path building in a critical piece
  4. we don't have unzip in the Lambda env but we do have control over zip/unzip in the builder. I.e. we could either generally zip with the os level zip (assuming, to be tested, that an os level zip of metrics can be unzipped by marmelroy/Zip) or try roundtripping it before uploading and then rezip using /usr/bin/zip. The latter is probably more complicated that is worth it.

NB: marmelroy/Zip also has a path traversal CVE against it. It doesn't affect us, because we only unzip our own archives but it's still a warning we could do without. It's been there for weeks, so it's unlikely to be addressed. (It also doesn't look too complicated to provide a patch for but there's been no activity on the repo, so chances are it wouldn't be merged and we'd end up maintaining yet another a fork.)

Going to give 1. a brief try and then 4. seems like our best option.

daveverwer commented 3 months ago

There isn't really a good choice here. Options are

  1. Explore potential other archive options like tar archives (with zipping, of course) that might open up the packages we could use. For example, from a quick look https://swiftpackageindex.com/tsolomko/SWCompression looks like it might be a possibility for a mature yet still maintained package. It also supports plain zip files too.
daveverwer commented 3 months ago

For extraction, tar may be on the lambda image, too.

finestructure commented 3 months ago

tar has the same problem as unzip, unfortunately - it's not available.

I'd also rather not change the format, because it's being used in two different components, making it trickier to deploy. However, SWCompression seems to support zip, so that's definitely another candidate to try with the problematic metrics zip file.

finestructure commented 3 months ago

Unfortunately, SWCompression doesn't seem to have a file/URL API, which is the same problem that SwiftZip/SwiftZip has. The nice thing about marmelroy/Zip is that you can give it a list of file and folder URLs and it handles discovery and archiving of all content, pretty much exactly like the equivalent zip/unzip commands would.

finestructure commented 3 months ago

Ok, the problem is apparently the 0 bytes theme-settings.json in that archive. Checking a few other archives that seems to be unusual. We can certainly make sure we don't leave those in the archive in the builder (although it's odd that the archiver should trip over it but 🤷‍♂️).

finestructure commented 3 months ago

Odd, simply moving that file out of and back into the folder makes it round-trippable. This should be easy enough to detect in the builder and report back.

finestructure commented 3 months ago

I can't reproduce the unzip failure with a newly created doc archive for the same revision in a builder test. Either the env is slightly different or something else has changed since I pulled the "broken" archive from the S3 inbox. FWIW, it didn't seem an isolated "broken" archive - the main revision docs failed to upload in multiple attempts. So whatever caused the "broken" archive then did it consistently.

I'll redeploy an updated builder with all the latest changes when Xcode 16b2 processing is done to see if we still produce these "broken" archives.

finestructure commented 3 months ago

Ok, the problem is apparently the 0 bytes theme-settings.json in that archive. Checking a few other archives that seems to be unusual.

We're actually explicitly touching theme-settings.json in order to avoid 404s in the console when viewing DocC docs (https://github.com/SwiftPackageIndex/SwiftPackageIndex-Server/issues/2707). So if this was truly the problem it'd be more widespread. It must have been something else about the archive and it's probably just the fact that I nudged the metadata when moving files that "fixed" the archive.

finestructure commented 3 months ago

This is fascinating. I can finally reproduce the round-trip unzipFail exception with swift-metrics. The reason I couldn't initially is that I pinned it to main's reference e0165b53d49b413dd987526b641e05e246782685 so the test wouldn't drift with changes to swift-metrics' main. However that actually makes the round-trip work. It only raises the exception if I actually check out and doc gen with the reference main instead of the sha.

I'll see if the error persists in a fork of swift-metrics so I can still pin it in the test.

finestructure commented 3 months ago

So I've reproduced the zip-roundtrip issue, fixed it via doc uploader 1.9.0 and builder 4.45.1, confirmed it's rezipping and when re-processing swift-metrics with both these fixed versions we get

2024-07-05T15:23:33+0000 warning Lambda : lifecycleIteration=0 [AWSLambdaRuntimeCore] lambda handler returned an error: unzipFail

https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#logsV2:log-groups/log-group/$252Faws$252Flambda$252FDocUploaderLambda-Prod-UploadFunction-GGu55JqjJ0Lz/log-events/2024$252F07$252F05$252F$255B$2524LATEST$255Daf20f64812644dc6ba942134b612469d

It's maddening.

finestructure commented 3 months ago

Ok, it's clear what's going wrong here. It's a path issue when re-zipping the input with zip.

finestructure commented 3 months ago

Roundtrip unzip check is now failing with DocUploader 1.9.1 and builder 4.45.2: Error while generating docs: fileNotFound

https://gitlab.com/finestructure/swiftpackageindex-builder/-/jobs/7278725552

Something must be different between the unit test and the prod environments.

finestructure commented 2 months ago

This has been properly fixed now via DocUploader 1.9.1. main docs are being uploaded now:

https://swiftpackageindex.com/apple/swift-metrics/main/documentation/coremetrics

CleanShot 2024-07-17 at 10 40 18@2x