Automattic / a8c-ci-toolkit-buildkite-plugin

A caching plugin that can be invoked from your build script.
22 stars 5 forks source link

Add `pnginfo` utility to check PNG files #68

Closed AliSoftware closed 1 year ago

AliSoftware commented 1 year ago

Why?

This utility will be mostly useful to check if a PNG file is of expected size and if it has an alpha channel or not.

The most typical use case for this is to have a CI check validate that iOS App Icons are 1024x1024 and don't have an alpha channel (otherwise they'd be rejected from review), or to validate the size of App Store screenshots.

In the past we've used Mac agents to do this kind of App Icon check, leveraging sips -g hasAlpa to check PNG files. But sips is only available on macOS, hence the Mac agent requirement to run that.

Given the only thing we need is to check the PNG file format, we shouldn't need to have to use a Mac to run such checks. This is why I've implemented this simple ruby tool, that can be used in any Linux agent where ruby is installed, allowing those CI steps for those PNG checks not only to likely run faster (as Linux VMs are faster to bootstrap) but most importantly to avoid those CI steps to request a Mac agent (which we don't have in huge numbers in our CI fleet) and thus reduce the stress on our Mac CI queue.

How?

After a quick check with the official PNG specification, the format ends up not being super complicated to parse, especially since all we're interested in is the content of the IHDR (Image Header) chunk—which is fairly simple—and to check for the presence of a tRNS chunk (which indicates that there is some tRaNSparency data)

PNG data parsing

The implementation for parsing the PNG file goes like this:

The rest of the script is just fancy OptionParser and command-line handling stuff.

Command options

The pnginfo script is designed to be used:

Testing

AliSoftware commented 1 year ago

given the fact that Fastlane already relies on ImageMagick Does it?

I though only our release-toolkit depended on it (for our promo_screenshot_helper), and loosely at that (ie we don't make it a hard dependency in our release-toolkit, but instead dynamically test at runtime if it's installed, and only if one calls the promo_screenshot_helper, so that it's not a requirement to ImageMagick be installed to run any other action of the toolkit), which is also one reason why we have this group :screenshots in our Gemfile to opt-in to rmagick only when explicitly needed…

Besides, installing imagemagick on any machine can notoriously end up being a landmine, and even if it's not to the same level that nokogiri is a landmine to install, I feel it'd be nice to avoid requiring such a heavy dependency for such a little need.


Finally, and probably more importantly, I haven't found any agent in our Tumblr DCA which had ImageMagick installed — while initially the first goal of this small utility is to be able to run the "Validate App Icons" CI check of Tumblr-Android on a Linux machine rather than a Mac.

Sure, we could consider re-provision those Tumblr DCA agents to install imagemagick on them… or install it as part of the .sh script/command/step… but is it really worth it to install the whole imagemagick suite only for that, where the PNG file format is stable and has no reason to change in the future and the IHDR field is so easy to parse ourselves…? 🤔

BTW: Using this pnginfo script in https://github.tumblr.net/TumblrMobile/orangina/pull/25483 allowed me to make the CI step run in 13 seconds, compared to 5m21 before. If we have to install imagemagick as part of the command, even if that shouldn't take that long, that will definitively not beat that record 13s time of using this simpler ruby code to just parse the IHDR 😛

Before
After
crazytonyli commented 1 year ago

An alternative could be using imagemagick on docker. It's likely to be slower than 13 seconds, considering there are pre-built packages to be downloaded. But I imagine it should finish pretty quickly.

docker run -v ./image.png:/image.png --rm --entrypoint /bin/sh public.ecr.aws/docker/library/alpine:latest -c 'apk add imagemagick && identify /image.png'
AliSoftware commented 1 year ago

An alternative could be using imagemagick on docker. It's likely to be slower than 13 seconds, considering there are pre-built packages to be downloaded. But I imagine it should finish pretty quickly.

docker run -v ./image.png:/image.png --rm --entrypoint /bin/sh public.ecr.aws/docker/library/alpine:latest -c 'apk add imagemagick && identify /image.png'

But I think that won't work on tumblr-android agents (that we currently use for Tumblr-iOS linter tasks that don't need a Mac, despite the -android name in those agents, just to have a Linux machine that runs inside Tumblr DCA for direct access to GH:E and faster clones), because tumblr-android agent living in Tumblr DCA don't have access to ECR, right? So we'd have to use an image from our k8s instance instead… and not sure what we have available that would fit the bill?

I guess if there's no ruby involved, we could use tumblr-metal (which is still provisioning with only ruby-2.0.0 which is quite too old for us to do any modern ruby or use bundler and fastlane in those agents). I encountered issues (like "docker daemon not running" and stuff) when I tried earlier, but maybe I got things mixed up as I was experimenting with different approaches and different agents, so I'll give this a new try soon.

We could also consider running that on the default queue in an EC2 machine… but that will mean that we'd need to rely on our GitHub mirror (and the git s3 cache) to clone the repo in those machines that are outside of the Tumblr DCA… and given the huge size of the Tumblr repo, even with s3 cache that takes several minutes (while on tumblr-metal the repo is already cloned locally and shared by all builds running on those agents, so the git clone/git checkout is almost immediate, hence the 13s only in my test above…)


Anyway, I'll try if I manage to run imagemagick via plugin: docker in tumblr-metal, and experiment a bit more to see if I can overcome the issues I had when I tried this earlier, and will report back.

crazytonyli commented 1 year ago

FWIW, we run the terraform jobs on tumblr-metal using Public ECR images and have no issue so far. I'm happy to help you have ran into any issue regarding docker env.

AliSoftware commented 1 year ago

Ok, I finally tested using a dockerized imagemagick on tumblr-metal and after some trial and error and issues with apk add not working behind the HTTPS proxy… I finally figured it out 😅

Which means that this pnginfo utility is probably not needed after all as the same could be achieved with identify like this via this agent and docker plugin setup

As a result, I'm closing this PR, and putting https://github.tumblr.net/TumblrMobile/orangina/pull/25483/ back to "Ready for Review". Thanks for the nudge and help in brainstorming this!

AliSoftware commented 1 year ago

💡 @jkmassel Ok even if I ended up using a dockerized imagemagick for https://github.tumblr.net/TumblrMobile/orangina/pull/25483, one thing that this pnginfo utility could have been useful for though, given that it only relies on Ruby… is that we could have used it as a Danger rule in dangermattic (which @iangmaia is working on rn) to make it easier to use such iOS App Icon validation on other client apps' repos… 🤔 🙃