matplotlib / pytest-mpl

A pytest plugin to facilitate image comparison for Matplotlib figures
Other
244 stars 47 forks source link

Perceptual Image Hash #146

Open bjlittle opened 2 years ago

bjlittle commented 2 years ago

Thanks and congrats on pytest-mpl. It's really lovely :+1:

Over on SciTools/iris we've been using a very similar framework to perform graphical testing, but rather than using RMS or SHA-256 hashes we've opted to use perceptual image hashes.

Historically, we found using RMS/SHA-256 quite fragile and too sensitive to change, and that sometimes involved quite an overhead on devs to manually understand whether graphical test failures were significant or not; typically, it really depended on the nature of the image change involved, and that generally took time and effort to figure out.

However, for the last few years now we've adopted perceptual image hashing as a more robust and stable approach to graphical image testing, and we'd love to be able to use pytest-mpl as a general framework (rather than reinvent the wheel) as we're keen to use perceptual image hashing elsewhere in other packages other than iris.

If you want to know more about perceptual image hashing:

In the end, we opted to leverage the https://github.com/JohannesBuchner/imagehash package, which is available on both conda-forge and pypi, to perform the actual perceptual image hashing.

At this point, I guess my question to the pytest-mpl core developers are:

  1. Have you considered using perceptual image hashing as an approach and decided not to use it? If so, why? That would be really great to know.
  2. If you're not against the concept, would you be open to us making an additive change to pytest-mpl where the user can choose to opt-in to an alternative hashing algorithm, that they configure, to perform their graphical testing?

I think that option [2] would be possible and a valuable feature for the pytest-mpl community.

However, you guys have the vision of where you want pytest-mpl go and the use cases that you want to address. It would be really great to know whether you're open to offering configurable hashing kernels as a feature. If you are, then we'd be more than willing to do the work and contribute to pytest-mpl under your guidance and advice.

Thanks :smiley:

trexfeathers commented 2 years ago

What @bjlittle said! 😍

astrofrog commented 2 years ago

I think it would be great to have perceptual hashes as an option! (So option 2 would be great!)

Cadair commented 2 years ago

I am certainly happy to have perceptual hashes as an option.

I have previously given perceptual hashes as quick look over, but assumed that they wouldn't be able to detect things like changes in the location of axes labels or other subtle changes to a plot?

Is this impression correct? How sensitive can they be to changes in the image? What kind of plots are you successfully using them against?

dopplershift commented 2 years ago

Just wanted to add a 👍 from me as well. Every time matplotlib updates, I'm grumbling about the minor adjustments I have to make to about 25% of our image tests--but never enough cycles to do a deep dive into this.

bjlittle commented 2 years ago

We've been using perceptual image hashing as our core algorithm in SciTools/iris for a good couple of years now, so it's got sufficient mileage for me to say with some confidence that it suits our needs and is pretty robust, stable and configurable to tolerances of change in a variety of scientific plots.

We store the images in a separate repo, akin to a pattern that you guys offer, and that works for us. We only download image assets when we actually want to manually inspect the difference with matplotlib.testing.compare to see pixel diffs and approve or reject and investigate the failure.

In fact, the killer use case for us was axes labels, ticks and fonts that shift a pixel here or there, which causes RMS/SHA to trip (even with tolerances... in fact we got to the point where tolerances were hiding genuine failures and caused us pain) but now we have pretty reasonable stability and the ability to fine tune those types of changes out as noise.

As an example of the images we're testing against, it's pretty much the same use case as cartopy (and I'm sure @dopplershift has lost huge amount of his life on graphical test failures):

Also, here's a live PR with image changes, see https://github.com/SciTools/tephi/pull/78

bjlittle commented 2 years ago

I also made a lightning talk a wee while ago for perceptual image hashing to explain it at a high level... I'll try to dig it out and share it with you guys, if that helps too?

dopplershift commented 2 years ago

and I'm sure @dopplershift has lost huge amount of his life on graphical test failures

Too much. It's actually the top item on my "developer pain" todo list, but it's felt like too large of a lift. Seeing it in pytest-mpl would be huge to me.

bjlittle commented 2 years ago

@dopplershift Message received 👍

Righty, I'll engineer some time to spin up on this.

If I have any questions, where's the best place to ask? Here? Or do you guys have a forum? You don't seem to have GH Discussions enabled for pytest-mpl... let me know what's best

Cadair commented 2 years ago

Here or other issues is fine. You can also ping me @Cadair in the matplotlib gitter channel if that's more your liking.

bjlittle commented 2 years ago

@Cadair Awesome, thanks :+1:

I'm working on the change at the moment and it's looking pretty encouraging.

In terms of a strategy moving forwards, I think that it might be best for me to make a draft PR soonish (with a bunch of TODOs to nail) so that you guys can get an early bird feel for what I'm proposing and the general direction that I'm heading in.

When I've got something stable I'll push it for feedback, and we can roll from there.

In general, I'm attempting to be additive and change as little as possible, but the notion of supporting easily configurable hashing kernels seems to be quite achievable... but you guys can be the judge of that :wink:

I do have some open questions atm, but I'll tack them onto the draft PR for discussion or even hop onto the gitter channel :+1:

Cadair commented 2 years ago

I think that it might be best for me to make a draft PR soonish

Yep sounds great :rocket: