Closed Morl99 closed 2 years ago
Well, afaik --threshold 0.03
has been always treated as 0.03% so is not a mistake.
I mean, even before the new release I never converted 0.03 to 3% — I'll add such detail on the README, thank you!
Update: anyway, I'll investigate if this comparison it was computed in a different way on past releases, I'll keep you posted.
Hi, I'm also just starting to use this library (thanks btw 👍) and also stumbled over this. While you do not internally need to convert 0.03 to 3% perhaps it would make sense to at least convert it in the output and in the report?!
Hi @matzetronic - yeah, it can be, but the results can include decimals too, so conversion is not really needed imo.
I have to disagree, the readme in the main project says the following:
And in any way, 0.03 just means 3% , as you could write 0.03 as 3/100 which just means 3 from hundred... And it just worked that way until the 0.4.11 release (which is just from my observation).
Yeah, probably I commited a mistake on the docs/code earlier, but now the code is based on pixelmatch and I am using the exact code they're using for getting such percentage:
https://github.com/mapbox/pixelmatch/blob/master/bin/pixelmatch#L37
console.log(`error: ${Math.round(100 * 100 * diffs / (width * height)) / 100}%`);
Probably, and almost surely, I did it wrong first time... and the docs did not helped, or, due the change we should't longer assume that 0.03 is 3% and we should update the docs accordingly? Please let me figure out this by doing a regression check, I don't want assume new release is wrong without confirmation.
So the threshold parameter now has the unit "percent", that seems fine, but this feels like a breaking change that needs to be communicated, so that people don't get surprised by this.
After a conscious investigation I concluded that the previous version was flaky and yielded bad results (blink-diff was treating 0.03 to 3% ?), all my apologies if you're using this module in your daily job. 🙏🏽
To confirm, I generated base images, both for v0.4.11
and v0.4.13
and while the images are the same, the resullts were very different:
0.4.11 | 0.4.13 |
---|---|
The command used was the same in both runs:
./bin/cli.js e2e/screens --compare base:actual --open --threshold 0.03
In exchange, if we use --threshold 0.0045
on v0.4.11 it would fail similarly as v0.4.13, but definitely it's wrong:
The latest version is more accurate and does not convert the threshold in any manner, it also makes blockOut more reliable than before.
I'll deprecate the v0.4.12-14
and then I'll publish the package as v0.5.0
since this is a major change.
If anyone wants to continue using the old behavior and don't expect blockOut to workout on your tests then keep using the v0.4.11
version, otherwise please consider upgrading to the next version. Thank you all!
Awesome, that sounds good.
Hello,
thank you for maintaining this handy little tool. We have been using it for quite some time now without any problems.
Trying to update to 0.4.12 though leads to the following error in our pipeline:
As you can see, even though the differences are minimal, somehow they lead to a failure, since 0.03 should translate to 3% (which worked until 0.4.11.
Or did the semantic meaning of the threshold parameter somehow change? Then this should be reflected in the docs.