SmilingWolf / VQMTCrossVal

Cross validation for objective quality metric measurement tools on multiple public datasets

instructions on how to reproduce the results are missing #2

Open jyrkialakuijala opened 4 years ago

jyrkialakuijala commented 4 years ago

please consider adding instructions on how to reproduce the results

jyrkialakuijala commented 4 years ago

also possibly separate instructions without dssim since not every organization has a license to use it

SmilingWolf commented 4 years ago

Software used:

VQMTs:

What I did was:

All the scripts used, along with the starting and final CSVs, are available in the Scripts subdir of this repo. They should work as-is on Windows or, more generally, on any case-insensitive filesystem. More care is needed on Linux, because the image names in the datasheets do not always match the case of the actual filenames.
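For the Linux case, one possible workaround is to look up each datasheet name case-insensitively before handing it to a tool. This is only a minimal sketch: the helper name `resolve_case_insensitive` is hypothetical and is not part of the scripts in this repo.

```python
import os


def resolve_case_insensitive(directory, name):
    """Return the real path of the entry in `directory` whose name equals
    `name` when case is ignored, or None if there is no such entry.

    Hypothetical helper, not taken from the repo's scripts.
    """
    wanted = name.lower()
    for entry in os.listdir(directory):
        if entry.lower() == wanted:
            return os.path.join(directory, entry)
    return None
```

If two files differ only by case, this returns whichever one `os.listdir` yields first, so it is only suitable for datasets where that cannot happen.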

Scripts follow a [dataset].["getScores"/"ROCC"].py naming convention. The getScores scripts read the starting CSV ([dataset].MOS.csv), run the tools, parse their stdout and add the results to the final CSV ([dataset].VQMT.csv). The ROCC scripts read [dataset].VQMT.csv and use pandas+scipy to calculate the correlation coefficients.
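To give an idea of what the ROCC step computes, here is a rough standard-library sketch of a rank correlation (Spearman's, which is an assumption about which coefficient is meant) between a MOS column and a metric column. The actual scripts use pandas+scipy; the column names passed in and the function names below are hypothetical.

```python
import csv
import io
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rocc_from_csv(text, mos_col, metric_col):
    """Read CSV text and correlate two named columns (names are assumptions)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    mos = [float(r[mos_col]) for r in rows]
    metric = [float(r[metric_col]) for r in rows]
    return spearman(mos, metric)
```

A distance-like metric that grows as quality drops should give a coefficient close to -1 against MOS, so the sign depends on the metric's direction.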

As a side note, the scripts within the Scripts subdir display some more results: where possible I added some filtering by distortion to measure correlation only for JPEG+JPEG2K and JPEG+JPEG2K+Gaussian blur distortions. Didn't include them in the "frontpage" for the sake of brevity, but thought you might have a use for this.

If anything is unclear, I am at your disposal to amend the above so that the repro instructions can reach a presentable state and be included in the README.md file.

jyrkialakuijala commented 4 years ago

Would you be able to easily rerun the results with https://gitlab.com/wg1/jpeg-xl/blob/master/tools/butteraugli_main.cc instead of https://github.com/google/butteraugli

using the p-norm score, not the max butteraugli score? (The jpeg-xl butteraugli prints out both scores.)

SmilingWolf commented 4 years ago

I would and already began doing so for max, 2, 3, 6 and 12-norm scores on the JPEG XR and KADID-10k datasets. I can do them all and post the results here.

jyrkialakuijala commented 4 years ago

That is just wonderful. I cannot thank you enough.

If it doesn't work out, I need to do some serious homework.

jyrkialakuijala commented 4 years ago

Please note that JPEG XL butteraugli prints out a mixture of p-norms, i.e., when you ask for a 3rd norm, it prints a mix of 3rd, 6th and 12th norm. Just good to know in case you want to compare against p-norming the other metrics, too.

SmilingWolf commented 4 years ago

Could you please elaborate on that? This is my very trivial patch for butteraugli_main.cc:

diff --git a/tools/butteraugli_main.cc b/tools/butteraugli_main.cc
index 99bac01..cf404f7 100644
--- a/tools/butteraugli_main.cc
+++ b/tools/butteraugli_main.cc
@@ -83,10 +83,22 @@ Status RunButteraugli(const char* pathname1, const char* pathname2,
                                              kHfAsymmetry, &distmap, &pool);
   printf("%.10f\n", distance);

-  double p = 3.0;
+  double p = 2.0;
   double pnorm = ComputeDistanceP(distmap, p);
   printf("%g-norm: %f\n", p, pnorm);

+  p = 3.0;
+  pnorm = ComputeDistanceP(distmap, p);
+  printf("%g-norm: %f\n", p, pnorm);
+
+  p = 6.0;
+  pnorm = ComputeDistanceP(distmap, p);
+  printf("%g-norm: %f\n", p, pnorm);
+
+  p = 12.0;
+  pnorm = ComputeDistanceP(distmap, p);
+  printf("%g-norm: %f\n", p, pnorm);
+
   if (distmap_filename != "") {
     float good = butteraugli::ButteraugliFuzzyInverse(1.5);
     float bad = butteraugli::ButteraugliFuzzyInverse(0.5);

Can you confirm or deny that this prints out the 2, 3, 6 and 12-norms before I go full steam ahead with the remaining datasets?
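For reference, the stdout of the patched tool could be parsed along these lines on the getScores side. This is a sketch under assumptions: the line format is taken from the printf calls in the patch above, the bare number on its own line is assumed to be the max score, and the function name is hypothetical.

```python
import re

# Matches lines such as "3-norm: 0.600000", as printed by "%g-norm: %f\n".
NORM_LINE = re.compile(r"^(\S+)-norm:\s*(\S+)$")


def parse_butteraugli_stdout(stdout):
    """Collect scores from the patched tool's stdout into a dict.

    Keys are "max" for the first bare number (assumed to be the max score)
    and "<p>-norm" for each norm line. Other lines are ignored.
    """
    scores = {}
    for line in stdout.splitlines():
        line = line.strip()
        match = NORM_LINE.match(line)
        if match:
            scores[match.group(1) + "-norm"] = float(match.group(2))
            continue
        try:
            value = float(line)
        except ValueError:
            continue  # Not a score line.
        scores.setdefault("max", value)
    return scores
```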

jyrkialakuijala commented 4 years ago

What you have are not pure p-norms, but always a mixture of three p-norms. Internally, the ComputeDistanceP function doesn't compute a single p-norm, but the average of three: the p-norm, the (2p)-norm, and the (4p)-norm.

Might be that the mixtures work actually better, so it is ok to start with what you have.
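To make the difference concrete, the arithmetic can be sketched as follows. This only mirrors the description above, with each norm normalized by the number of values. It is not the actual implementation, which works on the distance map, and both function names are hypothetical.

```python
def pure_p_norm(values, p):
    """Normalized p-norm: (mean of |v|**p) ** (1/p)."""
    mean_pow = sum(abs(v) ** p for v in values) / len(values)
    return mean_pow ** (1.0 / p)


def mixed_p_norm(values, p):
    """Average of the p-, (2p)- and (4p)-norms, as described in this thread."""
    return sum(pure_p_norm(values, q) for q in (p, 2 * p, 4 * p)) / 3.0
```

Because higher norms weight the largest values more heavily, the mixture is never smaller than the pure p-norm on the same data, and the two agree only when all values have the same magnitude.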

... For simple p-norms you'd need to remove the following lines from butteraugli_pnorm.cc:

v += pow(onePerPixels * (sum1[1] + GetLane(hwy::ext::SumOfLanes(sums1))),
         1.0 / (p * 2.0));
v += pow(onePerPixels * (sum1[2] + GetLane(hwy::ext::SumOfLanes(sums2))),
         1.0 / (p * 4.0));
v /= 3.0;

and change

for (int i = 0; i < 3; ++i) {

to

for (int i = 0; i < 1; ++i) {

and remove one more line: v /= 3.0;

SmilingWolf commented 4 years ago

That makes much more sense given the results I'm getting from a few initial runs. Thanks!

Maybe I could make a new function, ComputeSimpleDistanceP(...) with the above changes to get pure norms, and include the results along with max and the standard (mixed) 3p-norm. Funky mixes can wait until I have at least the basics covered.

jyrkialakuijala commented 4 years ago

Perfect!