Closed by anayden 1 year ago
@anayden Very cool stuff!
To test the new tests I checked out the match_svg_via_images branch, modified the Dockerfile to use FROM rnacentral/r2dt-base:match_svg_via_images, then built the image for linux/amd64 and ran tests with r2dt.py test on the 1.4 library:
docker run --platform=linux/amd64 -it -v ./:/rna/r2dt/ -v path/to/1.4/:/rna/r2dt/data/cms rnacentral/r2dt:match_svg_via_images
root@bab12df6e66b:/rna# r2dt.py test
...
Ran 22 tests in 1105.495s
FAILED (failures=10)
Hopefully this was the right thing to do.
Lots of tests failed, which is expected, but to fully understand the new approach I wanted to use the file TestSingleEntry_0.html as an example and confirm whether it is supposed to be there or not. Or does its presence have something to do with the requirement for the file sizes to be the same?
I would've thought that these 2 files should be considered similar enough not to fail the test:
I am curious to see an example that used to fail but does not fail anymore. To check this I am going to re-run on develop and compare the number of html files to see which images are not failing the tests on this PR branch. Or I could also change tests.py and run both filecmp.cmp(new_file, reference_file) and self._are_identical(reference_file, new_file) and report the differences, but I figured that you probably did something similar?
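A rough sketch of that second idea, running both checks side by side and reporting only the pairs where they disagree. This is not code from tests.py: compare_both_ways is a hypothetical helper, image_check is a stand-in for whatever self._are_identical does, and passing shallow=False is my own choice to force a content comparison.

```python
import filecmp
from pathlib import Path
from typing import Callable, Iterable


def compare_both_ways(
    pairs: Iterable[tuple[Path, Path]],
    image_check: Callable[[Path, Path], bool],
) -> list[tuple[str, bool, bool]]:
    """Run the byte-level and image-based checks on each (reference, new) pair.

    Returns (file name, byte_identical, image_identical) only for the
    pairs where the two checks disagree.
    """
    disagreements = []
    for reference_file, new_file in pairs:
        # shallow=False compares file contents rather than os.stat() signatures
        # (an assumption; the call quoted above uses the default).
        byte_identical = filecmp.cmp(new_file, reference_file, shallow=False)
        # Hypothetical stand-in for self._are_identical(reference_file, new_file).
        image_identical = image_check(reference_file, new_file)
        if byte_identical != image_identical:
            disagreements.append((new_file.name, byte_identical, image_identical))
    return disagreements
```

Entries of the form (name, False, True) would be the interesting ones: files that differ byte-for-byte but that the image-based check accepts.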
Once again thanks for your titanic efforts!
@AntonPetrov thanks for the comments and review!
You've done the testing correctly process-wise.
I would've thought that these 2 files should be considered similar enough not to fail the test:
Somehow I didn't see this exact pair of images in my local testing. Could you share the filename? Either way, if 2 images have different sizes in pixels (for any reason), we can't compare them in a meaningful automated way. The only option would be to leverage computer vision and ML, but that could easily make the tests slower than the app itself, and it's definitely way beyond my knowledge of image comparison algorithms. Basically, if the sizes are different, we assume the images are different.
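To illustrate the rule, here is a minimal sketch of a size gate followed by a thresholded comparison. It is not the code in this PR: images are modelled as plain nested lists of grayscale values so the example runs without any imaging library, are_similar and image_size are hypothetical names, the "fraction of matching pixels" metric is an assumption, and the 0.99 default is only a placeholder.

```python
from typing import Sequence

# Row-major grid of grayscale pixel values (0-255); an assumed stand-in
# for a rasterised SVG.
Image = Sequence[Sequence[int]]


def image_size(image: Image) -> tuple[int, int]:
    """Return (width, height) of a row-major pixel grid."""
    height = len(image)
    width = len(image[0]) if height else 0
    return width, height


def are_similar(reference: Image, new: Image, threshold: float = 0.99) -> bool:
    """Treat images of different sizes as different; otherwise compare pixels."""
    if image_size(reference) != image_size(new):
        # Different dimensions: no meaningful pixel-by-pixel comparison.
        return False
    width, height = image_size(reference)
    total = width * height
    if total == 0:
        return True  # two empty images of the same size
    matching = sum(
        1
        for ref_row, new_row in zip(reference, new)
        for ref_px, new_px in zip(ref_row, new_row)
        if ref_px == new_px
    )
    # Similar when the fraction of identical pixels reaches the threshold.
    return matching / total >= threshold
```

With this shape, loosening or tightening the test only means changing threshold; the size check stays a hard failure.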
In order to simplify visual testing and comparison, I've made a few changes to the process.
bbox is not used and is there just for reference.
Using this information, feel free to adjust the similarity threshold if needed.
Also, looking forward to more feedback w/r/t test correctness.
Some images are indeed different, so not 100% of the tests pass, but this is expected.