Open alexjbest opened 4 years ago
I think the easiest way to resolve this would be to look at these particular examples and then modify our script until it produces something more reasonable. Then we can just regenerate all the data.
Of course, this might not solve all problems, but I don't see an easy way to check all images, unless someone is very good with machine learning or something like that...
I don't think there are too many distinct pictures, so I was thinking it wouldn't be too hard to scan through them all by hand, if only one could see them all in one place. Because they are all stored in the database as embedded PNGs (if I remember correctly), it's nontrivial to just look at them. Making a fake webpage with them all rendered would work though, or dumping them all to a directory of PNG files somehow.
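To make the "dump them all to a directory" idea concrete, here is a minimal sketch. It assumes each picture is stored as a `data:image/png;base64,...` string, which I have not checked against the actual database, and the function name `dump_unique_pngs` and the `(label, data_uri)` input shape are made up for illustration. Identical pictures are written only once, so the number of files to look at by hand is the number of distinct pictures.

```python
import base64
import hashlib
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Assumption: pictures are stored as base64 data URIs with this prefix.
DATA_URI_PREFIX = "data:image/png;base64,"


def dump_unique_pngs(records, out_dir):
    """Write each distinct PNG once; return {file name: [labels using it]}.

    `records` is an iterable of (label, data_uri) pairs (hypothetical shape).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for label, data_uri in records:
        if not data_uri.startswith(DATA_URI_PREFIX):
            raise ValueError(f"{label}: not a base64 PNG data URI")
        raw = base64.b64decode(data_uri[len(DATA_URI_PREFIX):])
        if not raw.startswith(PNG_SIGNATURE):
            raise ValueError(f"{label}: decoded bytes are not a PNG")
        # Name the file by content hash so identical pictures collide.
        name = hashlib.sha256(raw).hexdigest()[:16] + ".png"
        if name not in written:
            (out_dir / name).write_bytes(raw)
            written[name] = []
        written[name].append(label)
    return written
```

The returned mapping tells you which curves use each file, so a broken picture found by eye can be traced back to its labels.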
FYI, at least 2 million more genus 2 curves will be added to the LMFDB this year, so I would look for an automated solution.
I don't think this process is very easy to automate. There will always be something special happening in some specific cases that will mess up the picture. In the end, either someone or some very clever machine has to look at all the pictures to see if they are okay. Of course there will not be 2 million pictures, since there are many, many collisions. At some point, we could also say that we are fine with a very small proportion of pictures being a little bit broken.
Writing some code to check that the neighbors of black pixels are only white or black should be easy, and that would detect the current issue. However, writing code to create a better thumbnail might be harder. Perhaps removing the numbers from the thumbnails would make life easier? Or just creating a larger image, where it's easier to ensure that there is no collision, and then rescaling it? Or perhaps using SVG?
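For what it's worth, the neighbor check could look roughly like the sketch below. It assumes the image has already been loaded as a list of rows of `(r, g, b)` tuples, that the cluster border is pure black, and that the dots are drawn in some other color; none of this is verified against the real thumbnails, and `find_suspect_pixels` is a hypothetical name.

```python
def find_suspect_pixels(pixels, tol=0):
    """Return (row, col) of black pixels touching a non-black, non-white pixel.

    `pixels` is a list of rows of (r, g, b) tuples with values 0..255.
    `tol` loosens what counts as black or white (assumed parameter).
    """
    def is_black(p):
        return max(p) <= tol

    def is_white(p):
        return min(p) >= 255 - tol

    def neighbors(r, c):
        # The 8 surrounding pixels, skipping positions outside the image.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < len(pixels) and 0 <= cc < len(pixels[rr]):
                    yield pixels[rr][cc]

    suspects = []
    for r, row in enumerate(pixels):
        for c, p in enumerate(row):
            if is_black(p) and any(
                not (is_black(q) or is_white(q)) for q in neighbors(r, c)
            ):
                suspects.append((r, c))
    return suspects
```

An empty result means no black pixel touches a colored one. If the PNGs are anti-aliased, the gray pixels along every black edge would be flagged too, so the image would need to be thresholded first or `tol` raised; I don't know which applies here.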
Also, what is the root reason for the thumbnail? Desired different aspect ratios?
Of course we could remove the numbers. The pixel-checking code you propose sounds like more work than I think the problem is worth. The full versions of the cluster pictures that Alex is referring to both look fine. Regenerating the thumbnails is relatively easy; I can just change some parameters in the LaTeX code to adjust the spacing. The reason for the thumbnails is that a scaled-down version of the full cluster picture does not look good: the text becomes illegible and the vertical spacing is off.
In https://beta.lmfdb.org/Genus2Curve/Q/630/a/34020/1 the first cluster picture under local invariants has a dot overlapping the border of the cluster.
Likewise for https://beta.lmfdb.org/Genus2Curve/Q/2520/c/680400/1, so we should probably find a way to check all thumbnails.