erich666 / GraphicsGems

Code for the "Graphics Gems" book series
Other
1.39k stars 265 forks source link

AAPolyScan.c incorrectly draws top scanline of some polygons #32

Closed abainbridge closed 1 year ago

abainbridge commented 4 years ago

Imagine that you are trying to render a square that is, say 10x10 pixels, like this:

       8,8        168,8
         ___________
         |         |
         |         |
         |         |
         |         |
         |_________|
       8,168      168,168

These are subpixel coordinates, 16 times finer resolution than pixels. The the top scanline we draw will be half covered by the square, because the top of the square has a y value of 8, half way through a full pixel.

Line 99 initializes xLmax=-1 and xRmin=MAX_X.

During the main loop for the first scanline, xLmax is dilligently updated to 8 and xRmin to 168.

Then when we come to draw this scanline, consider a pixel in the middle of the section we need to fill. We will end up calling Coverage() passing in an x value of, say, 80. We then do:

int xr = x + 16 - 1; // equals 95
if (x>xLmax && xr<xRmin)
    return MAX_AREA;

The condition will be met and we will return MAX_AREA for this pixel. But we should not have done because it is only half covered by the square.

The problem is that we only consider the x coverage and assume that the y coverage is always full. This isn't a valid assumption for the first and last scanlines of the polygon.

This only a problem if the top edge of the polygon is exactly horizontal. If it isn't, xLmax and xRmin will end up set such that the conditional above does not evaluate to true.

I think this can be fixed by removing the _if (...) return MAXAREA; As a result, the slow path will be taken which will generate the correct result. A faster solution is more complicated and would require treating the first and last scanlines as special cases (I think).

erich666 commented 4 years ago

Thanks for reporting this, and a possible fix. I'll have to give this a careful look sometime soon. I just tried tracking down the original author, Jack C. Morrison of Evergreen, Colorado - not having much luck there so far.

abainbridge commented 4 years ago

Haha. I hope people don't try to track me down 30 years after I wrote some code. Although hopefully he'd be pleased that people still find his code interesting after all these years.

I think there's another related bug on line 109. There's a return statement with the comment "null polygon". We end up there when a horizontal edge is found at the bottom of the poly. Taking that return causes the function not to render the bottom scanline at all. I'm a bit less confident about this one. The problem is that my actual code has diverged somewhat from this version. When I find a bug in my version, I have to do quite a lot of thinking to convince myself that the same problem is present here.

I think by the time you add proper handling of the horizontal edges case, the code needs to change quite a bit, unfortunately. Maybe I'm missing something.

Let me know if I can provide any more info. I can share my code if you want, which includes a test harness. I should have my fixes done in the next day or so. Then you can find the bugs in mine!

erich666 commented 3 years ago

I'm (finally) trying to catch up with various Graphics Gems updates. I never did get an answer from Jack Morrison. I've commented out the two lines you recommended in your first post and pointed to this page for more information. I didn't touch the other potential problem you noted (yet), since you weren't confident at that point.

Feeling more confident now? I'm happy to trust that your code works; folding it in, and the test harness, would be good. So, please send them on, erich@acm.org

abainbridge commented 3 years ago

Hi Eric. I sent a few versions via email but haven't got a reply yet. So I'll attach two zips here in case my emails aren't getting through.

aa_poly_test5.zip includes the Graphics Gems polygon rendering code as well as my version. The test renders a shape using both, side-by-side so you can see the errors in the Graphics Gems version. Use the cursor keys and +/- keys next to backspace to control the position and size of the shape. As you move the shape, you'll see some rows of pixels go missing from the Graphics Gems version's output.

aa_poly_test6.zip has the fixes for the second bug. These are back ported from my version, making the minimal changes to the original Graphics Gems code. This isn't well tested. I guess the changes mainly serve to show more precisely what I think the bug(s) are.

aa_poly_test5.zip aa_poly_test6.zip

erich666 commented 1 year ago

Hi, I'm still alive. I entirely forgot about this one, in part because Github stopped sending me notifications some time ago. I've finally straightened that out and am going through issues in various repos. I'd like to close this one, but would also love to try your test programs out. They sound cool, actually - interactive! I'd like to walk through your aa_poly_test6 fixes, too, though I see those are fairly serious. Hmmm. What are your thoughts?

I tried building on Visual Studio 2022 (Community edition) and got a bunch of problems. The main ones I see after attempting to fix them myself are:

1>aa_poly_test.obj : error LNK2019: unresolved external symbol CreateWin referenced in function WinMain 1> Hint on symbols that are defined and could potentially match: 1> "int __cdecl CreateWin(int,int,wchar_t const *)" (?CreateWin@@YAHHHPEB_W@Z) 1>MSVCRTD.lib(exe_main.obj) : error LNK2019: unresolved external symbol main referenced in function "int __cdecl invoke_main(void)" (?invoke_main@@YAHXZ) 1>C:\Users\Eric\Downloads\aa_poly_test5\Project1\x64\Debug\Project1.exe : fatal error LNK1120: 2 unresolved externals

I attempted a fix of the CreateWin call to "int CreateWin(int width, int height, wchar_t const * winName)" and passed in a L"" string, but alas, no go. I'm the opposite of a Windows programming expert, so don't know what CreateWin now really wants to be. I expect you're linking to some library that provides it, but I'm not sure where to go. I'm more used to calls like

hWnd = CreateWindow(szWindowClass, szTitle, WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, 0, x, y, NULL, NULL, hInstance, NULL);

Here's the whole list of warnings and errors for aa_poly_test5, without my attempts to fix:

Build started... 1>------ Build started: Project: Project1, Configuration: Debug x64 ------ 1>AAPolyScan.c 1>aa_poly_test.c 1>C:\Users\Eric\Downloads\aa_poly_test5\aa_poly_test.c(25,20): warning C4244: 'return': conversion from 'double' to 'int', possible loss of data 1>C:\Users\Eric\Downloads\aa_poly_test5\aa_poly_test.c(148,51): warning C4244: 'function': conversion from 'double' to 'float', possible loss of data 1>Generating Code... 1>df_bitmap.cpp 1>df_common.cpp 1>C:\Users\Eric\Downloads\aa_poly_test5\df_common.cpp(19,9): error C2664: 'int MessageBoxW(HWND,LPCWSTR,LPCWSTR,UINT)': cannot convert argument 2 from 'char [1024]' to 'LPCWSTR' 1>C:\Users\Eric\Downloads\aa_poly_test5\df_common.cpp(19,26): message : Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or parenthesized function-style cast 1>C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um\winuser.h(9161,1): message : see declaration of 'MessageBoxW' 1>C:\Users\Eric\Downloads\aa_poly_test5\df_common.cpp(19,9): message : while trying to match the argument list '(int, char [1024], const char [14], long)' 1>C:\Users\Eric\Downloads\aa_poly_test5\df_common.cpp(18,3): error C4996: 'vsprintf': This function or variable may be unsafe. Consider using vsprintf_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details. 1>df_polygon_aa.cpp 1>df_window.cpp 1>C:\Users\Eric\Downloads\aa_poly_test5\df_window.cpp(190,51): warning C4244: 'argument': conversion from 'LPARAM' to 'int', possible loss of data 1>C:\Users\Eric\Downloads\aa_poly_test5\df_window.cpp(190,43): warning C4244: 'argument': conversion from 'WPARAM' to 'unsigned int', possible loss of data 1>C:\Users\Eric\Downloads\aa_poly_test5\df_window.cpp(217,24): error C2440: '=': cannot convert from 'const char *' to 'LPCWSTR' 1>C:\Users\Eric\Downloads\aa_poly_test5\df_window.cpp(217,24): message : Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or parenthesized function-style cast 1>C:\Users\Eric\Downloads\aa_poly_test5\df_window.cpp(237,19): error C2664: 'HMODULE LoadLibraryW(LPCWSTR)': cannot convert argument 1 from 'const char [11]' to 'LPCWSTR' 1>C:\Users\Eric\Downloads\aa_poly_test5\df_window.cpp(237,31): message : Types pointed to are unrelated; conversion requires reinterpret_cast, C-style cast or parenthesized function-style cast 1>C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\um\libloaderapi.h(665,1): message : see declaration of 'LoadLibraryW' 1>C:\Users\Eric\Downloads\aa_poly_test5\df_window.cpp(237,19): message : while trying to match the argument list '(const char [11])' 1>Generating Code... 1>Done building project "Project1.vcxproj" -- FAILED. ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ========== ========== Elapsed 00:02.207 ==========

At my current pace we'll wrap up this issue by the end of the decade (note I didn't say which decade). Honestly, I'd like to close this one up as we can. Having a clean polygon rasterizer that works properly would be good.

baifen commented 1 year ago

这是来自QQ邮箱的假期自动回复邮件。您好,我最近正在休假中,无法亲自回复您的邮件。我将在假期结束后,尽快给您回复。

abainbridge commented 1 year ago

Glad to see you're still alive :-)

I can't remember much of what was going on here. Give me a few days to re-familiarize myself.

erich666 commented 1 year ago

You, too! ;) Thanks for taking another look.

abainbridge commented 1 year ago

OK, I think I remember what was going on. I've just installed VS 2022 to reproduce the build errors you're seeing. (Gosh VS 2022 is horribly large and slow). My guess is that you didn't use the project file I supplied. You ought to be able to tell VS to open build/aa_poly_test.vcxproj from aa_poly_test5.zip.

If that's not working, here's what I think is causing the errors you're seeing:

1>C:\Users\Eric\Downloads\aa_poly_test5\df_common.cpp(19,9): error C2664: 'int MessageBoxW(HWND,LPCWSTR,LPCWSTR,UINT)': cannot convert argument 2 from 'char [1024]' to 'LPCWSTR'

This one is because the project's character set is incorrectly configured. If you go to Project Properties->Advanced->Character Set, it should be "Not Set". I think yours is probably set to "Use Multi-byte Character Set", which is probably the default for a new project. This means that you build against the version of the Win32 API where all character types are some weird wide type. Changing it to "Not Set" means everything is based on "char".

1>C:\Users\Eric\Downloads\aa_poly_test5\df_common.cpp(18,3): error C4996: 'vsprintf': This function or variable may be unsafe. Consider using vsprintf_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.

This is where MS decided to not support the C standard. You can fix it in Project Properties->C/C++->Preprocessor->Preprocessor Definitions. Click "Edit" on the dropdown, and add "_CRT_SECURE_NO_WARNINGS" on a new line in the editable text box.

The other two errors are also caused by the character set problem.

erich666 commented 1 year ago

D'oh, thanks for the thorough debug tips. "You goofball, you didn't notice the .vcxproj in the build directory" is all I needed, it turns out.

Looking at the two code zips you included, I added your "aa_poly_test5" fixes in, three years ago: https://github.com/erich666/GraphicsGems/commit/15dce6b2815e2a772ed2aea5dd8719394c2f69e4

The "aa_poly_test6" code is considerably different than what's currently in the repo, then, and like you say it's for illustrative purposes. The problem is, I'm not seeing any real difference between the two <'s displayed when running your sample. Should I?

Could you perhaps look at the code (yours, mostly) in the repo currently and see what, if any, fixes are needed from your test6 code? If you have any specific "I'd put this fix in, it's well tested" fix you want to suggest, I'm game. I appreciate your caution in giving untested changes, though. If there's no proposed fix (or confidence that it's a solid fix), that's fine, I'll leave the bug open. A summary of "this is where we are right now" would be hugely helpful, and I'll then simply point at this summary from the code itself, for some future person to pursue.

abainbridge commented 1 year ago

D'oh, thanks for the thorough debug tips

:-)

I added your "aa_poly_test5" fixes in, three years ago

Ah, right. I had forgotten that.

The problem is, I'm not seeing any real difference between the two <'s displayed when running your sample. Should I?

Yes you should. Here's the important part of the output when the test is first run. The upper < is rendered by my code and the lower one by the graphics gems code (which includes fix https://github.com/erich666/GraphicsGems/commit/15dce6b2815e2a772ed2aea5dd8719394c2f69e4). The lower < looks like it is missing its bottom row of pixels. image

If you press the Down Arrow key about 8 times, you can make the situation worse as the missing row of pixels becomes darker in the correct version. image

Press Down Arrow a few more times and you should see this: image The < is formed from two polygons and at this point the bottom row of the top polygon is missing its bottom row of pixels. The vertex coordinates passed in to the render function are correct, ie there's no gap between the two polygons.

abainbridge commented 1 year ago

Could you perhaps look at the code (yours, mostly) in the repo currently and see what, if any, fixes are needed from your test6 code? If you have any specific "I'd put this fix in, it's well tested" fix you want to suggest, I'm game. I appreciate your caution in giving untested changes, though. If there's no proposed fix (or confidence that it's a solid fix), that's fine, I'll leave the bug open. A summary of "this is where we are right now" would be hugely helpful, and I'll then simply point at this summary from the code itself, for some future person to pursue.

I'll try.

erich666 commented 1 year ago

I'll repeat my comment:

The "aa_poly_test6" code is considerably different than what's currently in the repo, then, and like you say it's for illustrative purposes. The problem is, I'm not seeing any real difference between the two <'s displayed when running your sample. Should I?

I can see the differences (easily) with aa_poly_test5, but test6 doesn't show me anything different.

abainbridge commented 1 year ago

Sorry I misread your comment. Let me try again...

No, you shouldn't see a difference between the <'s in test6. It just shows that my changes were sufficient to fix the remaining problems I was complaining about.

I think all my changes in aa_poly_test6 should be applied to the GGs repo - over the last three years of using my own similar code, I haven't found any problems. However, I'd still like to do three things:

  1. Review my own changes - 3 years should give me a fresh perspective.
  2. Improve the test to include more kinds of polygons.
  3. Add a performance test to see if my changes cause a significant slow down.
abainbridge commented 1 year ago

Sorry for the delay, I've been ill for a bit.

I have discovered a bug in my test6/AApolyScan.c that isn't in the test5 version. If you pass in a polygon where all the vertices have a negative y coordinate, then it produces a corrupt output.

It is really quite fiddly. More thought needed.

abainbridge commented 1 year ago

I've now got some changes to the GGs code that are less extensive than my previous attempt (aa_poly_test6) and also don't introduce the new bug that my previous attempt did.

I've put a lot more effort into testing now. I now realize that the Gem doesn't fully define what kinds of polygons are supported. It says, "It is assumed that the polygon to be converted is convex, and that the vertices are consistently ordered (for example, clockwise)". Problems with this:

  1. There are many question about the definition of convex: a) what if two vertices are at the exact same point, b) what if all vertices are at the exact same point, c) what if two edges are co-linear (Wikipedia says polygons with no co-linear edges are "strictly convex"). I think we should support correct rendering of all these cases. They all tend to happen as you scale a polygon down from some higher resolution source and quantization errors result in merging vertices that were previously distinct.
  2. "consistently ordered (for example clockwise)" is too weak a statement. The GGs code requires that the winding order is clockwise if you define your +ve y-axis to point up, or anti-clockwise if you define it to point down. It might be nice to say this in the text.
  3. What should happen if you pass in a polygon that doesn't meet the requirements? It would be nice if we guaranteed to render nothing, not crash, and not hang. But doing so requires adding a bunch of geometry calculations at the start of the function so we can early exit if a bad poly is detected. The optimal code for this is not obvious. The only correct code on Stackoverflow is horrendously slow because it involves calls to atan2(). https://stackoverflow.com/questions/471962/how-do-i-efficiently-determine-if-a-polygon-is-convex-non-convex-or-complex/45372025#45372025/. I think I have a better solution, but I'm sure this is a well studied problem. I'd like to receive the received wisdom :-)
  4. The GGs code doesn't actually require the poly to be convex. It works fine for a polygon like the one below. I think the requirement is that every horizontal line through the poly intersects two edges. But I haven't had the energy to think through the corner cases of vertices on top of each other etc. Again it'd be nice to say this in the text if we're confident. image

I'll keep going with my testing for now and report back soon.

erich666 commented 1 year ago

Wow, good analysis. And, right, handling the various corner cases looks like something they didn't try to handle. If you want to take a crack at it, great. I might just make it a separate file that's included in the repo, so that the original isn't changed too much. And, yeah, avoid atan2() - that's sad.

Right, typically scanline algorithms require that there is a single left and right edge, like you mention, and that they don't cross, etc. So concave polygons are fine, like you show, if oriented properly.

Graphics Gems has code for testing if a polygon is convex or not, see https://www.realtimerendering.com/resources/GraphicsGems/gemsiv/convex_test/ - not sure it's a help, but it's the one I know. Send me an email (I'm at erich@acm.org) and I can send on the article itself, if you don't have Graphics Gems IV handy. It's a quite robust and thorough solution (I remember offering my own algorithm on USENET during this discussion, but mine - and many others - failed certain robustness tests).

abainbridge commented 1 year ago

Graphics Gems has code for testing if a polygon is convex or not

Hah. Thanks for the link. I have GG IV. I'm not sure why I didn't think to look there. It's a great source and covers the various types of degenerate polygon well. However, I've come to the conclusion that adding an "is convex" test to the start of drawPolygon() is a bad idea. It was useful though - I needed a convexity test function so I could fuzz test with random polygons and filter out all the non-convex ones.

Moving on to my next proposed bug fix. The reprocase is almost any polygon. The bug is that the last scanline of the poly isn't rendered. On Aug 26, 2020, I said that the bug happened when the bottom scanline was horizontal. That was misleading - the bug is more general than that. I'll illustrate with a diamond shape.

The wrong logic hits when we're processing the final subpixel row in the main loop of drawPolygon(). Here's a picture showing which vertices vLeft, vNextLeft, vRight and vNextRight point to when we start the final iteration of the loop. The current subpixel row is highlighted in red. image

The first code in the loop is while (y == VnextLeft->y). The condition is true, so we move vNextLeft on to vertex 3 - ie the left side we're walking is about to start going backwards up the right side of the poly. The next statements are, if (VnextLeft == Vright) return. The condition is true, so we exit the function. We should not exit because this scanline has a bunch of partially covered pixels in it that we haven't rendered yet. Instead we should break from this while loop and then do the "done, mark uncovered part of last scanline" code below, which already includes a return.

In all my testing, that "done, mark uncovered part of last scanline" code block is never executed. I think I've convinced myself the code is unreachable because of the if (VnextLeft == Vright) return above. Once I replaced the return with a break it is called but renders the last scanline one pixel too low. That is because the for-loop setting all the sub-pixel extents to -1 had the side-effect of incrementing y. So we need to add a -1 in the renderScanline() call. Because of the off-by-one bug here, I'm even more confident this code was unreachable in the original GG version.

One more change is needed. The if (VnextLeft == Vright) return that I removed had a comments saying all y's same? and (null) polygon. With it removed, if we pass in a poly with the same y value for all vertices, the function no longer terminates. The simplest fix for this I can see is to store y before the main loop (ie the minimum y of the poly), and change if (VnextLeft == Vright) return to:

            if (VnextLeft == Vright) {
                if (y == minY) /* all y's same?  */
                    return; /* (null polygon) */
                break;
            }

I guess this is a new issue so we should close this one and I'll create a new one.