While I know Vulkan is the new hotness, I have been really pleased with how the OpenGL renderers have come together. I knew going into this I'd be using "old tech" and I was concerned there would be things that I wouldn't be able to do. Sure, I can't use RT hardware from GL, but that's not even something I care about for the type of game I'd want to make.
When I started this, I figured by the time I got to this point (i.e. contemplating a v2 codebase) I'd be ready to go over to VK, but honestly I don't think it's really necessary, and it would only hinder my progress at this point. I have been studying Vulkan a lot while working on this project, and I do have a much better understanding of it now than I did before, especially on the compute side of things. But now, I feel like it would be better to get everything polished first, and then move to VK if I ever feel it is necessary.
I will continue to use the more modern APIs in OpenGL and also keep my OpenCL kernels structured so they have correctly defined work group and memory sizes, since this would make any eventual porting a lot easier. It is also just good practice and makes it more explicit in code what is happening on the GPU side of things.
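To illustrate what I mean, a dispatch helper along these lines (just a sketch assuming LWJGL-style bindings; the helper itself is hypothetical) keeps the work group size explicit at every call site instead of letting the driver pick one:

```java
import org.lwjgl.PointerBuffer;
import org.lwjgl.system.MemoryStack;

import static org.lwjgl.opencl.CL10.*;

// Sketch only: always dispatch with an explicit local size, so the work
// group shape is stated in code rather than left to the driver.
static void dispatch(long queue, long kernel, long globalSize, long localSize)
{
    try (MemoryStack stack = MemoryStack.stackPush())
    {
        PointerBuffer global = stack.mallocPointer(1).put(0, globalSize);
        PointerBuffer local  = stack.mallocPointer(1).put(0, localSize);
        int err = clEnqueueNDRangeKernel(queue, kernel, 1, null, global, local, null, null);
        if (err != CL_SUCCESS) throw new IllegalStateException("OpenCL error: " + err);
    }
}
```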
In a new codebase, I may tweak a few things, perhaps group certain classes and interfaces a bit differently, but overall I like the way the classes have come together. After iterating now for several months, I feel that the vast majority of objects and concepts that should be together, are together, and things that should be separate have clear boundaries. I might want to do a little bit of consolidation on some GPU-related stuff, and well.. get rid of some "global" classes (to make testing easier), but I don't think any new project would look drastically different from the current one.
Again, there's some stuff I would change here, but I am really happy with the level of performance I've been able to reach for my first real attempt at a physics simulation. I won't say too much else here, because there's some isolated things I do want to change, but I have gotten further with this than I had hoped and I am really looking forward to iterating and optimizing it more in the future.
I was a bit reluctant to even add this, but once I did I found it was quite handy, and I was actually mad I didn't add it from the beginning. There's a lot of stuff I implemented with tight coupling that would have been much better as an event. Since it's async, there are of course some things that wouldn't really work with it, but a lot of tasks don't need to be synchronous, and I don't even know how I could have made a UI work without it.
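For reference, the core of a system like this is pretty small; a minimal sketch (class and method names are mine, not the actual project's) could look like:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.*;
import java.util.function.Consumer;

// A minimal sketch of an async event bus; handlers run off the emitting
// thread, so emitters never block on slow subscribers.
public class EventBus
{
    private final Map<String, List<Consumer<Object>>> handlers = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    public void subscribe(String type, Consumer<Object> handler)
    {
        handlers.computeIfAbsent(type, k -> new CopyOnWriteArrayList<>()).add(handler);
    }

    public void emit(String type, Object payload)
    {
        for (var handler : handlers.getOrDefault(type, List.of()))
        {
            pool.submit(() -> handler.accept(payload));
        }
    }
}
```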
I dragged my heels getting this in place too, and I should have done it sooner. It's very basic and deserves a lot more polish than it got, but even as-is it's super helpful, and I would want to expand this out quite a bit in a new project. This is also a perfect example of a component that should have been tied into the event system from day one; it already uses its own event system to relay information to the browser, and it would be a perfect fit to connect these two systems together instead of having them as separate features.
I am calling this out separately because I was really happy with how this one came out. I didn't flesh it out into a fully functional HUD, but getting all the text to render in a single render call using indirect rendering, and wiring it up using the event system to ensure it only re-calculated when needed made this feel really nice. This was the last renderer I wrote and I think it shows, as it feels like the most solid and tightly implemented of them all. I absolutely would need more features to make it great, but the core just feels really solid and performs incredibly well.
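For context, the single-call approach boils down to filling an indirect buffer with one command per glyph and issuing one draw. A simplified sketch (the method and buffer names are illustrative):

```java
import static org.lwjgl.opengl.GL11.*;
import static org.lwjgl.opengl.GL15.*;
import static org.lwjgl.opengl.GL40.GL_DRAW_INDIRECT_BUFFER;
import static org.lwjgl.opengl.GL43.glMultiDrawElementsIndirect;

// Sketch only: one command per glyph quad, all text in a single draw call.
// Per the GL spec, each command is: count, instanceCount, firstIndex,
// baseVertex, baseInstance.
static void drawText(int indirectBuffer, int glyphCount)
{
    int[] commands = new int[glyphCount * 5];
    for (int i = 0; i < glyphCount; i++)
    {
        int base = i * 5;
        commands[base]     = 6; // six indices per glyph quad
        commands[base + 1] = 1; // one instance per command
        commands[base + 2] = 0; // all glyphs share the same quad indices
        commands[base + 3] = 0; // and the same quad vertices
        commands[base + 4] = i; // baseInstance selects per-glyph data
    }
    glBindBuffer(GL_DRAW_INDIRECT_BUFFER, indirectBuffer);
    glBufferData(GL_DRAW_INDIRECT_BUFFER, commands, GL_DYNAMIC_DRAW);
    glMultiDrawElementsIndirect(GL_TRIANGLES, GL_UNSIGNED_INT, 0, glyphCount, 0);
}
```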
While I did find the renderers quite easy to build, I think in a new project I would consolidate them into a larger "Render System". The physics system has a number of isolated, but related concepts and they are a lot easier to tweak, optimize, and re-factor because they are grouped together. In a new project, I would apply this same design to the rendering process. One thing I can think of right off the bat is I would be able to do some of the "setup" stuff across the different types (for example, models, liquids, debug renderers) in a more parallelized manner. There are certain aspects that do need to be serial, but some that wouldn't have to be and I could improve performance a bit by leveraging that. It may also be possible to add some internal synchronization so the serial bits could at least be interleaved to more efficiently utilize the hardware.
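A rough sketch of the parallel setup idea (the task list and method are hypothetical); the CPU-side preparation runs concurrently while the context-bound GL calls stay serial:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Sketch only: run the independent CPU-side setup for each renderer type
// concurrently; the GL object creation that must be serial happens after.
static void setupRenderers(List<Runnable> cpuSideSetup)
{
    CompletableFuture.allOf(cpuSideSetup.stream()
            .map(CompletableFuture::runAsync)
            .toArray(CompletableFuture[]::new))
        .join();
    // serial, context-bound GL work would follow here
}
```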
Some number of util classes are always going to be needed for any project, and they do have their place. It did seem like a sensible approach to split out CL/GL utils since they are for different purposes. However, looking back, I really am using OpenCL and OpenGL so tightly together that they are effectively just different parts of the same API. Also, while the GL stuff doesn't have this issue, the CL utility code is kind of spread out across the actual CL utils class and the GPGPU class, which I often find cumbersome.
In a new project, I would consolidate GL/CL into a single set of utils and classes. Not only would this better reflect how I actually use them, it would also align much better with Vulkan. While I am still in no hurry to actually port things to VK, it is a much more sensible way to think about things IMHO, where you have your "GPU stuff" all in one place. And if/when I finally do want to move to VK, it will create very obvious "dividing lines", which makes porting a LOT easier in my experience.
Also, I was not as rigorous in doing error checks in all of the APIs as I should have been. I did start going back and adding error code checks later in the project, but I should have done this earlier and added specific `check()` methods and used them on every call. This would make the actual error checking code more centralized and also just reduce the amount of code, as I wouldn't have to repeat the error check logic inside every utility method.
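Something along these lines is what I have in mind (a sketch, assuming the same LWJGL-style bindings used elsewhere):

```java
import static org.lwjgl.opencl.CL10.CL_SUCCESS;

// Sketch of the centralized check described above: one place that turns
// an OpenCL error code into an exception.
static int check(int errorCode)
{
    if (errorCode != CL_SUCCESS)
    {
        throw new IllegalStateException("OpenCL call failed: " + errorCode);
    }
    return errorCode;
}

// usage at every call site, e.g.:
// check(clSetKernelArg1i(kernel, 0, some_value));
```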
Overall I actually am super happy with how far I've gotten with the animation process, especially blending and layers. I certainly have some rough edges there, but even that isn't a huge concern to me. However, I took some shortcuts in the way I handled timing, in particular I made no provision for non-looping or pausable animations, both of which would come in handy.
Because every animation is assumed to loop, I had to hard-code some animation times into the state machine. It is nice to be able to override times like this if needed, but it would be even nicer to actually let those animations play, end, and automatically transition to whatever the next state should be. This would make the jump wind-up and landing animations easier to work with.
While I didn't quite get to implementing the ability to walk up and down stairs, doing so would be really difficult if your character's feet snapped into an idle pose when you stopped moving while ascending/descending. I would absolutely want some way to let an animation remain paused at some frame; for example, when on stairs, if the player stops pressing the left or right movement key, the animation could cycle until both feet are touching the ground and then pause until they move again.
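Sketching it out, the playback modes and the advance logic might look something like this (all names illustrative):

```java
// Sketch of the playback modes discussed above.
enum PlaybackMode { LOOP, ONCE, HOLD }

class AnimationTimer
{
    // advances the animation clock according to the playback mode
    static float advance(float time, float delta, float duration,
                         PlaybackMode mode, boolean paused)
    {
        if (paused) return time;            // HOLD: frozen on the current frame
        float next = time + delta;
        return mode == PlaybackMode.LOOP
            ? next % duration               // wrap back to the start
            : Math.min(next, duration);     // non-looping: clamp at the end
    }
}
```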
Also, I moved all the logic to the CPU side because I wanted it to be easier to tweak, and for the state transitions this did help me iterate. But once I had the layout figured out for layering, it provided no real value and had a small but noticeable impact on overall performance, due to how much data has to be read from the GPU each frame. Pushing data from the CPU to the GPU appears to be more performant overall, and because the movement data can be packed into flags, I can really compress what data I do need to send. In a new project, I would move back to doing things on the GPU and just write a kernel that is used only for the player. This would return to a system where lookup tables are generated, which can be a bit more difficult to reason about since they are generated, but I have some ideas to help make that less of a concern too...
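The flag packing part is simple enough to sketch (the flag names are made up for illustration):

```java
// Sketch of packing per-frame movement state into a single int of bit
// flags before pushing it to the GPU.
class MovementFlags
{
    static final int LEFT     = 1;
    static final int RIGHT    = 1 << 1;
    static final int JUMPING  = 1 << 2;
    static final int GROUNDED = 1 << 3;

    static int pack(boolean left, boolean right, boolean jumping, boolean grounded)
    {
        int flags = 0;
        if (left)     flags |= LEFT;
        if (right)    flags |= RIGHT;
        if (jumping)  flags |= JUMPING;
        if (grounded) flags |= GROUNDED;
        return flags; // one int upload replaces a full GPU read-back
    }
}
```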
I was really keen on generating as much GPU code as possible, and I absolutely still think this is a good idea. However, I think the way I do it now is less than ideal, primarily because the code being generated is "invisible". So for example, I generate a bunch of `#define`s to make Java-side `enum`s accessible with the same names on both sides of the CPU/GPU boundary. But since the code is generated at runtime, the kernel code itself has tons of errors in my IDE, because I'm referencing constants that aren't really there until runtime.
I think a better way to do this would be to have a build step that generated the code as actual files, and then use that content directly in the kernels where needed. This would mean that some kernel programs would have duplicate data that other programs have, but in reality this is exactly how it works now, it's just hidden from view. I think it should be possible to wire up some kind of special comment start/end blocks or some similar process so that existing files can be scanned and updated in a separate code-gen step, which is only run when needed, and then kernel files can be written and maintained in a much less janky way.
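The core of that code-gen step could be as simple as something like this (the marker strings and method are hypothetical, and this assumes both markers exist in the file):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the marker-based code-gen step: scan a kernel file for
// begin/end markers and replace the region between them with freshly
// generated constants.
static void updateGeneratedRegion(Path kernelFile, String generated) throws IOException
{
    String begin  = "// GEN:BEGIN constants";
    String end    = "// GEN:END constants";
    String source = Files.readString(kernelFile);
    int start = source.indexOf(begin) + begin.length();
    int stop  = source.indexOf(end);
    Files.writeString(kernelFile,
        source.substring(0, start) + "\n" + generated + "\n" + source.substring(stop));
}
```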
I may also want to re-evaluate how kernels are laid out in files. I initially separated several functions into separate files, which I do think makes sense logically, especially for re-use purposes, but it presents the same problem as the "invisible" `#define`s for enum constants. I should just use a similar code-gen step to drop these directly into kernels. In addition to that, while I actually really like being able to implement multiple `__kernel`s in a single "program", this is another thing that doesn't really conform to how Vulkan works, and it may be prudent to use a single program/kernel layout. If I actually start generating/inserting code into the program files, this will not be so bad, as I will still have the ability to group related functions together. I can also create a different folder structure so related kernels are in the same folder, which would make maintenance a bit easier too.
While I mentioned this component as a "good", I spent almost no time on the UI itself; it's super basic and the data could be presented in a much more consumable way. I have made way more complex web UIs, and I have a decent chunk of already-built components I could use to make it much nicer looking. In a new project, I should really put in the time to make it more polished and usable. I never even got to the "editor" parts of it, which really would have come in handy. I should be able to treat it the same way many folks treat ImGui, but purpose-built for my uses and without the extra dependency bloat.
In addition to the lack of polish, I would also want a better setup for emitting data to the debugger event stream. The current layout, which requires adding the event name into a JS file, is really not necessary. I could very easily send events as JSON, have them get logged, and just add appropriate components to handle logic as I see fit, with a basic fallback component for generic display that is "good enough" for anything that doesn't have a specific component yet. Then I could define a handful of different types of events, which could be routed to appropriate tabs.
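The event shape could be as simple as something like this (a sketch; the field names are illustrative, and real code would use a proper JSON library rather than hand-rolled serialization):

```java
import java.util.Map;

// Sketch of a self-describing debug event; the browser side routes on
// "type" and falls back to a generic display component when no specific
// one exists.
record DebugEvent(String type, String tab, Map<String, String> data)
{
    String toJson()
    {
        var fields = new StringBuilder();
        data.forEach((k, v) -> fields.append("\"%s\":\"%s\",".formatted(k, v)));
        if (fields.length() > 0) fields.setLength(fields.length() - 1); // trim comma
        return "{\"type\":\"%s\",\"tab\":\"%s\",\"data\":{%s}}"
            .formatted(type, tab, fields);
    }
}
```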
While I was really happy with how the text rendering implementation came out, I do think I should have spent a little more time writing some basic UI components to implement the HUD. I should have added things like color for text, some form of icon display, and simple tabs. Long term, I know I will end up with a decent amount of information to present to the player (though I want to be careful to only show what is really necessary), and that will require a more fleshed-out system. In a new project, I should make sure to plan this out a bit better and try to work in display of data earlier in development.
Also, I need to make sure there's a way to interact with HUD elements via the mouse and keyboard. I will at least need some kind of search box and/or filter because the number of elements, compounds, etc. is going to be quite large for the game design goals I am going for. I don't need to have this fully formed from the jump, but I probably should iterate on this alongside other game features so they are cohesive.
I iterated quite a bit on the buffer objects, and overall I am quite happy with the grouping classes and the resizable wrappers. However, I should have put in the time to integrate these with the standard JDK `Cleaner` mechanism, so they would get automatically cleaned up. This would make these objects much less fiddly and have them work like other Java objects do. I am sure there are probably a few places in the code right now where I am forgetting to clean up a buffer or two. It generally doesn't matter because they get destroyed completely when the program exits, but it will be required for testing to work properly, and I may want to allow for some amount of spin-up/spin-down of buffers at runtime in the future, especially for things like world transitions.
When adding this, I should also extend it to GPU kernel objects that need to be released, like programs and kernels. Basically anything that wraps a raw pointer and needs to be deleted after use should be wired up to be automatically cleaned.
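A sketch of what that wiring might look like for a GL buffer wrapper (the class is hypothetical, and a real version would route the delete to the GL context thread rather than calling it from the cleaner thread):

```java
import java.lang.ref.Cleaner;

import static org.lwjgl.opengl.GL15.glDeleteBuffers;

// Sketch of the Cleaner integration: the wrapper releases its raw handle
// eagerly via destroy(), with the Cleaner acting as a safety net.
public class ManagedBuffer
{
    private static final Cleaner CLEANER = Cleaner.create();

    private final int handle;
    private final Cleaner.Cleanable cleanable;

    public ManagedBuffer(int handle)
    {
        this.handle = handle;
        int h = handle; // the cleanup action must not capture `this`
        this.cleanable = CLEANER.register(this, () -> glDeleteBuffers(h));
    }

    public int handle() { return handle; }

    public void destroy()
    {
        cleanable.clean(); // eager release; safe to call more than once
    }
}
```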
In general, I am quite happy with the sector loading/unloading process. While I did have a few hiccups getting it in place, after some thorough play testing, I was able to really nail down the process and make it work. However, one thing I could never quite get working was allowing sectors to load one at a time. Because of this, when sectors would load/unload I had to process them all in one fell swoop. For unloading, this was actually not a big deal and the process didn't really cause any issues. However, for loading, it led to a slight pause when scrolling the screen left or right. I would have really liked it if I could have sectors "stream" in one by one. Even if this would make it so the sectors on the edge of the screen could be seen loading if moving fast enough, I wouldn't mind. It would be worth it for the improved responsiveness.
I am not 100% sure this is actually going to be doable, to be honest, because there are weird corner cases that could arise, like if a sector is queued to load and then the player moves back the other direction, causing it to no longer need to be loaded. It will take a lot of testing, but if I can make it work I think it will be well worth it, especially for lower-end hardware like my laptop.
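The simplest version I can think of re-validates each queued sector right before it loads, which handles that corner case (a sketch; `SectorKey`, `isStillInRange()`, and `loadSector()` are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: stream at most one sector per frame, re-validating each queued
// sector before it loads to handle the "player turned back around" case.
class SectorStreamer
{
    final Deque<SectorKey> loadQueue = new ArrayDeque<>();

    void streamOneSector()
    {
        while (!loadQueue.isEmpty())
        {
            SectorKey next = loadQueue.poll();
            if (!isStillInRange(next)) continue; // stale: drop without loading
            loadSector(next);                    // at most one load per frame
            return;
        }
    }
}
```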
The model loader code is more or less fine, but I precluded being able to load boneless (i.e. non-animated) models, and I also had a very rigid requirement that models had a 1:1 mapping of named meshes to bones. This would seriously hamper flexibility in the future as it would prevent having more traditional model layouts where a single mesh is modified by multiple bones. It also kind of stupidly prevents using models for basic static objects, which is really quite limiting.
I have written almost no tests at all, and I know this is bad, and I should feel bad 😆. The few tests I did write were just to visually confirm the generated GPU code I made was right, but I really should have written at least some cursory tests for the kernels themselves. For the most part, if I made a huge error in a kernel, it just wouldn't compile, which took care of most of the goofs I made, but there were definitely a few cases where I passed incorrect data types and didn't even realize the mistake until way later, like when I expanded some buffers from `int` to `int2` or `int4`, or vice-versa. A few times I used only the `x` component of a vector, so I just kind of treated it as a scalar but was wasting space with extra data. But more importantly, I never had any safety net if I introduced a logic bug in a kernel by adding a new buffer or changing how an existing buffer was used. I certainly had a few bugs that would have been found much quicker if I had tests.
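Even a cursory kernel test would have caught most of these; something in this direction is what I have in mind (a sketch: `GPUTestContext` and `runKernel()` are hypothetical helpers, not existing project code):

```java
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertArrayEquals;

// Sketch of a cursory kernel test: run a kernel on a tiny known input
// and assert on the read-back result.
class KernelTests
{
    @Test
    void integrate_moves_point_by_velocity()
    {
        try (var gpu = GPUTestContext.create())
        {
            float[] position = { 0f, 0f };
            float[] velocity = { 1f, 2f };
            float[] result = gpu.runKernel("integrate", position, velocity, 1f /* dt */);
            assertArrayEquals(new float[] { 1f, 2f }, result, 0.0001f);
        }
    }
}
```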
Related to this effort, I should have made the GPGPU class non-global. I am not someone who dogmatically refuses to use globally accessible classes; I actually think they have their place. For example, I think it's quite silly to pass around a `Window` object to every single class that may need it. It's much easier to make that a singleton and just have very clear setup/teardown methods, so it can be used easily in testing where needed. But for pretty much everything else, it really should be the case that classes which need some functionality are passed a class that provides it. I have implemented systems like this in my professional career that work extremely well and are very easy to reason about and test, so I have no excuse not to do the same here.
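The shape of that is simple enough (a sketch; all names illustrative):

```java
// Sketch: constructor injection instead of a global GPGPU class; tests
// can pass a stub implementation.
interface GPU
{
    void dispatch(String kernelName, int workSize);
}

class PhysicsSimulation
{
    private final GPU gpu;

    PhysicsSimulation(GPU gpu) { this.gpu = gpu; } // injected, easy to stub

    void tick() { gpu.dispatch("integrate", 1024); }
}
```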
This one I only recently realized was a huge mistake, and not because of the grid concept itself, which is actually vital to the physics system; the way I tied it to the screen dimensions instead of the world dimensions was a HUGE oversight, and I should have thought of that much sooner. Trying to remedy it after all the code relies on it is basically a fool's errand. In a new project, I need to make sure the grid dimensions are defined within the same world space as the objects in the world, and scaled to fit them, not the other way around. This was probably the biggest fundamental implementation error I made in the whole project.
In a new codebase, I may also want to implement the grid as a fixed layout from the get-go, instead of moving it around every frame, if possible. The physics logic should translate positions to and from the grid origin from the initial implementation onward, so the floating-point imprecision problem is mitigated before it can even manifest.
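The key change is small: cell lookup should be a pure function of world coordinates and a fixed origin (a sketch with illustrative field names):

```java
// Sketch: the grid is defined in world space with a fixed origin, so cell
// lookup is independent of screen dimensions.
class SpatialGrid
{
    float originX, originY; // fixed world-space origin, not tied to the camera
    float cellSize;         // cell size in world units, scaled to the objects
    int columnCount;

    int cellIndex(float worldX, float worldY)
    {
        int col = (int) Math.floor((worldX - originX) / cellSize);
        int row = (int) Math.floor((worldY - originY) / cellSize);
        return row * columnCount + col;
    }
}
```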
I made a bad assumption that I could use circle hulls for water, polygon hulls for everything else, and that I could just figure out this whole "liquid sim thing" once I got the rest of the simulation to a good state. This turned out to be a lot more of a headache than I expected. I should have gone with my gut and had a separate particle system for water that could be affected by polygon hulls, but was distinct. This would have made it a lot easier in the long run, even though it is more complicated to implement up-front, since there would need to be two systems.
What I should have done was implement the fluid simulation first, make it look decent, and then layer the polygon system on top of it. I don't even need a crazy realistic simulation, just something as good as the quick JavaScript one I whipped up in an evening via a YouTube tutorial. If I keep track of water particles as points, I can easily have a "shadow" hull tied to each point, which would allow the water particles to interact with other hulls. Yes, it means liquids would require carrying around a little extra data that other hulls wouldn't need, but that's a stupidly small price to pay for the much better visuals a good liquid sim would provide.
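Concretely, it's only a tiny amount of extra bookkeeping per particle (a sketch; all names illustrative):

```java
// Sketch of the "shadow hull" idea: the particle itself is just a point
// with velocity; the small circle hull it references exists only so that
// polygons can push the particle around.
record WaterParticle(float x, float y, float vx, float vy, int shadowHullIndex) { }
```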
A couple of places, most notably the jump implementation, use a tick-based system where things occur after some number of ticks. For example, the player can jump and continue to accelerate upward for some number of physics ticks, until they run out of the "tick budget" and start to fall back down again. This kind of approach is not ideal, because it's much more difficult to reason about than simply having a time budget instead. The player should ascend for some period of time, after which they stop ascending and then descend until they hit the ground. The mechanism works exactly the same, except it uses a more sensible unit of measure. This also ensures that regardless of the simulation sub-step, the jump amount is exactly the same, since it would be adjusted by the delta-time value instead of the tick count.
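A sketch of the time-budget version (the names and acceleration value are illustrative):

```java
// Sketch of a time budget replacing the tick budget; scaling by delta time
// keeps the jump height identical regardless of sub-step count.
class JumpState
{
    static final float JUMP_ACCELERATION = 30f; // made-up value

    float jumpTimeRemaining; // seconds of upward acceleration left
    float velocityY;

    void applyJump(float dt)
    {
        if (jumpTimeRemaining > 0f)
        {
            float slice = Math.min(dt, jumpTimeRemaining);
            velocityY += JUMP_ACCELERATION * slice; // scaled by time, not ticks
            jumpTimeRemaining -= slice;
        }
    }
}
```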
This one is perhaps not super bad, but I'm putting it in this bucket because it feels a bit like a structural issue and might be worth a revisit in a new codebase. Specifically, the way constituent components of entities are laid out can be a bit annoying. An entity contains one or more hulls, and hulls contain one or more points and zero or more edges (circles have no edges). This creates a tree-like structure, with points having direct references back to their parent hull, hulls to the parent entity, etc. (and this is glossing right over bones, which may or may not be present too). I know from past experience, and from working with them on this project, that the tree paradigm, while nice in some respects, can make it more difficult to deal with data in some cases.
The biggest way this causes issues is with compaction and when moving entities into/out of memory. Because objects contain direct references to their parents, when buffer positions change, all those references need to change as well. However, if objects instead used relative offsets, with the parent entity as the "single source of truth", then only the entity object itself would need these updates. Of course, some object (in this idea, the entity) needs to have and maintain a direct reference, but it's a lot easier to do that on one single object than to cascade it to all children as well.
Instead of the current design, I could try making it so the entity itself actually "owns" all the constituent components. The entity would maintain hull, edge, and point tables. Hulls would then have a reference to their parent entity (which is already the case), and their point and edge tables, instead of being direct references as they are now, would be relative offsets into the relevant entity tables. The same would be true for points, which could have a reference to their parent entity and a relative index into the parent's hull table in order to find the hull they belong to.
In this design, the constituent objects do still need that one direct reference to the entity, so for compaction and world egress/ingress it would still need to be adjusted. But doing so would be significantly more straightforward than the current design, which requires a lot of work to ensure things are adjusted properly, and where the logic is somewhat obtuse and error-prone. This new design would require some extra data to be associated with objects, and of course requires more buffers passed to kernels and more steps to do a lookup, but it could be worth it for the removal of a lot of bug-prone logic.
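A rough sketch of that layout (the field names are purely illustrative):

```java
// Sketch of the entity-owned layout: the entity holds the absolute table
// positions; hulls and points store only offsets relative to them.
class Entity
{
    int hullTableOffset;  // absolute position in the shared hull buffer
    int pointTableOffset; // absolute position in the shared point buffer
    int edgeTableOffset;  // absolute position in the shared edge buffer
}

class Hull
{
    int entityIndex;      // the one direct reference compaction must fix up
    int pointOffset;      // relative to the entity's point table
    int pointCount;
    int edgeOffset;       // relative to the entity's edge table
    int edgeCount;
}
```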
Decided to do a little code inventory today, and I've started a new blank project, stubbed a few things, and copied over some of the "scaffolding" stuff. Just going to take a few days to test the waters a bit and see how it goes. All in all, while I have spent a good chunk of time in this codebase, it's honestly not a huge amount of code. But the mental break of a clean slate is nice.
Just a few days of tinkering and I'm already happier with the direction the new codebase is going. It's not drastically different, but with a clear head I can see all the rough edges so much more clearly, and it is honestly just refreshing to be able to move things over and correct the little things one at a time, which had kind of piled up over time.
So it's safe to say I am moving forward with this new code now. I will probably check in some minor things; as I've been "porting" over to the new project, there have been one or two really small things I have actually changed in the old one, just to make the copy/paste process go a little smoother, and to fix a bug or two. But for all intents and purposes, I am considering this prototype project "done". I will leave it as-is for reference, and probably update the readme to indicate that I am no longer working on it. Not that I expect anyone is really going to notice, lol.
I have learned a ton working on this, and I'm really looking forward to the next iteration!
Closing this out now; the new codebase is up and running, I've fixed the scaling issues, and I am working on pulling over the physics code now.
I am keeping the new code private, since I will start adding some actual artwork instead of the placeholders used in the project, and well.. I kind of want to start making this into something a little more "real".
It's been a great learning experience, maybe this repo will help someone out someday.
Over the past few months, I have really done a lot of work and figured out a lot of concepts I had been wanting to tackle for a while. At the same time, I have run into several issues that have been difficult to work around, and as it's been a year+ now that I've been learning about all this stuff, I certainly can see some mistakes that I've made.
Where I can, I've refactored to fix stuff and the overall layout of the project has gotten a lot better. But there's some fundamental issues that make it really difficult to refactor, and now realizing that I've been working at 100x scale on objects this whole time, it starts to look more and more like I'd need to rewrite huge chunks of code to fix issues. If that's really where I've gotten to, then I think it might be time to actually move on to "version 2" of this experiment, taking the lessons learned here and applying them in a fresh project.
From the beginning, I expected this would eventually be the route I'd go, so this isn't really a surprise. In fact, I've been feeling the itch to start fresh for a bit now, if for no other reason than to just get a clean slate. I will use this story to write down thoughts on what worked and what didn't (and anything in between), so I can have a single reference for any new project I might start. I guess I could consider this a post-mortem of sorts.
I will probably edit the following comments a bunch as I collect my thoughts on the different aspects of the codebase.