Closed Hermanoid closed 3 months ago
I would have appreciated it if you had consulted me first, since I am already working on a new solver here: https://github.com/OrderedSet86/flow2 I think we duplicated a lot of work. (You can message me at ".order" on Discord.) But I think we ultimately agree on three major points, which is really good:
I had some issues earlier when trying to include loops in an LP solver, which I detail here: https://github.com/OrderedSet86/flow2/blob/main/math.md The solution I am currently exploring is an explicit equation solver using sympy again, mostly because it gives a clearer closed-form solution using rationals. But a linear programming solution should be roughly equivalent.
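To illustrate what "closed form solution using rationals" buys you, here is a minimal sketch that solves a tiny balance system exactly. It uses the standard-library `fractions` module rather than sympy, and the two-recipe line, the matrix, and the `solve_exact` helper are all hypothetical; it is not the flow2 solver.

```python
from fractions import Fraction

def solve_exact(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination over rationals."""
    n = len(matrix)
    # Augmented matrix of Fractions, so no rounding ever happens.
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot becomes exactly 1.
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Hypothetical line; unknowns are runs per second of recipes A and B.
#   intermediate balance: 3*a - 2*b = 0
#   product target:             b = 1
rates = solve_exact([[3, -2], [0, 1]], [0, 1])
print(rates)  # recipe A runs exactly 2/3 times per second, recipe B once
```

The result is the exact fraction 2/3 rather than a rounded float, which is the property that makes an explicit rational solver free of numerical tolerances.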
For (1), I just added a new entry like in the image. Then while processing the YAML file I watched for v2_node_type or m to determine type. This is probably the best for backwards compatibility. I think there will be additional tooling required for these inputs because although there are fewer inputs, it is harder to come up with them a priori and the tool should guide you towards suspected required inputs. I have some ideas for this, like binary searching over subgraphs. Yes, we should continue using the project YAML file.
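A minimal sketch of that key check, assuming the project file has already been parsed into a list of dicts. Only the key names v2_node_type and m come from the comment above; the field values, the `classify_entry` helper, and the returned labels are hypothetical.

```python
def classify_entry(entry):
    """Return 'v2:<type>' for new-style entries and 'machine' for v1 recipe entries."""
    if "v2_node_type" in entry:
        # New-style entry: the key's value says which kind of node it is.
        return "v2:" + str(entry["v2_node_type"])
    if "m" in entry:
        # Existing v1 entries carry an `m` key, so old files keep working.
        return "machine"
    raise ValueError(f"cannot determine entry type: {entry!r}")

# Hypothetical parsed project file (what a YAML loader would hand back).
project = [
    {"m": "centrifuge", "tier": "HV"},
    {"v2_node_type": "source", "item": "chlorine"},
]
kinds = [classify_entry(e) for e in project]
print(kinds)
```

Checking for the new key first means an entry is only treated as a v1 machine when nothing marks it otherwise, which is one way to keep older project files loading unchanged.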
For (2) I think it can be ignored for now, but a more fundamental rework of readability is needed soon. The current edges overlap way too much and the chart often becomes unreadable. In particular, something that I think hurts readability a lot is same-rank horizontal dependencies, which you can see an example of in the above chart between Naq Stage 1 and Naq Stage 2. Rank can be manually specified when generating a graphviz chart, so I can write an algorithm that improves the layout (roughly speaking, by ranking on distance from the source node). I am also partway through adding the quantities directly to the nodes instead of floating by the edges:
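A rough sketch of the "distance from the source node" idea: breadth-first search from every node that has no incoming edge, then emit one Graphviz `rank=same` group per distance. The edge list is made up for illustration (only the two Naq stage names come from the chart discussed above), and both helper functions are hypothetical.

```python
from collections import deque

def ranks_by_distance(edges):
    """Map each reachable node to its BFS distance from the nearest source node."""
    children, has_parent, nodes = {}, set(), {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
        has_parent.add(dst)
        nodes[src] = nodes[dst] = None  # dict used as an ordered set
    sources = [n for n in nodes if n not in has_parent]
    rank = {n: 0 for n in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in rank:  # first visit is the shortest distance; also stops loops
                rank[child] = rank[node] + 1
                queue.append(child)
    return rank

def rank_groups(rank):
    """Render one DOT 'rank=same' subgraph statement per rank level."""
    groups = {}
    for node, level in rank.items():
        groups.setdefault(level, []).append(node)
    return ["{rank=same; " + " ".join(f'"{n}";' for n in sorted(groups[level])) + "}"
            for level in sorted(groups)]

# Hypothetical edges, including a loop from stage 2 back to stage 1.
edges = [("Ore", "Naq Stage 1"), ("Naq Stage 1", "Naq Stage 2"),
         ("Naq Stage 2", "Naq Stage 1"), ("Naq Stage 2", "Output")]
rank = ranks_by_distance(edges)
dot_lines = rank_groups(rank)
print("\n".join(dot_lines))
```

With this ranking the two Naq stages land on different ranks even though they feed each other. A graph with no source node at all (a pure loop) would get no ranks from this sketch and would need a chosen starting node.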
For (3), I am not seeing how this is difficult, but perhaps you are constructing your LP problem in a way that makes this difficult. I will look at the code now.
Although I am still maintaining flow v1, I would ultimately like to redo a large portion of the codebase. I think networkx is a significantly better choice for graph problems as it makes searching for ancestors/children and implementing common graph algorithms much easier than the ad hoc solution I am doing currently. You can see this if you look at the v2 code, it is significantly cleaner and easier to maintain.
Ah I see, your solver uses priority ratios: https://github.com/OrderedSet86/gtnh-flow/pull/47/files#diff-7ec0af41c2feb068c65f4ee2a9e4775a5e3e145d08656288b1d1cd13efabd5bbR106-R136
I don't think this will work for all problems due to the issues I describe here: https://github.com/OrderedSet86/flow2/blob/main/math.md#linear-programming-and-selecting-an-objective-function Furthermore, the failure modes are not that obvious and it will randomly break for the end user, which is a situation we want to avoid.
Ah, phooey. I looked all over at solvers for other games, but didn't think to check your other repo.
Fortunately, I do think this solution has a thing or two to add to the conversation. Firstly, from reading that math write-up:
input_amount = binary_switch * continuous_amount
) using CVXPY, but it didn't play well with its DCP ruleset (I struggle to understand why), and the other rulesets have other gotchas, like strictly positive variables, that make them not applicable. Perhaps someone smarter than I could figure it out or find a different solver.
What I did figure out (while on top of a mountain in Italy, absent-mindedly thinking about this problem, talking to my sister and making fun of myself for struggling with it... it was a hilarious moment) is a sorta-hacky way to linearize the problem. Instead of input_amount = binary_switch * continuous_amount, I use input_amount = (binary_switch - continuous_subtractor) * BIG_NUMBER, where the big number is practically the priority ratio (because that is the largest amount of any ingredient we expect to use in this problem). See it in action here. I don't love it, and there are situations where it might break down (if BIG_NUMBER isn't big enough). Check out this comment for why this might fail, and why I think it probably won't happen in non-contrived problems. Further, in my tests where I intentionally made BIG_NUMBER too small, I usually saw either a failure-to-solve or a "subtractor" of zero, meaning the solution used a maximal amount of the input. Both of these cases are detectable, which means we can report them to the user. I think this makes this solution "plenty good enough" for most cases.
I'm certainly open to more ideas on this, especially better options than the "minimal set of inputs" approach. For example, running the solver on the platline with PMP as an input and platinum dust as an output, the solver would settle on using zero PMP and shortcutting to a reprecipitated platinum dust input. Of course, this approach also displayed some unexpected "intelligence" by finding that it required fewer inputs to reprocess the byproduct calcium chloride and use the extra chlorine (along with some extra inputs) to produce some nice byproducts. It's not a bad approach, but it still requires the user to specify stuff we would like to be automatic, like simple acids and other inputs.
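The BIG_NUMBER linearization described above can be checked with plain arithmetic. This sketch does not run an LP solver; it only mirrors the algebra and the two detectable failure cases mentioned (failure-to-solve, and a subtractor of zero). The bound 0 <= continuous_subtractor <= binary_switch is an assumption about how the variable is constrained, and both function names are hypothetical.

```python
def linearized_amount(binary_switch, continuous_subtractor, big_number):
    """input_amount = (binary_switch - continuous_subtractor) * BIG_NUMBER."""
    if binary_switch not in (0, 1):
        raise ValueError("binary_switch must be 0 or 1")
    # Assumed bound: the subtractor can never exceed the switch, so a
    # switched-off input is forced to an amount of exactly zero.
    if not 0 <= continuous_subtractor <= binary_switch:
        raise ValueError("continuous_subtractor must lie in [0, binary_switch]")
    return (binary_switch - continuous_subtractor) * big_number

def subtractor_for(required_amount, big_number):
    """Return (switch, subtractor, warning) reproducing required_amount, if possible."""
    if required_amount == 0:
        return 0, 0.0, None
    if required_amount > big_number:
        # No subtractor in [0, 1] can reach this amount: the LP would fail to solve.
        return None, None, "infeasible: BIG_NUMBER too small"
    subtractor = 1 - required_amount / big_number
    warning = "input saturated: BIG_NUMBER may be too small" if subtractor == 0 else None
    return 1, subtractor, warning

print(subtractor_for(250, 1000))   # comfortably inside the range
print(subtractor_for(1000, 1000))  # subtractor of zero, worth a warning
print(subtractor_for(2000, 1000))  # cannot be represented at all
```

Both warning paths correspond to conditions a caller could surface to the user instead of returning a silently wrong chart.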
I do like the "maximize flow as a low priority" approach you mention in your writeup, and I think it might be a way to implement the "use as many of the provided recipes as possible" heuristic I was considering. I don't love that it will conflict with using the solver to choose the most optimal of multiple possible processing paths. But, honestly, I think it's very rare that you would want to use this solver for that. Perhaps I'm wrong,
You say "From my testing on flowv2 it reduces amount of user interventions by 8x on platline" - does this mean you have a minimal-set-of-inputs prototype out there?
Regardless, if you think this implementation holds some merit, I'm definitely up to contact you on discord and talk about integrating this with flow2 instead. I'm glad I stopped here to ask for input, before getting too deep into integration with this flow1.
Two other odd points:
Yes - please reach out on Discord. I am actively developing it and we would benefit from coordinating on development.
Yep, flow v2 has an MVP working. Here is platline. Unfortunately loops make the readability problem worse (look at the mess halfway down), and deciding "minimal set of I/O" is counterintuitive for large machine lines. It needs more time to cook before it is ready to replace v1 (the code will ultimately live here once it is done, since this repo is much more popular / widely-known). But it works.
For the broader design philosophy of the tool: I do not intend to pick the objectively optimal route between two methods. There are other tools that can solve that. Instead I leave the burden of machine line and resource selection up to the user, and aim to show something that is (A) a useful and readable overview of a machine line, which can be used to build a line in the actual game, (B) an accurate predictor of overclock and machine count values, and (C) as a consequence of A and B, something useful for developer pack balancing. This is a complex enough problem, and I am happy to leave e.g. HOG fraction solving out of scope. I only mention this up front because you mentioned it here:
I don't love that it will conflict with using the solver to choose the most optimal of multiple possible processing paths. But, honestly, I think it's very rare that you would want to use this solver for that. Perhaps I'm wrong,
1 & 2) I am skeptical of any solution that requires numerical constraints on recipe I/O values. There is no guarantee that a machine line will not exist in the future that breaks this. For example, naqline has large input quantities and outputs tiny 4mb/s quantities. I think a "good enough" objective function can be found, but I am worried about it randomly breaking in a completely opaque way to the end user, and us not having a better answer than "tough luck." I would like to use a solver with no numerical sensitivities / heuristics if possible, and my v2 sympy solver does that. I appreciate the work you have done (it is impressive you made something this big in such a short time in an unfamiliar repo) but you are at the same spot I was when initially developing it - I looked into the pulp solver (same library even!) and ultimately discarded it due to too many annoying heuristics and issues needed to patch it.
I am theoretically open to basing the web UI on another tool, but I personally have some dislike of most JS frameworks (particularly React). It seems Satisfactory Tools is using Angular, which I do like more than React... We can likely steal whichever libraries they are using and integrate them into the final implementation. Really it comes down to whether it is worth it for dev time saving, as I would be happy with writing the web interface myself (assuming we choose a non-terrible framework). The existing one uses SolidJS which I like quite a lot, but I think is not in a state where it is ready for community projects - the UI library already got deprecated since I last worked on it. API calls are fine to me, the DB will be colocated on the webserver running WebNEI.
For the recipe exporter, nesql-exporter is the "official" one written by one of the dev team, so it is probably best if we coalesce around that one. I believe there is a fork owned by the GTNH developers (https://github.com/GTNewHorizons/nesql-exporter) so we can get permissions to manage commits to that one. The primary author is nice to work with but often busy, so we will have to do the legwork on getting it going. My understanding is the two main needs are more NEI handlers covered and faster recipe exporting.
Here's a very ambitious addition!
TL;DR: this PR adds a matrix-like solver to make problems with loops way easier to solve, and I have some questions about implementing the front-end.
This PR aims to add a Linear Programming solver alongside the existing Sympy solver, comparable to how the Factorio Factory Planner offers both a traditional and a matrix solver. Just as with Factorio, it's not practical to have a database of the value of all items in the game (beyond the base game, anyway), which makes solving these problems a challenge. However, I don't love the way Factory Planner handles this lack of information (by having users choose Free Variables), so I wanted a solver that only needs a minimum of information on what's an output and what's an input. I also wanted something that can choose the most optimal of multiple processing routes - hence, a Linear Programming solver.
As this is a big hunk of stuff to support, the first question is whether you (@OrderedSet86) are willing to incorporate and maintain this solver in your repo. If not, I can totally support it in my fork. If so, there is still work to be done with this code. But, since the backend solver is pretty much nailed down and reasonably well-tested (see my dev log on it), I wanted to open this up and ask for input on a few points on the front-end.
First, a high-level overview of my backend implementation: This solver is largely inspired by Kirk McDonald's Factorio Calculator write-up, but I had to make a number of modifications to handle the increased complexity of GTNH problems:
I chose CVXPY as the LP solver of choice for this problem because it can also handle the quadratic scaling problem, all in one shiny dependency.
Here are the design questions I'd like to discuss:
Let me know your thoughts!