KirkMcDonald / kirkmcdonald.github.io

Simple web-based calculator for the game Factorio.
Apache License 2.0
540 stars 148 forks source link

Evaluate Bob's Mods #35

Open KirkMcDonald opened 7 years ago

KirkMcDonald commented 7 years ago

Bob's (plus Angel's ores, and maybe some other mods) significantly expands the recipe graph, and poses a significantly greater challenge for modeling than the vanilla game does. Investigate how far this calculator can get when the Bob's data is thrown at it, and how much work it would take to make it a practical tool.

crixomix commented 6 years ago

I would love you forever so much if you added bob's recipes in here. I personally don't do any of angels stuff, but that would indeed make it even more complicated.

Bob's I feel is still pretty straightforward, though there is often times multiple ways to get to the same item which would be tough to decide. Though it could be like the machine choosing now. If you're going to make brass, you'd just have to select which of the 8 different furnaces that's possible in, haha.

jcranmer commented 6 years ago

Bob's mods is mostly extra complexity in scale. There's another major recipe group (the water/salt water electrolysis, as well as air filtering that collectively produce hydrogen, nitrogen, oxygen, and chlorine), which could end up agglomerated onto the oil recipe group (I don't recall the entire situation with sulfur). Other than that, it's pretty much an explosion of recipes (mostly 1 recipe for 1 item), and an explosion of different assembling machines.

Angel's mods on top of Bob's is a different story. The subset of the recipe matrix involving items with multiple recipes is no longer block diagonal in any meaningful way. It's also large--about 228 items by 730 recipes. Even reducing it to a block triangular form still gives the largest block as 132 items with 200-ish recipes (blocks here are those recipes that all produce the same items), consisting of all of refining, most of petrochem, some bioprocessing, and some smelting. This ignores barreling.

I've found that Angel's also depends on the cost function a lot more. Minimizing power consumption turns out to produce better results than minimizing resource use, since you can milk water into everything with pretty high gains, at the cost of craptons of machines.

KirkMcDonald commented 6 years ago

I've been analyzing the Bob's recipe graph, in the interests of improving the algorithm that extracts the subgraphs for which linear programs are necessary. (The algorithm currently in the calculator is a hack that works for the vanilla graph but is inadequate for the general case.) My as-yet-incomplete result for the Bob's graph (minus Angel's) looks like this:

bobs

As I improve the algorithm, it will likely end up pulling more and more of the graph into that one huge subgraph.

And that's probably fine. I recently conducted an experiment in which I had the calculator perform all of its calculations using one huge linear program covering the whole vanilla recipe graph. This took calculation time from about 30 ms to over a second. While that is considerably slower, it means that, even if I have to throw half the Bob's recipe graph into a single massive matrix, it shouldn't render the calculator utterly unusable.

As to the cost function, something will have to change there. With vanilla, there are very few alternative ways of making a given item. The only truly substantial alternative use of resources is the choice between ordinary oil processing, and coal liquefaction. This is not true of Bob's, which offers quite a number of alternative ways of making a number of different items.

The solution I'm leaning towards is adding a UI for allowing the user to adjust the list of resource priorities, e.g. to let the user choose between prioritizing lead over nickel or the reverse (or to treat them equally). The alternative is to provide an explicit option for each alternative recipe, e.g. to choose between making sulfuric acid from sulfur, vs. making it from sulfur dioxide. (This is complicated to describe, but my hope is that the UI will be intuitive enough that its function will be clear to see.)

I have yet to start looking at the Angel's-plus-Bob's recipe graph, but my hope is that, by making the code work for Bob's, it should work for any recipe graph I will throw at it after that.

jcranmer commented 6 years ago

Looking at your graph, the innermost blocks look like what I call recipe blocks while the outermost ones looks to be more like what you'd need for a block diagonal matrix (i.e., separating the simplex program you need to solve).

For Angel's, I spent some time working out how to shrink the matrix down before trying a better solver than scipy's simplex parser (which I now know, after the fact, to be absolute crap). When I went with the better solver, it happily ate up the the full recipe matrix (sans unbarrelling recipes, since those muck up the heuristics to sort the output), taking about 100ms even for the "do every science pack" recipe. As a bonus, the solver I'm using will also output within what bounds the cost function can be adjusted to give the same result. For the simple simplicity of the rest of the application, I gave up trying to generate smaller matrices, since 100ms isn't slow enough to matter (I have the advantage of not being limited to JS for implementation), and the most obvious compression method is already a fairly standard presolve heuristic.

Ober3550 commented 6 years ago

Personally I don't necessarily want a recipe graph but rather a simple 4 step solver.

  1. Use script to rip game recipes
  2. Upload script to solver
  3. Select recipes (maybe create a dummy recipe that includes 1 of each of the sciences
  4. Profit??? PS. I've been looking into this more recently because I've been modifying the Bobs Angels science recipes and I've wanted a way to calculate the total resource costs so that I can have an understanding of the balance of each of the different tiers of resources. I started working on a excel spreadsheet that basically uses a matrix to solve for 1 of each science but found it a bit impractical with almost 500 recipes to sort through. I've used Helmod to the best of its ability but things like Silicon Ingot which takes 6 Ingots + Silane Gas and outputs 24 Silicon Ingot just messes with Helmod. And don't get me started on the GUI and the fact that once Helmod reaches a certain point with the number of recipes its handling it decides to straight up just rearrange everything however which way it wants.
KirkMcDonald commented 6 years ago

This calculator is, of course, limited to using whichever predefined datasets I make available for it. My current work is aimed at adding support for Bob's Mods, and when that is done, I will likely add support for Bob's + Angel's, and may also add in a dataset for Seablock (although I have never played that mod).

This means that I cannot make everyone happy. There is an endless combination of mods available for Factorio, and adding a dataset for every possibility is not something I can practically do.

What I can do is streamline the process for generating datasets, so that if a user has some specific mod they would like to model in this calculator, I can minimize the hassle they would need to go through in order to run a local copy of the calculator with their custom data.

The current process for doing so is fairly involved, and I have not bothered to write it down anywhere. It involves installing Python, Lua, and some dependencies for each; dumping the Factorio game data to JSON using a particular Lua library; then using a Python script to convert that JSON file into a different JSON file while also creating the corresponding sprite sheet; and then finally editing a calculator source file to let it know about the new data file.

This process is arcane and annoying, and is not something I expect anyone to actually do. What would be possible is to create a single executable which would find the user's local installation of Factorio, then go through this entire process before starting a local HTTP server for the calculator and pointing the user's browser at it.

This would be a significant project, and I only literally just started thinking about it, so I wouldn't expect it to be done anytime soon. And, in any case, using datasets more complicated than the vanilla dataset will depend on the work I'm doing to support Bob's, so consider this idea blocked by this issue.

terite commented 6 years ago

While working on seablock support I made my own export process that's pretty similar

  1. run the factorio-data-dumper mod that exports game data as json to their factorio script-output directory
  2. process that file with a python script to make the sprite sheet (and eventually do localisation)
  3. edit settings.js to add a reference to the new data file

I've got it working pretty reliably, and was planning on putting up a PR once I verify that the data output is similar enough. Let me know if I shouldn't.

I also have a script that reads a savegame and downloads all relevant mods. I was thinking of turning that into a website that allows users to upload their savegame then be redirected to a version of the calculator specifically for that save's combination of mods. If that sounds good, I can link the repo when its up.

All this is a long winded way of saying that I really like this issues gets a big 👍 from me. I'd also like to see more of an itemized of known issues if you have one.

terite commented 5 years ago

@KirkMcDonald what are the current issues that you know of preventing this calculator from being a practical tool when using bobs mods?

I've implemented a proof-of-concept solution for taking a factorio save and generating an instanced calculator from it. I've noticed a number of issues, some I can probably fix easily:

Some I probably cannot fix easily:

But before I go too far far down this rabbit hole, is help something you want on this issue? And if so, what can we do to help?

edit: And if you're in the SF bay area I'd be happy to talk about it over a coffee or beer if you'd like

KirkMcDonald commented 5 years ago

The primary issue, which I recently had a breakthrough with, is replacing the algorithm that extracts the subgraphs for which a simple graph traversal cannot produce a solution; or, in other words, replacing essentially the entirety of subgraphs.js. It is this code which determines which recipes make up the matrices to which the simplex method may be applied.

The current version of that file is, as I've mentioned before, a hack that works in vanilla, but which will fail utterly in the face of more complex recipe graphs. Much of the calculator-related work I've been doing in the last several months has been analyzing the Bob's recipe graph in order to design an algorithm that works in the general case.

Parts of this problem are obvious. For example, it should probably have an explicit cycle detector. Other parts are more subtle: It is possible for one of these subgraphs to depend on another (e.g. the uranium matrix depends on the oil matrix, via the sulfuric acid used to mine uranium ore), but if that relationship is too complex, the graphs must be merged (e.g. solid fuel must be modeled as part of the oil matrix).

Defining what "too complex" means, algorithmically, is an interesting problem, and one which has been taking up a lot of my calculator-development time. In my notes, I have taken to referring to relationships like the solid fuel-to-oil relationship as "multivariate," meaning that you cannot reduce solid fuel to its raw resource costs in isolation. The most efficient overall solution will depend on any other consumers of oil products that happen to exist, and thus you must solve for solid fuel at the same time as those oil products.

I have a function, now, which can identify that sort of relationship, while distinguishing it from the uranium-sulfuric acid-oil sort of relationship, which has exactly one solution, and thus the two matrices may be solved separately. Expect an update to the code in the next few days.

I'll add that the algorithm I've come up with may, in certain edge-cases, combine matricies for which there is, strictly speaking, no need to do so. However, the results will still be "correct," in that it will still permit the calculator to work.

In any case, the Bob's recipe graph is one huge rat's nest, and the result of the algorithm is that a huge chunk of its recipe graph is merged into a single 203 recipe matrix. This means that running the calculator on Bob's will be slow, but it should work. Some new settings will need to be added to make it actually useful, but it'll be neat to see what the end result actually looks like.

KirkMcDonald commented 5 years ago

4998dd38d44fc1414c0ded2e8d283e4f34828b27 adds the new subgraph-detection algorithm. All I will guarantee right now is that it produces the same output as the old one when applied to the vanilla recipe graph. It is possible that it will produce useful results for other recipe graphs, but I haven't really tested it yet.

This was the single largest change required to support Bob's, but a number of smaller changes will be needed, as well. One necessary input when making a solution is the priority-order of the resources used as inputs to the linear programs. This order is currently hard-coded. At a minimum, this list will need to be expanded for other datasets. A better solution may be to expose this list to the user and to allow them to configure it.

Other options may be needed, as well. The vanilla option to choose between the different oil processing recipes is instructive: The Bob's graph adds many such choices, so it may make sense to add more options like this; or it may make sense to rely on the resource-priority order to make these decisions; or it may make sense to provide both sets of options. This is the sort of thing I'll need to play around with to see what works best.

For now I'm going to concentrate on #83, which is a relatively simple problem that can benefit immediately from this new algorithm.

KirkMcDonald commented 5 years ago

b9fea836973eaccd102a83693bc377bf51131538 adds experimental support for Bob's mods. The calculator is capable of producing results, but I'm not sure how actually useful it is without the additional settings that I would like to add. Here's a link with Bob's enabled.

As expected, it is very slow. Some more optimization may be possible, but there is an extent to which this may be unavoidable.

I haven't stress-tested it too much yet, but this serves as a proof of concept.

terite commented 5 years ago

I dug into the slow parts. For 100 factories producing space science packs, the first render recipeTable.displaySolution(totals) takes over 9 seconds for me, while solver.solve takes around 1.4 seconds after my patch in #105.

Displaying after a minor update (e.g. changing something to use an effectivity module) only takes around 100ms, though the solver takes about the same amount of time.

KirkMcDonald commented 5 years ago

I'd imagine a substantial fraction of that render time comes from rendering all those module icons. It might take a substantial rewrite of how the dropdowns work, but it would be possible to do this rendering lazily, the first time a given dropdown is opened.