minetest-mods / craftguide

:book: The most comprehensive Crafting Guide on Minetest
Other
43 stars 20 forks source link

Craftguide data caching is not persistent #98

Closed neoh4x0r closed 4 years ago

neoh4x0r commented 4 years ago

The caching code really needs to be updated so that the cache maintains persistence between sessions.

After the cache is initialized (the first time) - only added/changed/deleted entries need to be updated thereafter (rather than re-creating it from scratch each time).

kilbith commented 4 years ago

The only way to check that entries have changed is to lookup through all items... and you end up re-creating a cache.

orbea commented 4 years ago

@kilbith Maybe it could cache the lua file sums and only recache items when those files have changed? Otherwise is there much room for optimizing the process? Its slow starting minetest with many mods/items.

kilbith commented 4 years ago

Absolutely not, two files can have the same exact size with different contents. Caching is the cost for high in-game efficiency. Just accept it.

orbea commented 4 years ago

I do not mean checking file names only, but rather the file name and file sum, if either has changed invalidate the cache.

I don't think they can have the same sha512sum with different contents, maybe sha256sum, md5sum or even crc32 would be enough too?

kilbith commented 4 years ago

To compare the current data with the previous data, you would need to re-build a cache again. That doesn't solve your issue.

orbea commented 4 years ago

I was thinking in the best case no file names or file hashes have changed and it would only need to iterate over the files and not every recipe in every file. In other cases it may find that only one file was changed/added/removed and it would only invalidate and regenerate that part of the cache. So it would iterate over every file and then some recipes in the changed files instead of every recipe.

I think this is easier said than accomplished, but hypothetically it could greatly speed up the caching on subsequent runs unless every single file has changed in some way.

kilbith commented 4 years ago

Yes I get it but that's not my point: in order to produce a file to compare it with the older, you need to build a new cache. You are stuck with the same problem.

neoh4x0r commented 4 years ago

Yes I get it but that's not my point: in order to produce a file to compare it with the older, you need to build a new cache. You are stuck with the same problem.

The general way it would work is that a file is hashed and the hash is saved to a file -- this file would be a hash-table containing all the filenames as keys along with their corresponding hashes to ensure that look-ups are constant-time O(1).

Later those files would be hashed again, and compared to the stored hash-table, only if a hash changed would that file need to be re-hashed.

However, re-hashing the files would also take some time (probably less then rebuilding the entire cache each time), but will still take too long.

A better solution would be:

Furthermore, for the last point (when mods or enabled/disabled) the serialized cache could be updated (for those mods only) by indicating that those mods have been invalidated.

This would significantly reduce the slow down caused by rebuilding the cache every time (and it does not require the cache to re-built in-order to determine this).

kilbith commented 4 years ago

This will interest you: https://github.com/minetest-mods/craftguide/commit/6eb0955e5a2615bc78a77ec361271bc32c00c8a4

orbea commented 4 years ago

Thanks, but personally I think slow caching is preferable if it leads to faster run time. I only wondered if it was possible to improve the caching speed, but its not the end of the world if that's not possible.

neoh4x0r commented 4 years ago

@kilbith I guess that is the best that we can do for now (6eb0955e5a2615bc78a77ec361271bc32c00c8a4).

p-ouellette commented 4 years ago

I made the usages caching much faster in my fork. See https://github.com/pauloue/minetest_game/blob/craftguide/mods/mtg_craftguide/init.lua#L120. Tested with MTG+moreblocks (over 6k recipes), the total cache time is ~9 seconds before, 0.2 seconds after.

kilbith commented 4 years ago

@pauloue I'm looking for a new maintainer for this mod. Would you be interested?

p-ouellette commented 4 years ago

Thanks, but I'm not interested

grantcarthew commented 4 years ago

Hi, just catching up on this. Was noticing it took ages for my server to start and found out it was the craftguide. I understand why it happens however I have a suggestion!

How about a CLI script or tool that builds the cache outside of the server. Then when you add more mods or update a mod, you can run the cache build script. Then when you need to start the server it assumes you have built the cache and simply loads it off disk.

You could use the option to turn off auto cache in this case. Check for a cache file on disk, if found, load it.

kilbith commented 4 years ago

So I did some benchmarking with @pauloue fork vs. current HEAD, with MTG + moreblocks:

Current HEAD

init_items (old):   3208
[craftguide] loaded in 17.45s

@pauloue fork

init_items (new):   1884
[craftguide] loaded in 21.14s

Conclusion: not only the @pauloue fork handles less recipes, but the caching is actually slower!

orbea commented 4 years ago

I spent a bit of effort trying to replicate the claims of it being faster without any success, I'm glad to see I just didn't fail...

p-ouellette commented 4 years ago

You're probably not running the same code as I am because the number of init_items you got for my craftguide is way off. My craftguide is the mtg_craftguide mod here: https://github.com/pauloue/minetest_game/tree/craftguide. To time it:

diff --git a/mods/mtg_craftguide/init.lua b/mods/mtg_craftguide/init.lua
index fd372af..f321216 100644
--- a/mods/mtg_craftguide/init.lua
+++ b/mods/mtg_craftguide/init.lua
@@ -143,6 +143,7 @@ local function cache_usages(recipe)
 end

 minetest.register_on_mods_loaded(function()
+   local start = os.clock()
    for name, def in pairs(minetest.registered_items) do
        if show_item(def) then
            local recipes = get_craftable_recipes(name)
@@ -160,6 +161,7 @@ minetest.register_on_mods_loaded(function()
        end
    end
    table.sort(init_items)
+   print("[mtg_craftguide] time", os.clock() - start, "#init_items", #init_items)
 end)

 local function is_fuel(item)

Results, with both mods running in the same game, MTG+moreblocks:

[craftguide] time   14.918799   #init_items 3207
[mtg_craftguide] time   0.211686    #init_items 2985

The difference of ~200 init items is because my mod does not show items that only appear in a fuel recipe, like deafault:leaves. There are a couple of items that mtg_craftguide shows, but craftguide does not, e.g default:key and farming:desert_sand_soil. These items cannot be crafted, but do appear in a usage.

kilbith commented 4 years ago

You're probably not running the same code as I am

I copy-pasted your code and placed the timestamps at the exact same place. So I don't know what explains this.

default:key and farming:desert_sand_soil are not_in_creative_inventory = 1 so you aren't supposed to show them in the craftguide.

p-ouellette commented 4 years ago

Did you run mtg_craftguide or craftguide with some of the changes ported over? Note that I changed cache_usages to accept a recipe and add it to usages_cache[name] for each name that appears in the recipe. The old function took an item and searched over all recipes to get its usages. Also the way craftguide overrides minetest.register_craft and the differences in fuel recipes would need to be considered when integrating the changes.

kilbith commented 4 years ago

Yes I copied cache_usages and all functions related to it.

grantcarthew commented 4 years ago

What about an external tool to build the cache? It is taking 10min to start my server because of the crafting guide. I am on an older release though because my server is v5.0.1. Another mod is stopping me upping the server version.

kilbith commented 4 years ago

I'm not interested in making an external script, but PRs are welcome.

I am on an older release though because my server is v5.0.1

That is your problem.

neoh4x0r commented 4 years ago

I am on an older release though because my server is v5.0.1 Another mod is stopping me upping the server version.

That is your problem.

The craft guide will take some amount of time to cache the data....big-O of N (linear time) where N is the number of items that need to be cached. Provided that each recipe to be cached is only enumerated/touched once (otherwise it grows to n*logn, n^2, 2^n, or n!).

Notwithstanding, the server version is irrelevant because using a newer version won't significantly decrease the amount of data that needs to be cached (in fact it will most likely add more data that needs to be cached as new crafting features and etc are added).

What about an external tool to build the cache?

There are really only a few options:

1) Use the setting option to turn off caching 2) Disable the mod completely 3) use external tool: Do the caching in a way that does not require it to always be generated at runtime. 4) re-write the caching code so that it maintains and persists a per-world cache and can be generated on-demand.

For 3) that most likely means someone would need to write the tool to perform the caching outside of the game -- but that presents an entirely new problem since the mods can be global as well as per-world. This means the external tool must be able to detect what mods are installed (and whether they are enabled or not) most likely requiring the tool to be baked into the engine and would defeat the purpose of doing it outside of the game. Moreover, the engine doesn't need to be modified to cater to a mod-related issue.

And for 4) which sounds like the best option.

Provided that someone is inclined to modify the code to implement it. I mean once someone has installed a collection of mods, it is probably likely that the contents of those mods will not change from one session to the next (unless they have installed, removed, or enabled/disabled mods since the cache was built)--the assumption here is that at some point that a minetest server will eventually enter a stable-state where mods are rarely installed/removed, etc.

A user could turn on a flag for the mod to rebuild the cache (or force the build if it has not been created) such that subsequent runs will not rebuild it.

Really all this requires is to save and load the cached data to/from a file and add a settings option allowing the user to rebuild the cache.

grantcarthew commented 4 years ago

Great feedback @neoh4x0r thanks very much. I had not thought about the world.mt file and multiple worlds. That would cause issues for number 3 above.

I'm busy playing with deno at the moment and have other projects on the cooker so can't work on this myself. Sorry.

kilbith commented 4 years ago

Mod loading has been greatly optimized thanks to https://github.com/minetest-mods/craftguide/commit/e941443a5919583d7859099f64c2b2afacd3dfb3