Fluid Integration - Githubissues

shartte commented 8 years ago

Probably self explanatory.

There's also One probe, but I am not 100% certain on how that works.

Elix-x commented 8 years ago

Well, same goes for item stacks. 2 item stacks with different NBTs are not the same. And we store item stacks.

yueh commented 8 years ago

For exact matches yes. But items also have damage values and we have to query for every itemstack between 0% and 25% of the same type (same id + optionally same metadata). Or every item tagged by the oredict as being equal. There is nothing like a damage value or quality as shared attribute on every fluid stack. Or nothing like the oredict listing equal itemstacks/fluidstacks.

Elix-x commented 8 years ago

Well, absence of meta + precise dictionary is the only difference between fluid stacks and item stacks. So if we write system for fluid stacks, to make it support item stacks, we simply have to extend it and tell it about meta and precise dictionary.

yueh commented 8 years ago

Quick question, would you consider this a good design

class Rectangle {
    void setWidth(int) { ... }
    void setHeight(int) { ... }
}

class Quadrat extends Rectangle {
}

Elix-x commented 8 years ago

While Quadrat does not override anything from parrent, no. _{^{Quadrat <=> Square???}}

yueh commented 8 years ago

I was more thinking about the print/typography (quad) and not math version. But if you want a more complete hierarchy and math terms.

interface Shape {
   draw(Canvas);
}

class Circle implements Shapre {
   draw(Canvas) {}
   setRadius(Int) {}
}

class Rectangle implements Shapre {
   draw(Canvas) {}
   setHeight(Int) {}
   setWidth(Int) {}
   setSize(Int,Int) {setHeight(Int); setWidth(Int)}
}

class Square extends Rectangle {
   draw(Canvas) {}
   setHeight(Int) {}
   setWidth(Int) {setHeight(Int);}
   setSize(Int,Int) {setHeight(Int);}
}

But the question is really about the class hierarchy. Not whatever they implement or not.

Elix-x commented 8 years ago

Shape ---> Circle
       |-> Rectangle

part - yes.

Rectangle ---> Square/Quadrat

part - no (because in this case, rectangle can be made square without being a square).

yueh commented 8 years ago

How it looks is completely irrelevant. It is a question about the behaviour of the thing a class resembles.

Elix-x commented 8 years ago

I'll say it other way: separation of circle and rectanle - yes, separation of rectangle and square - no.

shartte commented 8 years ago

Okay. I haven't looked at this issue in a while. How the hell did you get from fluid integration to basics of inheritance and polymorphism :P

yueh commented 8 years ago

@shartte understanding of it is somewhat required

Not providing a square does not really solve or answer the problem, it just avoids it.

So how are rectangles, squares and let us add diamonds defined? (I avoid the part about defining them based in the diagonals)

Shared
- All are perpendicular quadrilaterals
Rectangle
- opposite sides must have the same length => 2 pairs of 2 equal long sides
- Changing the height or width does not affect eachother
Square
- All four sides are equal.
- Changing the height or width must affect the other one.
Diamond
- Square rotated by 45 degree (The diagonals are axis aligned)

So what is the problem of letting square extend rectangle or provide a rectangle as substitute for a square? It is pretty easy. square extend rectangle allows a square to act as rectangle, but any program would expect that changing the width does not change the height of a rectangle. Similar for using rectangle as square, it does not provide the option to change all four sides at once, thus it would no longer be a square after changing it. Both have a different behaviour and cannot substitute eachother.

Similar for a diamond, a user of the lib will certainly not be forced to calculate sqrt(a^2 + b^2) when setting the sides of a diamond. You will usually want to change the diagonals of a diamond (the actual height/width). If you now use diamond extends square and someone uses a diamond like a square, it is suddenly sqrt(a^2 + b^2) times larger than expected.

If you want to read more about it, search for liskov substitution principle Of course this depends on the actual case. Like making everything immutable is also a possible solution, because you avoid the whole changing a square might not longer make it a square. Simply not possible.

And about the let us just provide Rectangle and Circle (for a rasterizer) Why stop here? Just provide a generic Polygon, it can be used for every shape out there. Lines, triangle, trapezium, rectangle, square, oblong, circle and what not. Just pass a list of points to it. You want to know, if a polygon is a square? Sure just fetch the list of points, calculate the length of all 4 sides definied by the order of points and check if they are equal as well as if the diagonals between the even indexed points and odd are equal. Sure that is easy, but wait it might still be a diamond and you only want real squares. so just take the 4 points, if you have 2 pairs where different y coordinates have the same x coordinates. ((0,0),(1,0)) and ((0,1),(1,1)). If yes it is a square, if not it is a diamond. Certainly easier than shapes.filter( shape -> shape instanceof Square ).

Of course there is also this stupid circle, but you can just calcuate the distance of each point to the center and if all are equal within a delta it is a circle. Oh wait, a polygon with enough sides behaves the same. But you already know that a polygon representing a circle is still a polygon and not a circle.

Elix-x commented 8 years ago

Not providing a square does not really solve or answer the problem, it just avoids it.

Ok. Now, back to original problem... To which one of original problems?

yueh commented 8 years ago

Sticking fluids and items into the same thing. Which means you have a common supertype.

This means at someone can take a collection of itemstacks and use it as collection of stacks, which allows to put fluid and item stacks into the same collection, but it is not allowed to crash. Or anything else must be able to deal it. An export bus getting pushed a list with items and fluids? It needs to handle it. Some addon subclasses it for something new? The same.

Further both require additional things. Like items need to have total order, fluids not.

Elix-x commented 8 years ago

Alright, i quickly^tm wrote storage & crafting structure. Tell me if it works (hint - it does). https://gist.github.com/elix-x/64ed4b14ee3223b7f3dcfac2b9a49579

PS: I used groovy to write it, in order to shorten code and test it easily. But it should be very easy to port it to pure java.

yueh commented 8 years ago

It is pretty much what we already have, just without the important features, missing required data, being potentially too slow and being a memory hog. It even "uses" the same security to avoid someone from actually using at intended by the class hierarchy.

Edit.

We need at least the following

AbstractCollection
- merge stacks together,
- no setAmount(), someone will mess it up and forget that you need to always call collection.setAmountStored(t, collection.getAmountStored(t) + t.getStackSize()) when adding something to the list.
- support for decorators. we need to transparently hide specific entries.
- It needs to selfheal and remove itemstacks once the stacksize is <= 0, but only when not craftable or not requestable or whatever else still makes them valid.
- They will encounter a high amount of removals and are not really covered by GC, so they can actually be a memory hog.
- This is actually really stupid because either we have to use concurrent collections as this can easily happen while something is iterating over them and thus throwing a CME. The alternative would be to do it once per tick per collection and for example have everthing managed by a single factory doing an explicit cleanup. But also meh.
- (Time) Complexity must be as low as possible for every operation. So store/fetch should ideally be O(1)
- Fluids can even use an IdentityHashMap due to being singletons and being actually a bit faster than a normal HashMap
- Items will certainly have to use a tree based structure, so most likely O(log n) for put/contains/remove/get but this is outweighted by the improvement for range queries.
  - get or contains could actually be improved to O(1) by using two internal collections, but insert/remove will still be O(log n) and the additional cost of having to synchronize 2 collections perfectly.
- We cannot use normal NBT at all
- We can neither use the NBT when an stack enters the collection neither when it is exported to a vanilla ItemStack
- Doing so will leave the internal data exposed to anyone having access to the reference and thus can change every NBT tag of the stored items
- We need to frequently clone itemstacks to prevent anything from changing the stored itemstack. The problem here is that some mod devs are stupid and put a couple of megabytes into their NBT tags. AE has not just one collection, but easily a few dozen if not hundreds of them for different use cases (cells, network, priority based network, subnetworks). So copying the NBT each time is really bad and can easily bloat that single item with 2 MB data to 200 MB or more.
FluidStack
- fluid
- amount
- nbt
FluidCollection
- fetchOne by fluid
ItemStack
- id
- metadata
- damage
- nbt
- stacksize > 2^31
- isCraftable
- isRequestable
ItemCollection
- fetchOne by id & meta & damage & nbt
- fetchAll by (id & meta & (N < damage < M))
- This must be the best possible complexity so O(1) and that is possible.
- Higher order functions are actually a no go, they are at least O(n)
- fetchAll by ((id & meta) | (sameOre)) (this could also require same damage and nbt, currently not certain about it)
- This must be the best possible complexity so O(1), but likely not possible. Most likely O(#OreDictEntires).

Elix-x commented 8 years ago

It is pretty much what we already have, just without the important features, missing required data, being potentially too slow and being a memory hog. It even "uses" the same security to avoid someone from actually using at intended by the class hierarchy.

I did not optimise it or write in a way to be fast and performant. The goal was to show that's it's possible to use same class for fluid and item storage. This system also allows for flagging of elements (to ignore meta / nbt or use oredictionary). Also, you should have noticed that network is running in separate thread, and that's what we eventually want to do.

Separating amount from actual data ans using custom handlers can lead to huge advantage (or we would need to clone item stack with new amount value each time we change it, also run custom equals without taking amount into account).

I'll optimize it with some things of what you said and crash test it for speeds.

yueh commented 8 years ago

I know that it is possible, as it just how it is now. It is just that this one size fits all approach is really subpar.

What I listed are just the essential parts to rewrite it without any improvement or in other words a waste of time.

For real improvements we need at least the following features

Being thread safe
- That is more or less already the case due to having to use concurrent collections
- Not really necessary for the fluid collections currently
- More or less just needed for 1 or 2 cases, in pretty much any other case the additional costs are wasteful
- This is really difficult as we also have to be atomic towards saving to disk, there can never be changes to the list during it.
- Save to disk might not be atomic at all once dimensions are multithreaded
- Thus a crash might save an invalid state and destroy or duplicate data
Support for snapshots
- We need to create a snapshot at any point in time
- As in no changes after the snapshot was taken
- Creating a full copy is not sufficient, that is currently done and a major performance issue
- Copy on write approach or similar
- Needs to scale very well, the snapshots can be needed for many minutes and during it, the original collection can easily encounter many thousand changes
- Might be used to achieve some way of being atomic.
- Create a snapshot of every collection before being saved
Support for Transactions
- We need to be able to write the changes done to a snapshot back to the original collection at some point
- This even goes further than normal transaction, where a commit never fails
- It actually has to fail a commit, if it would put any stack into an invalid state (stacksize < 0)
- It is pretty much a 2 step commit. First it has to apply the preconditions and after some time the postconditions
Support for Deltas
- AE is completely done server side, no synchronisation to the client besides block states
- Opening a terminal will send a new list of the network inventory to the player, with deltas for inventory changes
- Ideally it would use the snapshots to calculate the delta and send it to the client once a tick
- Client would need to be able to apply the whole delta
Probably stuff I have already forgotten.

This is the feature set a one size fits all at least has to provide.

The problem, these features are only needed in a few cases. Most common ones are actually

Cell Partitions
- Constant collection
- Just contains
- Fuzzy support for damage values
- Oredict Entries can be baked into the collection at creation
Cell Storage
- Very basic feature set
- Precise Insert and Exctract
- copy all
- Nothing else really
- No fuzzy, no oredict
- Array would probably be enough would it not have O(n). Thus HashSet is pretty perfect.
Clientside collections
- They even to not need the fancy part about damage, nbt and what not
- Just a basic collection of generic stacks with (name, icon/model, stacksize, is craftableOrRequestable).
- Should be the only one ever needing to support deltas
Network inventory
- First higher level actually needing full fuzzy support
- View of all cells or other network inventories attached to the current network
- Mostly a cache for contains or client access
- Item handling for insert/extract is actually done differently as it needs to support priorities.
Crafting network
- This is pretty much the only thing needing full support for snapshots, transactions, fuzzy, etc
- Snapshots/Deltas
- before starting the simulation, so changes will not interfere with it
- collect a delta about the items required to extract for crafting the missing ones
- collect a delta about the finished product and intermediate products
  - Overflow, ex. request 1 jack o lantern, only pumpkin, log and coal available => 1 jack o lantern + 2 planks + 3 torches + 3 sticks as final result
- Transaction
- Extract the simulation delta from the network inventory
  - Hard Fail, should the network no longer contain the required items
- Insert the result delta into the network inventory
  - Soft Fail, when not enough storage is available, store what is possible, block until everything can be stored

Elix-x commented 8 years ago

Being thread safe

What i have there is partially thread safe. We can sync IO and net thread for saving (terminate()), but once dimensions will be multithreaded, IO and blocking will become extremely complex.

Crafting network

In what you said, there's one problem. No future expectation/calculation. If crafting operation tree is executed as one task, the operation will simply fail if some items are missing during launch. Even though it takes time, during which player (or anything else) can supply items to the system. It can then either fail (missing items), or wait and snapshot system each update and recalculate entire tree (bye-bye performance). But if we split crafting tree task into crafting subtasks (one per same crafting), it will wait until items become avaible. In this case, snapshots are not quite needed.

Clientside

Never do any logic on clientside, neither it needs entire system state. When client opens GUI displaying some kind of network content, we only send what he sees. That's it.

Snapshots, transactions & deltas.

What for exactly is each one needed. They should not be required, as there's simplier solution (see clientside).

yueh commented 8 years ago

Crafting network

It is already using subtask for every step. It basically takes a snapshot of the current network inventory, hands it of to another thread which calculates the required items. The inventory is used to shortcut calculations, if a material is already present no need to calculate every of the 8192 potentially paths it can be crafted. If this succeeds it hands of the tree to a crafting cpu to be executed.

The simulation beforehand to list what items are required is really necessary as feature, but to some extend is optional. It is possible to mark certain items as being infinitly requestable, say through emiting a redstone signal to turn on a cobble gen. So the crafting cpus are already waiting on items being supplied at some point. But there is absolutely no indication for why it stuck. Does it lack the items? Or is that machine by design taking 4 hours to process 1 iron ingot to a steel ingot? Or is the item itself stuck somewhere?

Requiring the materials to exist at least allows a player to not have to calculate by hand, if there is 1 iron ingot missing. If the crafting cpus are waiting it is either the machine being slow or the item got stuck somewhere.

The alternative is the old AE1 way, which frequently got stuck because the player removed materials required for crafting, the network itself bugging out for no apparent reasons, the usual chunkloading problems with a multiblock spawning over dozens of chunks. And the "can it check, if I have enough materials" being a pretty common request from players.

It also allows the system to work faster, it has the items at hand and can immediately dump it to crafting or processing without having to wait on the current step and if there are still enough materials after it to start the next one. Because during this time a user could remove every needed item from the network.

Clientside

Clientside needs to perform some logic. There is no way around it. Like you cannot sort or filter by name, if there is no name serverside. Or why even lock up the server for some clientside rendering?

Snapshots, transactions & deltas.

The current snapshot for crafting calculations is a major performance issue. As you said recaculate the tree is even worse and the crafting calculation can already take hours in worst case situations. So having a snapshot option and threadsafe collection would eliminate the problem of copying the whole inventory for the crafting simulation. It can simply use a fixed point in time copy of it, check if there are enough items, extract them without affecting the original inventory, but still have an updated collection for further checks and afterwards apply the simulated extract to the real inventory once the crafting starts. The snapshot might be superfluous and could simply replaced by layered deltas so changes to the inventory can at least affect future steps.

Most simulations should finish in a few ticks, so it would be unlikely that suddenly thousands of items are missing but for the ones taking some seconds or more it could affect them.

Elix-x commented 8 years ago

Clientside needs to perform some logic. There is no way around it. Like you cannot sort or filter by name, if there is no name serverside. Or why even lock up the server for some clientside rendering?

Beacause containers (because server is the one who is responsible for logic and if we will use a GUI without container on server, it will open large door for hacks). Containers have slots and only visible slots are displayed on client? So why to send entire network to client, if it sees in gui only 27 items (or 69 if he uses large GUI)? Why wouldn't sending only 27 items player currently sees to client? These will be sent anyway, even if we decide to send all the system, because of how containers and slots work (in vanilla, but we do not want to write our own safe gui system).

Crafting network

The display of required / consumed items can be implemented as separate task which will snapshot system. No need to do that for actual crafting. Also, did i say, we need a error handling mechanism. Cause of threading and parallel crafting & player interaction there will be moments where player takes out times which will be needed for crafting later on. So something like "Error Terminal" or "State Terminal" which will display errors (like "Not Enough Space to import from ..." or "Crafting component not found: iron ingot") may work.

Snapshots, transactions & deltas.

Got what snapshots are for. Can be made to only be used to display info to user. What about transactions & deltas?

yueh commented 8 years ago

Beacause containers (because server is the one who is responsible for logic and if we will use a GUI without container on server, it will open large door for hacks). Containers have slots and only visible slots are displayed on client? So why to send entire network to client, if it sees in gui only 27 items (or 69 if he uses large GUI)? Why wouldn't sending only 27 items player currently sees to client? These will be sent anyway, even if we decide to send all the system, because of how containers and slots work (in vanilla, but we do not want to write our own safe gui system).

Because the server does not know which 27 items the client sees. And scrolling would scale with network latency. Btw do you assume nothing existing in AE2, what makes it already work?

Got what snapshots are for. Can be made to only be used to display info to user. What about transactions & deltas?

Read it again. I sounds harsh, but I really do not have time to repeat myself each time. If you did not understand something or I explained it poorly, please ask exactly what it was, because it makes it so much easier to give a better answer.

Note I make these points up because the current system has issues regarding performance. Some cannot be solved, because minecraft just sucks with inventory handling (or the lack of it). As already said your solution is pretty much exactly what we already have and it is one of the issues due to the "one size fits all" approach preventing better solutions for certain aspects to be used.

Elix-x commented 8 years ago

Btw do you assume nothing existing in AE2, what makes it already work?

Well, thought nothing changed in container code, it does not work anymore.

please ask exactly what it was

> - Support for snapshots > - We need to create a snapshot at any point in time > - As in no changes after the snapshot was taken > - Creating a full copy is not sufficient, that is currently done and a major performance issue > - Copy on write approach or similar > - Needs to scale very well, the snapshots can be needed for many minutes and during it, the original collection can easily encounter many thousand changes > - Might be used to achieve some way of being atomic. > - Create a snapshot of every collection before being saved > - Support for Transactions > - We need to be able to write the changes done to a snapshot back to the original collection at some point > - This even goes further than normal transaction, where a commit never fails > - It actually has to fail a commit, if it would put any stack into an invalid state (stacksize < 0) > - It is pretty much a 2 step commit. First it has to apply the preconditions and after some time the postconditions > - Support for Deltas > - AE is completely done server side, no synchronisation to the client besides block states > - Opening a terminal will send a new list of the network inventory to the player, with deltas for inventory changes > - Ideally it would use the snapshots to calculate the delta and send it to the client once a tick > - Client would need to be able to apply the whole delta Support for them - maybe, but why, what for? Which tasks need them?

yueh commented 8 years ago

Well, thought nothing changed in container code, it does not work anymore.

Hm. Have not really followed this part. But essentially we need to port the old code, which had special AESlots for these cases. There is simply no way around it.

Support for them - maybe, but why, what for? Which tasks need them? Autocrafting and/or caching, pretty much everthing complicated is related to autocrafting or caching.

Deltas

Deltas would be mostly nice for bulk updates, which are currently not really possible but one of the performance issues. One issue is that we have to fully rebuild the caches with every single change (that is a minecraft problem). And there can be many caches, all needing a rebuild. With deltas we could at least restrict the full rebuild to the actual affected cache and then just propagate the delta between before and after the rebuild.

One of the issues is that we currently have two completely different system being handled by the same. We have the basic IItemHandler behaviour as hard truth for any inventory manipulation with and that i pretty much hit or miss, either it can extract/insert an item or not during modulate. It also has a simulate path, but that is not binding at all, it still can return false results.

Then on top is the whole network inventory, which just acts as a cache and is the only source for similar items. It just provides the caching layer over all cells, storage buses and whatever inventory else and which of them match a specific search context. Same oredict, items within a certain damage range and so on. Each entry on that list is then used on its own to extract it through the itemhandlers.

The problem is then that the caching layer tries to deduce changes happening to the actual itemhandlers, instead of them propagating an event. This causes a network to send a "1 cobble stored in the network" event, even if that cobble is forwarded to another network, which also sends the same event again, which reaches the first network, because that needs to be informed about changes should the item not enter through the first network. So just chain some networks together and the first one can easily recieve 10 or more events about a cobble being added. So no chance to apply this as delta. Instead it has to rebuilt the whole cache. Once per event...

This is something which should really be splitted into two different systems, like a basic ItemHandler for extract/insert just like the IItemHandlers in forge and then a separate caching layer. Further the caching layer should no longer make assumptions about what happens but just listen to events posted by an ItemHandler and then apply it to the cache (ideally as delta, not rebuild)

This might also help with multithreading. Worst case the caches could be thread local and waste some memory. If the cache is actually thread safe, it can just apply a delta once an import bus actually ticked and send the event about 1 cobble being imported. Should we then ever have to uses queues for the normal item handling, there should be no changed needed for the caching layer.

Transactions These would be really helpful for autocrafting. During the simulation the process has to actually extract the needed items from the inventory and not just simulate it (that just says it is there, but does not flag it as used).

Say your inventory has 1 log and you want to craft a wooden pickaxe. The crafting traveres the pattern of a wooden pickaxe either until it finds the required item inside the network or follows the next pattern.

Imagine it creates the tree like 1 wooden pickaxe => (3 planks) + (2 sticks) => (1 log) + (2 sticks) => (1 log) + (2 planks). At this point it finds the 1 log inside the inventory and extracts it. So now we have 0 logs. This of course returns 4 planks and not just 3 as requested, so 1 plank goes back into the network for the next step. After this it then notices it has only 1 plank, but needs 2 for the next step so it fails.

The issue then is, would this be using the real network, it would just have destroyed 1 log and inserted 1 plank. So it should never happen. Therefore it currently creates a full copy of the inventory, performance wise the worst option. With transaction it can simply use the same inventory, no need to copy it. Also changes to the network will still be visible locally so you can use it easily to determine missing items. Should everything be available commit, else rollback (or actually just throw the simulation away). The snapshot would just reduce the complixity as you do not have to account for changes on the actual inventory.

It could be achieved by using multiple local connections. Like keep track of what was extracted locally and subtract this value each time it accesses extractItem. So you would have to at least decorate the original inventory to block direct write access to it. (Which is what transactions would be)

Edit. This is really something which should be solved before even thinking about adding a fluid integration and making it just worse.

Elix-x commented 8 years ago

Deltas

Ok. Got it. Thought, there's a way around it: instead of constantly writing and reading to/from cells, we only write data when required (so cell is kept at state it was originally until it's saved, or somethign takes it out). This should drastically reduce access to world and save us from some multithreading issues. One thing it won't work with is alt+F4, but if you do that in world even in vanilla mc, it will go corrupted anyways.

Transactions

Got it. Feeling a better way around it too, will think about their implementation and possible ways around.

Hm. Have not really followed this part. But essentially we need to port the old code, which had special AESlots for these cases. There is simply no way around it. This is really something which should be solved before even thinking about adding a fluid integration and making it just worse.

We will deal with new stuff, containers & client sync once we decide basic network infrastructure.

yueh commented 8 years ago

Ok. Got it. Thought, there's a way around it: instead of constantly writing and reading to/from cells, we only write data when required (so cell is kept at state it was originally until it's saved, or somethign takes it out). This should drastically reduce access to world and save us from some multithreading issues. One thing it won't work with is alt+F4, but if you do that in world even in vanilla mc, it will go corrupted anyways.

No cells are fine, they are only serialized when need. The are pretty much the smallest issues as they pretty much work like a slot based inventory. The only thing we could optimize is throw away the current itemlist, because it has to be backed by a NavigableMap. But cells really just need an Array[63] or HashSet or maybe even a IdentityHashMap if we ensure that ItemDefinition are always a singleton. Otherwise we still would only deal with O(log 63)

The problem are really the caches which have to constantly updated. Single cobblestone extracted from 20 chained networks? 20 network caches needs to be updated, because that is the first source anything is looking for available items before actually extracting them (and marking stuff to be saved). There is also no "but we could accumulate them and just do one update". The next export bus ticking needs to know that the one before just exported the last stack of cobble and nothing is left. Contains on the cache is pretty much O(log n) with n being the numberOfStoredItems. Checking cells/storagebues/networks? Probably O(m log n) with n the same and m numberOfCellsAndNetworksAndStorageBusesAndWhateverElseYouCanImagine

Transactions

It is really the one size fits all making it complicated because it has to support everything. Instead of using the right tool for the job. Like we also need a decorated itemlist, which hides every craftable item from the list to not propagate this information over multiple networks.

We will deal with new stuff, containers & client sync once we decide basic network infrastructure.

Just as headsup, this most certainly needs some very custom implementation. Minecraft is not designed to sync a list with 2000 items, their data including the broken ones with 2MB NBT data at once. But it needs to happen. AESlots are basically fake slots, which have to send a packet to the server requesting that item and then attaching the return to the cursor or place it into the inventory. Normal stacks do not even like stacksizes > 64.

Why sync everything to the client? There is this thing called translations and @ClientSide. So no sort by name on the server, that has to happen on the clientside. If we keep it, there is even an inventory tweaks integration to use a customisable, but clientside sorting order.

Elix-x commented 8 years ago

Ran some tests on performance impact (faaar not all), here are the details

IO

Test is run 1000 times, each time, time taken is remembered. It is averaged and divided by number of operations in the end to give average time/operation. Time is in ms.

Store

Stacks of 64 for 3 items for 512 different metadatas were stored in the system (1536 stacks total).

Structure Type	Average Time/Operation	Time Total
Array[HashMap[StackHolder,int]]	43.850E-3	67354
Array[LinkedHashMap[StackHolder,int]]	42.643E-3	65500
Array[TreeMap[StackHolder,int]]	5.662E-3	8698
HashMap[StackHolder,int]	42.004E-3	64518
LinkedHashMap[StackHolder,int]]	3.715E-3	5707
TreeMap[StackHolder,int]	6.025E-3	9255

Craft

Stack of 64 torches is crafted with oredict aliasing (logs -> planks -> sticks -> torch).

Structure Type	Average Time/Operation	Time Total
Array[HashMap[StackHolder,int]]	9.85E-1	985
Array[LinkedHashMap[StackHolder,int]]	1.156	1156
Array[TreeMap[StackHolder,int]]	∞	∞

yueh commented 8 years ago

The times for the hashmap are seriously off. Either something is very wrong about the test or groovy uses their own crappy version of a hashmap.

Further the test is pretty irrelevant. It just tests the worst case, which should never happen and it is completely ignoring the major uses.

Just forget crafting. That is simply not how it is done.

Elix-x commented 8 years ago

The times for the hashmap are seriously off. Either something is very wrong about the test or groovy uses their own crappy version of a hashmap.

No, tested in same conditions. Also, for each, i ran it a few times (3-5) to ensure that all results are ~ the same (and they were).

Further the test is pretty irrelevant. It just tests the worst case, which should never happen and it is completely ignoring the major uses.

Just forget crafting. That is simply not how it is done.

Ok. Will test crafting mechanics later and separetly.

So, which kind of data structures should i test next? Also, would be nice to have current AE structure tested, just for comparison.

yueh commented 8 years ago

No, tested in same conditions. Also, for each, i ran it a few times (3-5) to ensure that all results are ~ the same (and they were).

HashMap is usually one of the fastest generic maps. Finding a slightly faster one usually requires some specialised maps, but these do not provide any meaningful improvment. The only exception is, if the code sucks. Then a HashMap will perform similar to an array. Looking at the numbers, it is similar to an array.

Ok. Will test crafting mechanics later and separetly.

No not later. Completely forget it. It is like improving the flight performance of a car. Just useless.

So, which kind of data structures should i test next? Also, would be nice to have current AE structure tested, just for comparison.

There is no point in doing synthetic benchmarks for a mostly irrelvant use case with a clearly broken implementation fudging the results.

Elix-x commented 8 years ago

HashMap is usually one of the fastest generic maps. Finding a slightly faster one usually requires some specialised maps, but these do not provide any meaningful improvment. The only exception is, if the code sucks. Then a HashMap will perform similar to an array. Looking at the numbers, it is similar to an array.

Yeah, i don't know either what's up with that.

There is no point in doing synthetic benchmarks for a mostly irrelvant use case with a clearly broken implementation fudging the results.

Ok, what are we going for? What's our target average time / IO? What's current AE system average time / IO?

yueh commented 8 years ago

Yeah, i don't know either what's up with that.

I blame groovy. I actually have absolutely no knowledge about it, but it adds simply to many moving parts, who could behave differently than java.

Ok, what are we going for? What's our target average time / IO? What's current AE system average time / IO?

Average time is pointless. There is this thing called Big O notation and as long as you did not prove that your algorithm cannot be futher improvement in terms of the its class, there is no point in any other benchmark. You cannot beat moving from a linear runtime to a logarithmic one by reducing the average runtime by 5%.

AE has in most use cases O(log n) for inventory operations. Yes, some could be O(1) with a different approach, but they cost other common use cases to be at least O(n). But O(log n) in average is still way better than one half O(1) and the other O(n).

Also 1.5k items is way too less. Vanilla minecraft and AE2 have nearly 8k items together in 1.7.10. With 1.10 I would not be surprised, if this is about 9k or 10k. So you should at least target this amount or even more. With other mods it should easily be 50k items or more. The first case requires like 10-20 drives, which is really not difficult to get.

Elix-x commented 8 years ago

I blame groovy. I actually have absolutely no knowledge about it, but it adds simply to many moving parts, who could behave differently than java.

It is based on java, and all it does to java is adding new syntax. Also, all method calls go via reflection (unless class is @StaticCompile, which will compile direct access bytecode), so results here can be at least divided by 8 to see what ~ will java do.

Just looked it up: while groovy does modifications to Map interface, it does not do anything with HashMap.

Average time is pointless. There is this thing called Big O notation and as long as you did not prove that your algorithm cannot be futher improvement in terms of the its class, there is no point in any other benchmark. You cannot beat moving from a linear runtime to a logarithmic one by reducing the average runtime by 5%.

AE has in most use cases O(log n) for inventory operations. Yes, some could be O(1) with a different approach, but they cost other common use cases to be at least O(n). But O(log n) in average is still way better than one half O(1) and the other O(n).

Yes, know about big O. And we're trying to make all I/O operations O(log n), O(1) if possible without making other operations be O(n). Correct?

yueh commented 8 years ago

O(1) is always the goal. O(n) might be fine, if this is just a read access and could easily be cached without too much memory usage. But that is really a case by case decision.

Currently there are simply nothing left to improve with the already existing collection itself. The problems are mostly about the needed self cleaning, way to many rebuilds of the collection and that fluids and items share the same approach, while we are doing everything to avoid it manually instead of just letting the compiler handle it.

The Map<Stack,int> is also particularly bad. To add or update record it has do to:

if(map.containsKey(stack)) {
  entry = map.get(stack)
  map.put(stack, entry + stack.size)
} else (
  map.put(stack, stack.size)
)

contains, get and put would on average be O(log n) and worst case O(n). While a better map can easily do an update in 1 map operation and add in 2 instead of 3 for update and 2 for put. Especially as update is way more common. Being able to reduce that by 66% is a pretty good after the collection itself is no longer improvable.

Elix-x commented 8 years ago

Yes, but if we use collection, we need to add to our handler amount field which in't taken into account by hashCode and equals.

Concerning code: -First -33%:

Integer i = map.get(stack)
if(i == null) map.put(stack, stack.size)
else map.put(stack, i + stack.size)

-Second -33%: pure int can easily be replaced with a mutable wrapper to avoid putting it back.

yueh commented 8 years ago

Regardless, this is just wasting time. You try pretty much to just reimplement the current code, potentially adding more performance issues but not fixing any of the important ones.

Elix-x commented 8 years ago

Regardless, this is just wasting time. You try pretty much to just reimplement the current code, potentially adding more performance issues but not fixing any of the important ones.

By important ones, you mean these?

The problem are really the caches which have to constantly updated.

> The problem is then that the caching layer tries to deduce changes happening to the actual itemhandlers, instead of them propagating an event. This causes a network to send a "1 cobble stored in the network" event, even if that cobble is forwarded to another network, which also sends the same event again, which reaches the first network, because that needs to be informed about changes should the item not enter through the first network. So just chain some networks together and the first one can easily recieve 10 or more events about a cobble being added. So no chance to apply this as delta. Instead it has to rebuilt the whole cache. Once per event... > The issue then is, would this be using the real network, it would just have destroyed 1 log and inserted 1 plank. So it should never happen. Therefore it currently creates a full copy of the inventory, performance wise the worst option. With transaction it can simply use the same inventory, no need to copy it. Also changes to the network will still be visible locally so you can use it easily to determine missing items. Should everything be available commit, else rollback (or actually just throw the simulation away). > The snapshot would just reduce the complixity as you do not have to account for changes on the actual inventory. Any more?

yueh commented 8 years ago

Mostly the cache invalidation, that we cannot split fluids and items to use the best approach for each, absolutely everything has currently to handle fluids and items, even when it just cares about one and so on.

Not the inventory collection is the issue, just how it is used and how we try to shove contrary things into the same structure. If you deal with O(m * n * log n) there is no point in making the log n part 5% faster before solving the O(m * n) issue

Elix-x commented 8 years ago

Mostly the cache invalidation, that we cannot split fluids and items to use the best approach for each, absolutely everything has currently to handle fluids and items, even when it just cares about one and so on.

When one should invalidate it and when should other? Which part of cache?

Not the inventory collection is the issue, just how it the are used and how we try to shove contrary things into the same structure. If you deal with O(m * n * log n) there is no point in making the log n part 5% faster before solving the O(m * n) issue

Agree, but where's m*n part (it's there if we keep Array[Map] structure, but we definetly not)?

Net access from task - O(1)
Getting type of thing to store / extract (getClass() / .class) - O(1)
Retrieving storage from type map - in worst case of worst map O(log n)
Retrieving amount of items from storage - in worst case of worst map O(log n)

So, O( 1 * 1 * log n * log n ) == O((log n)^2). [Which is still better than O(n).](http://www.google.com/search?q=y=log x, y=%28log x%29^2, y=x)

yueh commented 8 years ago

I would recommend reading the code and understanding it. Not just the stack collections. Everything inventory related. Otherwise this discussion will just be fruitless.

MeetPopSikAL commented 7 years ago

This all breaks my heart. I cannot be the only programmer to lurk through why AE2 is not running prod binaries on 1.10.2/x... ugh. I'd love to spend my time helping where a I can to get this project done, but it seems like this project will never finish. Why spend my time here? We need to ship a stable mod in the time that kids/people will play it... and have it in mod packs. I am pretty sure AE missed the new FTB pack. Why do we care so much about new stuff if we can't make a binary package for people to use?

This thread makes so much sense but this is never going to help anyone at all on the planet ever because the most amazing fluid interface is never going to work because the mod it depends on was never shipped as a real world tested compiled program.

dpeter99 commented 7 years ago

@MeetPopSikAL https://github.com/AppliedEnergistics/Applied-Energistics-2/tree/1.10

AEModernMCPort / Applied-Energistics-3-Fork

Fluid Integration #80

IO

Store

Craft