AlexIIL / SimplePipes

A small test pipes mod based around LibBlockAttributes
Mozilla Public License 2.0

Some random things, infos & suggestions #30

Open Speiger opened 4 years ago

Speiger commented 4 years ago

https://github.com/AlexIIL/SimplePipes/blob/0.2.x-1.14.x/src/main/java/alexiil/mc/mod/pipes/blocks/BlockPipe.java#L134 The last parameter is "isMoving", which is used in removeBlock (Forge mappings). Edit: isMoving indicates whether the block is currently being pushed by a piston.

https://github.com/AlexIIL/SimplePipes/blob/0.2.x-1.14.x/src/main/java/alexiil/mc/mod/pipes/blocks/TileTrigger.java#L23 Suggestion: remove the TileEntity from the client-side ticking list; that way this check is not needed.

https://github.com/AlexIIL/SimplePipes/blob/0.2.x-1.14.x/src/main/java/alexiil/mc/mod/pipes/blocks/TileTrigger.java#L26 Doesn't TileEntity have a cache of its own BlockState? (Performance improvement)

https://github.com/AlexIIL/SimplePipes/blob/0.2.x-1.14.x/src/main/java/alexiil/mc/mod/pipes/blocks/PipeFlowItem.java#L160 Since BC was your inspiration, why did you move back to a CompoundNBT? You have so much networking overhead with that one, not because it's a CompoundNBT but because of the "keyStrings". If you want to keep the compound, my suggestion would be: channel everything into a ByteBuf (or ByteBuffer), then compress it, then put the resulting byte array into the CompoundNBT. That way you have the least overhead and even get compression. (Networking is the biggest issue with this kind of mod.)
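A minimal sketch of what that could look like (my own illustration, not code from either mod; it assumes yarn's CompoundTag, Netty's ByteBuf, and java.util.zip, with placeholder fields):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;

import net.minecraft.nbt.CompoundTag;

public final class CompressedSync {
    /** Packs some example fields into a raw buffer (no NBT key string per
     *  field), deflates it, and stores only the compressed bytes under a
     *  single short key. The fields here are placeholders. */
    public static CompoundTag write(int stackId, int side, int colour) throws IOException {
        ByteBuf buf = Unpooled.buffer();
        buf.writeInt(stackId);
        buf.writeByte(side);
        buf.writeByte(colour);

        byte[] raw = new byte[buf.readableBytes()];
        buf.readBytes(raw);

        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(compressed, new Deflater(Deflater.BEST_SPEED))) {
            out.write(raw);
        }

        CompoundTag tag = new CompoundTag();
        tag.putByteArray("d", compressed.toByteArray()); // one key for the whole payload
        return tag;
    }
}
```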

Yeah, I should stop nitpicking at all the things. I just started looking into this because I am interested in transportation in general (sponging knowledge).

AlexIIL commented 4 years ago

Unlike BuildCraft (which I put a lot more effort into) I don't really care about the performance of SimplePipes - it's a quick copy-paste of BuildCraft to check how viable porting BC itself to Fabric is. (And as such I don't want to put much time into optimising it, as I'd rather spend that time porting or optimising BuildCraft itself.)

However, as some things you've raised apply to BuildCraft too (including #29), I see no reason not to at least answer you (even though it's very unlikely that I'll make any of these changes to SimplePipes).

The last parameter is "isMoving", which is used in removeBlock (Forge mappings)

I was going off the 1.14 yarn mappings - which didn't have parameter names at the time. However, I did pick up on this during my 1.15 port, as they've added param names to yarn (even though I haven't pushed it yet... I will soon!)

Suggestion: remove the TileEntity from the client-side ticking list; that way this check is not needed.

I don't disagree with that... however:

  1. I don't want to touch minecraft objects more than I have to, as it might make future ports more complicated, and it might not work well with other mods.
  2. As the BlockEntity isn't needed on the client at all, it might make more sense to just return null in BlockTrigger.createBlockEntity(view), as sketched below: `return (view instanceof World && ((World) view).isClient()) ? null : new TileTrigger();`
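
Spelled out as an override (the same one-liner, shown as a method sketch; names follow the linked SimplePipes classes):

```java
// Sketch only: returning null on the logical client means the block entity
// never exists there, so it never joins the client ticking list at all.
@Override
public BlockEntity createBlockEntity(BlockView view) {
    return (view instanceof World && ((World) view).isClient()) ? null : new TileTrigger();
}
```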

Doesn't TileEntity have a cache of its own BlockState? (Performance improvement)

Yes! And I can't think of any reason not to use it. (Except for a possible worry that it might not get reset correctly all the time, but that seems unlikely as it's built into vanilla).

Since BC was your inspiration, why did you move back to a CompoundNBT?

At the time I hadn't written LibNetworkStack, which meant the quickest way to make it work (at all) was to re-use the BlockEntity.fromClientTag() networking system, and that means it's simplest to just write everything to NBT. (And it makes debugging easier, so bugs are less likely.)

Speiger commented 4 years ago

Ahhh good to know. And the points make some sense.

While I am at it, I found an interesting flaw in the BC/SimplePipes transfer system. The DelayList is a good CPU performance gain, but it is heavy on the networking side, to the point where I think that even with optimizations it's not viable in its current implementation. In my own tests I managed to get my references down to the range of 6-14 bytes, and essential updates down to 2 bytes (essentials are: AlreadyAttempted Directions (6 bits), CurrentDirection (3 bits), Color (5 bits), Inserting (1 bit), which comes down to 15 bits); that of course includes 1-5 bytes for the id. Resending the items every time they hit the center or are inserted into another pipe is, in the best case, 3x or, in the worst case, 7x more networking usage with the DelayList, since you are resending the items every time (BuildCraft does this too). A potential solution is to have a stored "AwaitingInstructionsItems" queue that keeps items alive for 2-5 game ticks (a DelayedList might be useful, with a Map for faster access), so you can send an essentials info packet that just batches together all the necessary data to reuse the client item without resending all of it. If the item is no longer present, you can request the full item from the server.
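
As an illustration of that 15-bit packing (my own guess at the field layout, not code from either mod):

```java
// Pack the "essential update" fields into 15 bits, so the whole update
// fits in two bytes on the wire (plus the item id).
final class EssentialUpdate {
    static short pack(int attemptedDirs /* 6 bits */, int currentDir /* 3 bits */,
                      int colour /* 5 bits */, boolean inserting /* 1 bit */) {
        int bits = attemptedDirs & 0x3F;
        bits |= (currentDir & 0x7) << 6;
        bits |= (colour & 0x1F) << 9;
        bits |= (inserting ? 1 : 0) << 14;
        return (short) bits; // 15 bits used out of 16
    }

    static int attemptedDirs(short packed)  { return packed & 0x3F; }
    static int currentDir(short packed)     { return (packed >> 6) & 0x7; }
    static int colour(short packed)         { return (packed >> 9) & 0x1F; }
    static boolean inserting(short packed)  { return ((packed >> 14) & 1) != 0; }
}
```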

Also, VarInt might be a good fit (I think it exists in 1.12 too), though you have to be careful: negative numbers have the 5-byte penalty, while values that fit in the first 21 bits are better than or equal to a plain int in networking usage.
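
For reference, the usual Minecraft-style VarInt write loop looks roughly like this (a generic sketch, not code from either mod); the unsigned shift is why every negative int pays the full 5 bytes:

```java
import io.netty.buffer.ByteBuf;

final class VarInts {
    // 7 payload bits per byte, high bit = "more bytes follow". Values up to
    // 2^21-1 fit in 3 bytes; any negative int needs all 5 groups, because
    // the unsigned shift keeps its high bits set.
    static void writeVarInt(ByteBuf buf, int value) {
        while ((value & ~0x7F) != 0) {
            buf.writeByte((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buf.writeByte(value);
    }
}
```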

I hope that makes some kind of sense.

AlexIIL commented 4 years ago

I think it's probably a good idea if I copy-paste what I'm doing for BuildCraft at the moment, since SimplePipes is basically completely unoptimised.

```java
public void readTravellingItem(NetByteBuf buffer, IMsgReadCtx ctx) throws InvalidInputDataException {
    ctx.assertClientSide();
    int stackId = buffer.readVarInt();
    // The server sends the stack before sending this packet, if it hasn't already sent it to the client.
    // Auto-removal isn't handled yet, but it could be added in the future.
    ItemStack stack = McNetworkStack.CACHE_ITEMS_WITHOUT_AMOUNT.getObj(ctx.getConnection(), stackId);
    // I'm not quite sure why this isn't "readFixedBits(6)" - I'll probably change this now.
    // Although it could technically just be 1-5, as the client never actually displays the
    // exact count, just rounded a bit for rendering.
    int count = buffer.readUnsignedShort();
    TravellingItem item = new TravellingItem(stack, count);
    // NetByteBuf compacts this (plus "readFixedBits" and "readEnumConstant") down to only the
    // number of bits that you actually need... so this isn't as inefficient as it first appears
    // (as netty by default writes a whole byte per boolean).
    item.toCenter = buffer.readBoolean();
    // This will only be 3 bits.
    item.side = buffer.readEnumConstant(Direction.class);
    // This will only be 1 or 5 bits (depending on if the first bit is false (null) or true (for a colour)).
    item.colour = MessageUtil.readEnumOrNull(buffer, DyeColor.class);

    // This is used for synchronising the client + server ticks together so that it always renders
    // *perfectly*, without travelling items vanishing for a tick occasionally.
    // However this is probably going to always be the value that the server sent plus one,
    // so it might not be necessary to even send this.
    item.tickStarted = buffer.readVarLong();
    // Again, quite a large range, and it probably doesn't have to be this big.
    item.timeToDest = buffer.readUnsignedShort();
    item.tickFinished = item.tickStarted + item.timeToDest;

    int nextTick = (int) (item.tickFinished - lastTick);
    items.add(nextTick - 1, item);
    // Some debug logging left out.
}
```

On top of that each packet will write out:

  • its libnetworkstack ID (an integer: 4 bytes, although this could probably be changed to a var int, as it's generally unlikely that you'd ever get millions of packet types).
  • the pipe's position (3 var ints, not 2 plus a byte, so that we get cubic chunks compat).
  • the index of the pipe part in the multipart container (1 byte) (which is technically not needed, because you can only ever get one pipe per blockpos).

However that is usually optimised-out, because libnetworkstack buffers a lot of packets together until the end of the tick.

All of these taken together add a fair bit of overhead (8-13 bytes). However that's not to say it's not useful to optimise the pipe networking itself.

Anyway, on to the response:

AlreadyAttempted Directions(6 bits)

The client doesn't use this, so it's not sent.

The DelayList is a good CPU performance gain, but it is heavy on the networking side, to the point

In old BC the server only sent packets once per pipe (I think - I haven't really looked too hard); however that's not possible with my design of pipes, as the server only knows which direction the item will take out of the pipe after it reaches the centre, not as soon as it enters (like in 1.7).

In the future I'd like to look into "grouping" pipe sections together (for example if you've got 16 quartz pipes only connected in a line (or corners - just no 3-plus connections in a single pipe)), and then only send packets once for the entire length, rather than once for every half-block length. But that needs a lot more work to accomplish properly, so I'm probably going to leave it alone for now.

I think that's it? (Or at least, does that answer some of your points?)

Speiger commented 4 years ago

Huh, interesting approach. Why did you drop the Supplier from BC 1.12? It would be just a holder to ensure that, if the item isn't present, the server gets time to deliver it.

  • its libnetworkstack ID (an integer: 4 bytes, although this could probably be changed to a var int, as it's generally unlikely that you'd ever get millions of packet types).
  • the pipe's position (3 var ints, not 2 plus a byte, so that we get cubic chunks compat).
  • the index of the pipe part in the multipart container (1 byte) (which is technically not needed, because you can only ever get one pipe per blockpos).

However that is usually optimised-out, because libnetworkstack buffers a lot of packets together until the end of the tick.

Ok, yeah, that's interesting. Though for the X & Z coordinates you should use a normal int, due to the "negative bit" penalty, which X & Z have no matter what (unless you play in a positive-only world): since it's the top bit, big/little endian doesn't help. Also, instead of sending over the "start & runtime", why not send over the speed in the beginning and calculate the time of destination yourself? Assuming it's a byte, 1-100 should be more than enough room. Since the server clock syncs the client clock anyway, the only difference you get is a lag spike, which would fix itself the moment the item hits the next segment. That would kill 9 bytes right out of the gate.
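
As a rough sketch of that idea (illustrative only; the constant and the speed scale are assumptions, not SimplePipes/BC values), the client could derive the arrival tick itself:

```java
// Derive the arrival tick client-side from a 1-byte speed, instead of
// receiving tickStarted (var-long) + timeToDest (short) over the wire.
final class ArrivalEstimate {
    static long arrivalTick(long clientTickNow, byte speed, int ticksPerPipeAtFullSpeed) {
        int s = Math.max(1, Math.abs((int) speed));          // speed in 1..100
        int timeToDest = ticksPerPipeAtFullSpeed * 100 / s;  // slower speed -> longer travel
        return clientTickNow + timeToDest;                   // resyncs at the next segment anyway
    }
}
```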

AlreadyAttempted Directions (6 bits)

The client doesn't use this, so it's not sent.

Eh, yeah, that's true. It might not be needed. I use it right now to determine whether the item should be killed right out of the gate on the client, so the packet gets saved.

In old BC the server only sent packets once per pipe (I think - I haven't really looked too hard); however that's not possible with my design of pipes, as the server only knows which direction the item will take out of the pipe after it reaches the centre, not as soon as it enters (like in 1.7).

Yeah, I was fully aware of that fact, though 1.12 seems to have at least a server queue, on a per-tick basis.

In the future I'd like to look into "grouping" pipe sections together (for example if you've got 16 quartz pipes only connected in a line (or corners - just no 3-plus connections in a single pipe)), and then only send packets once for the entire length, rather than once for every half-block length. But that needs a lot more work to accomplish properly, so I'm probably going to leave it alone for now.

My solution was to have a packet batcher on a chunk basis, so I just send 1 packet per chunk (so only tracking players receive it) and can use 12 bytes + (2 bytes pipeCount) + (ItemData (3 bytes) * ItemCount) + (stacked-size overhead). Which, if I have 100 items in total across 10 pipes, is 233 bytes without overhead, plus 11 bytes of overhead due to the array-size transfer. (Note that this is update data, not fresh data; fresh would be a bit more, since it has 6-14 bytes per item instead of 3-7.)

And yes, that answered a lot of my questions, and this might already improve my logic a bit more. Thanks for the tips. I hope that what I provided helps give a few new angles.

Edit: fixed my math calculation since I missed a 0 ^^" Now it's accurate.

AlexIIL commented 4 years ago

Huh, interesting approach. Why did you drop the Supplier from BC 1.12? It would be just a holder to ensure that, if the item isn't present, the server gets time to deliver it.

The server sends the packet individually, so it stores the "item -> id" cache per player, which means it can just send the id -> item mapping if it hasn't already sent it to that client. (Having a request/response system would show the default item for the round-trip time: which is generally ok in singleplayer, but much worse over the internet, as it's potentially noticeable.)
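
A minimal illustration of that kind of per-connection cache (my own sketch; the real logic lives in LibNetworkStack's caches, such as the McNetworkStack.CACHE_ITEMS_WITHOUT_AMOUNT used in the code above):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

// One instance per client connection: maps values to ids, and tells the
// caller when an "id -> value" mapping still has to be sent to that client.
final class PerClientIdCache<T> {
    private final Map<T, Integer> ids = new HashMap<>();

    /** Returns the id already known to this client, or assigns a fresh one
     *  and invokes the callback so the server can send the mapping first. */
    int getOrAssign(T value, IntConsumer sendMappingFirst) {
        Integer existing = ids.get(value);
        if (existing != null) {
            return existing;
        }
        int id = ids.size();
        ids.put(value, id);
        sendMappingFirst.accept(id);
        return id;
    }
}
```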

Ok, yeah, that's interesting. Though for the X & Z coordinates you should use a normal int, due to the "negative bit" penalty, which X & Z have no matter what

...ah. I hadn't noticed that before; I'll have to fix that :/ (Fortunately it's a small change to LNS to make this work properly.)

Also, instead of sending over the "start & runtime", why not send over the speed in the beginning and calculate the time of destination yourself?

I'll try it later :)

(Although just sending the "runtime" should be smaller than "speed" as speed is a double, rather than a byte/short).

My solution was have a packet batcher on a chunk basis so I just send 1 packet per chunk...

That would make sense! I do it by batching all LNS packets into one per player per tick, as then we never run into a situation where we are trying to send fewer bytes than the minimum per Ethernet frame (about 46 bytes?) and still write out multiple packets after that. (Although this only applies to versions of Minecraft that flush after every packet - which caused quite a slowdown when I didn't batch packets.)

Speiger commented 4 years ago

The server sends the packet individually, so it stores the "item -> id" cache per player, which means it can just send the id -> item mapping if it hasn't already sent it to that client. (Having a request/response system would show the default item for the round-trip time: which is generally ok in singleplayer, but much worse over the internet, as it's potentially noticeable.)

That could be solved by sending, on initial login, a packet of all known items in the first place, and no, that isn't much network bandwidth, for one reason: MC already sends all itemstacks for all recipes on load, and for all tags too. So sending a list of varInts & ItemStacks in the beginning, which can be much smaller than the recipes, is not that much. (Though it might be enough to push connections that are close to timing out over the edge.)

I'll try it later :) (Although just sending the "runtime" should be smaller than "speed" as speed is a double, rather than a byte/short).

Yeah, why isn't it a byte in the first place? Shouldn't 256 values be enough? I mean, -100 to 0 is centering and 0 to 100 is leaving the center. That should be more than enough, since BC already caps anything bigger than 128 (a byte). And I use -100, 0, 100 so that the speed can never overshoot and cause repeats. (Also, values are converted to int in the math calculation, because of Java & Math.min.)

That would make sense! I do it by batching all LNS packets into one per player per tick, as then we never run into a situation where we are trying to send fewer bytes than the minimum per Ethernet frame (about 46 bytes?) and still write out multiple packets after that. (Although this only applies to versions of Minecraft that flush after every packet - which caused quite a slowdown when I didn't batch packets.)

Huh. Interesting. Yeah, there are multiple approaches. I think a combo solution would be good here too: instead of sending packets per chunk, batch by chunk in the beginning, but when it is sent to players, combine on a per-player basis. Why? Chunk batching would be faster on the TileEntity side, because you are only doing Map.get() with bit-shifted numbers. To support cubic chunks you could easily use an Int2ObjectMap instead of a ShortToObjectMap (the first 8 bits are X & Z, 24 bits for the Y level; I doubt anyone will exceed 16.8 million height). And when the sorting per player comes into play, you only need to iterate over the chunk map once, use the PlayerList.TrackedChunk entry to get the players from there, and just add the maps into player-specific lists. (Of course in 1.14 it's different, but you get the point.) That way you have double batching.
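
A small sketch of that packed key (the exact bit layout is my reading of the description above, not code from either mod); a single fastutil Int2ObjectMap.get(int) on such a key would then resolve a block entity inside one chunk:

```java
// 4 bits chunk-local X + 4 bits chunk-local Z + 24 bits Y (2^24 = ~16.8 million),
// all packed into one int so the per-chunk map lookup is a plain int key.
final class LocalKeys {
    static int pack(int localX, int localZ, int y) {
        return (localX & 0xF) | ((localZ & 0xF) << 4) | ((y & 0xFFFFFF) << 8);
    }

    static int localX(int key) { return key & 0xF; }
    static int localZ(int key) { return (key >> 4) & 0xF; }
    static int y(int key)      { return key >>> 8; }
}
```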