Open TheJohnLong opened 8 years ago
CrystalNetwork CME...
How do the Crystal Repeaters check to see if they are connected to their respective node?
Seems to happen after exploring a different dimension, all repeaters disconnect from their pylons,
added mods list
Opening a single player world to LAN seems to cause the first error (NPE)
All problems cease to occur after moving to a dedicated server enviroment
Is Optifine installed? Is its MultiThreaded Chunkloading enabled?
Optifine is installed, multi-core chunk loading is not enabled
Try removing Optifine as a test.
This error still occurs without OptiFine
Everytime I jump to a different Dimension be it CC Dim, Nether, End the Repeater network dies, same exact error messages(I am not sure if Spawn Chunk Pylons are loaded at all, haven't had much time to test)
I haven't had any crashes but I have had a few repeater connectivity issues. V16c. I have a chain of Zambarau repeaters linked to a pylon ~700m out, and rather inconsistently, if I whack a Repeater with a Manipulator I get the 'no connection' message. There may be an interaction with when a pylon is depleted, but it gets confusing when a Repeater tells me 'no connection'... while it is actively transferring energy. Which has happened with the same chain.
Then there was another really weird issue I had with the same repeater chain where I made the chain, and it connected, and I did some Rituals with the power, and then a few days and several (SSP) server restarts later, the chain wouldn't connect, the pylon was full, and I couldn't get the first repeater to connect until I broke it and moved it one block closer to the pylon.
Oh, I also got a CME crash from the pylon network. Pretty sure it was related to a Casting Table trying to draw from the net... and that same damn Zambarau line. http://hastebin.com/awoxeqapiy.php
I hit four of these CME crashes last night and I think I found a cause. Related points: world loading/unloading, Pylon chunkloading when tapped; having 2 pylons feeding the same Casting Altar.
When first loading into the world (SSP), only the pylons close to my base can be tapped by the net. In order to tap the rest, I need to quickly fly out and chunkload them (while the Altar is requesting that color power). After that, the pylons stay loaded.
However, if there are two pylons of the same color both attached to the network, either both loaded or one loaded already and one loaded later,
TL;DR: Two pylons of the same color attempting to feed power into a single Casting Altar causes CME. (Example: http://pastebin.com/k3gPKkTe)
Copied post from https://github.com/ReikaKalseki/Reika_Mods_Issues/issues/1299 as searching for the error did not reveal any matching issue in github, so let's add it here so searching actually succeed:
Chromaticraft/DragonAPI/ReikaOtherMods Version 16d.
Spent some time figuring out a way to reliably reproduce this, got it hitting every time with only Reika's mods, thaumcraft, some thaumcraft addons, NEI, and railcraft installed.
I can reproduce this by connecting to pylons that are outside of the player loaded range, loading up the single player game, trying to craft something. It does not reproduce if I visit each of my connected pylons first, but until they are visited this will happen about 50% of the time when crafting a recipe that uses lumen energy. Interestingly the times that it does not crash then the colors that I did not visit will not work until they are visited (to within the player loaded distance) then continue to work without issue until the game is restarted. Attempted a variety of things to fix it around testing chunkloaders at the pylons, at the pylons and mostly along the repeater paths to no avail, it seems that I have to either visit the pylons or at least visit the path of repeaters along the whole distance. I've yet been unable to verify that every repeater has gotten chunkloaded so that may be the issue, but everything I've read about this implies that no chunkloading of anything should be necessary, and indeed thus far chunkloading has not resolved the issue.
From looking over the file https://github.com/ReikaKalseki/ChromatiCraft/blob/master/Magic/Network/CrystalNetworker.java it appears that the class field of tiles is mutated while iteration is happening. At the crash happened on https://github.com/ReikaKalseki/ChromatiCraft/blob/master/Magic/Network/CrystalNetworker.java#L608 then the only lines that could cause that are either https://github.com/ReikaKalseki/ChromatiCraft/blob/master/Magic/Network/CrystalNetworker.java#L612 (te.canConduct()
) or https://github.com/ReikaKalseki/ChromatiCraft/blob/master/Magic/Network/CrystalNetworker.java#L613 (te.isConductingElement(e)
) and both of those call in to the CrystalSource
interface, which ends up branching into a variety of other classes, and this is assuming that this whole thing is not multi-threaded (which if it is then it is sorely missing some synchronization primitives). The TileEntityCache
from DragonAPI also does nothing to ensure that concurrent access cannot happen (changing a collection while it is being iterated), for example its keyset returns a immutable view to the main data source but does not copy it nor set any flag to ensure that changes cannot be made while the iteration is in progress. I stopped researching here however, it is not my code. ^.^
Stacktrace:
Crash Report ----
// Surprise! Haha. Well, this is awkward.
Time: 1/14/17 8:23 PM
Description: Ticking block entity
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1042)
at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1042)
at Reika.ChromatiCraft.Magic.Network.CrystalNetworker.getAllSourcesFor(CrystalNetworker.java:608)
at Reika.ChromatiCraft.Magic.Network.PylonFinder.anyConnectedSources(PylonFinder.java:151)
at Reika.ChromatiCraft.Magic.Network.PylonFinder.findPylon(PylonFinder.java:136)
at Reika.ChromatiCraft.Magic.Network.PylonFinder.findPylon(PylonFinder.java:125)
at Reika.ChromatiCraft.Magic.Network.CrystalNetworker.makeRequest(CrystalNetworker.java:261)
at Reika.ChromatiCraft.Magic.Network.CrystalNetworker.makeRequest(CrystalNetworker.java:248)
at Reika.ChromatiCraft.Base.TileEntity.CrystalReceiverBase.requestEnergy(CrystalReceiverBase.java:83)
at Reika.ChromatiCraft.Base.TileEntity.CrystalReceiverBase.requestEnergy(CrystalReceiverBase.java:94)
at Reika.ChromatiCraft.Base.TileEntity.CrystalReceiverBase.requestEnergyDifference(CrystalReceiverBase.java:101)
at Reika.ChromatiCraft.TileEntity.Recipe.TileEntityCastingTable.onCraftingTick(TileEntityCastingTable.java:258)
at Reika.ChromatiCraft.TileEntity.Recipe.TileEntityCastingTable.updateEntity(TileEntityCastingTable.java:151)
at Reika.DragonAPI.Base.TileEntityBase.func_145845_h(TileEntityBase.java:490)
at net.minecraft.world.World.func_72939_s(World.java:1939)
at net.minecraft.client.Minecraft.func_71407_l(Minecraft.java:2006)
at net.minecraft.client.Minecraft.func_71411_J(Minecraft.java:973)
at net.minecraft.client.Minecraft.func_99999_d(Minecraft.java:898)
at net.minecraft.client.main.Main.main(SourceFile:148)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at net.minecraft.launchwrapper.Launch.launch(Launch.java:135)
at net.minecraft.launchwrapper.Launch.main(Launch.java:28)
A detailed walkthrough of the error, its code path and all known details is as follows:
---------------------------------------------------------------------------------------
-- Head --
Stacktrace:
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1042)
at java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1042)
at Reika.ChromatiCraft.Magic.Network.CrystalNetworker.getAllSourcesFor(CrystalNetworker.java:608)
at Reika.ChromatiCraft.Magic.Network.PylonFinder.anyConnectedSources(PylonFinder.java:151)
at Reika.ChromatiCraft.Magic.Network.PylonFinder.findPylon(PylonFinder.java:136)
at Reika.ChromatiCraft.Magic.Network.PylonFinder.findPylon(PylonFinder.java:125)
at Reika.ChromatiCraft.Magic.Network.CrystalNetworker.makeRequest(CrystalNetworker.java:261)
at Reika.ChromatiCraft.Magic.Network.CrystalNetworker.makeRequest(CrystalNetworker.java:248)
at Reika.ChromatiCraft.Base.TileEntity.CrystalReceiverBase.requestEnergy(CrystalReceiverBase.java:83)
at Reika.ChromatiCraft.Base.TileEntity.CrystalReceiverBase.requestEnergy(CrystalReceiverBase.java:94)
at Reika.ChromatiCraft.Base.TileEntity.CrystalReceiverBase.requestEnergyDifference(CrystalReceiverBase.java:101)
at Reika.ChromatiCraft.TileEntity.Recipe.TileEntityCastingTable.onCraftingTick(TileEntityCastingTable.java:258)
at Reika.ChromatiCraft.TileEntity.Recipe.TileEntityCastingTable.updateEntity(TileEntityCastingTable.java:151)
at Reika.DragonAPI.Base.TileEntityBase.func_145845_h(TileEntityBase.java:490)
at net.minecraft.world.World.func_72939_s(World.java:1939)
-- Block entity being ticked --
Details:
Name: CCchroma.table // Reika.ChromatiCraft.TileEntity.Recipe.TileEntityCastingTable
Block type: ID #2005 (tile.ChromaticTile // Reika.ChromatiCraft.Base.BlockChromaTile)
Block data value: 3 / 0x3 / 0b0011
Block location: World: (21,70,316), Chunk: (at 5,4,12 in 1,19; contains blocks 16,0,304 to 31,255,319), Region: (0,0; contains chunks 0,0 to 31,31, blocks 0,0,0 to 511,255,511)
Actual block type: ID #2005 (tile.ChromaticTile // Reika.ChromatiCraft.Base.BlockChromaTile)
Actual block data value: 5 / 0x5 / 0b0101
Stacktrace:
at net.minecraft.client.Minecraft.func_71407_l(Minecraft.java:2006)
I found other locations where the TileEntityCache is accessed and mutated with no checks being done of re-entrancy. If re-entrancy is detected (say via a flag) then mutations should be cached for adding afterwards once the re-entrancy is exited, that should fix the issue. Or better yet use an immutable data structure since this is not a data structure that changes often anyway, would give faster lookups as well if structured properly.
I hadn't had this issue for a while, but now I'm having it again since I'm trying to make something that takes all of the colors.
@OvermindDL1 Due to MC's singlethreaded nature, a network modification - which can only happen from setBlock() - should never concur with anything actually using the crystal network; a network request (always done from inside a TileEntity update, and ticked in a TickHandler) simply cannot occur concurrently with another world change unless there are multiple threads simultaneously ticking or modifying things.
It appears that this is mainly happening when I go out to some of my further repeaters, but this shouldn't cause multiple pylons of the same color to be on the network since I don't often use omni repeaters, and I'm pretty sure I don't have any of the same pylon grouped close together.
@OvermindDL1 Due to MC's singlethreaded nature, a network modification - which can only happen from setBlock() - should never concur with anything actually using the crystal network; a network request (always done from inside a TileEntity update, and ticked in a TickHandler) simply cannot occur concurrently with another world change unless there are multiple threads simultaneously ticking or modifying things.
A ConcurrentModificationException has nothing to do with multiple threads (well it could, but in 99% of contexts it does not), and nor does it in this case either. What causes it is:
.next()
on it (or implicitly via a for
statement), this will Boom -> ConcurrentModificationExceptionFrom the looks of it, while the network is being iterated it goes around touching various things, repeaters, pylons, etc... and something, one of those things it touches ends up calling into something that calls into something that then mutates the cache, it then returns back out and the iteration continues, it is at this point Boom
.
A good way to test 'what' is causing it is to catch the CME and print out the last thing you just iterated over (not the current next()
call (the one that crashed), but the last successful next()
result as calling into 'that' thing somehow created this whole situation).
EDIT:
@OvermindDL1 Due to MC's singlethreaded nature, a network modification - which can only happen from setBlock() - should never concur with anything actually using the crystal network; a network request (always done from inside a TileEntity update, and ticked in a TickHandler) simply cannot occur concurrently with another world change unless there are multiple threads simultaneously ticking or modifying things.
Also, I know minecraft's code exceptionally well, I guess I am not recognized. ;-)
Also, from the official docs: https://docs.oracle.com/javase/7/docs/api/java/util/ConcurrentModificationException.html
(Bolding is my emphasis)
This exception may be thrown by methods that have detected concurrent modification of an object when such modification is not permissible. For example, it is not generally permissible for one thread to modify a Collection while another thread is iterating over it. In general, the results of the iteration are undefined under these circumstances. Some Iterator implementations (including those of all the general purpose collection implementations provided by the JRE) may choose to throw this exception if this behavior is detected. Iterators that do this are known as fail-fast iterators, as they fail quickly and cleanly, rather that risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that this exception does not always indicate that an object has been concurrently modified by a different thread. If a single thread issues a sequence of method invocations that violates the contract of an object, the object may throw this exception. For example, if a thread modifies a collection directly while it is iterating over the collection with a fail-fast iterator, the iterator will throw this exception.
Note that fail-fast behavior cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast operations throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: ConcurrentModificationException should be used only to detect bugs.
Still happening on V18b. Though I have visited the relevant pylons (kijani and ykri judging by color, Magnetism ritual) in that session.
Judging by OvermindDL1's stuff and how the crash is avoided, it is caused by the repeater network finding a pylon, starting to draw from it, then/also finding a separate pylon of the same color and trying to draw from that. At a wild guess, it searches simultaneously, finds one pylon and modifies the network, but the second search expects the network to be unmodified and CMEs.
Alternately could be trying to draw leylines twice through the same repeater.
Additionally it happens if network is loaded simultaneously with requests, e.g. first-load flying around to chunkload the pylons while table is crafting something with lumens. It shouldn't be caused by my lumen tree because I spent some time before with it running with no crashing.
Well whaddya know, 2 years later I come back to Chromaticraft and the same issue is still happening, Has any progress been made on fixing it?
@DFliyerz The CHC pylon network code is still re-entrant, which is what causes this error. Reika needs to fix it either by hoisting all calls to not be re-entrant or by re-archetecting it (which the easiest way there would be immutable storage types and then code around that).
re-entrant
I have no idea what that even means.
The general definition of re-entrant is returning back into oneself. For math it is when an angle bends back to the main structure. In programming it refers to recursivity, it is the action of a function A calling function B calling function C calling function # calling back to function A safely and without fault (no mutating of shared data, no global mutable data, etc...). In this specific case it means that the structure that holds the pylon network data is touched by something while that pylon network data is being iterated (like iterating the pylon network, calling into TE's or whatever, and those call back into the pylon network somehow while in mid-iteration), which the java containers detect and thus throw the CME saying that something very very bad was done. This is only an issue with mutable structures, it is not an issue with immutable structures as those are mutated and iterated atomically (I.E. immutable structures have fewer bugs and are faster for iteration but slower for updating, which is a perfect usecase for the pylon network as the 'shape' of the network changes very very rarely but lookups are very common).
I have no idea what you are talking about.
(you should probably give an example)
Here @ReikaKalseki, here is a code version of everything I described above:
╰─➤ cat CMEExample.java
import java.util.*;
public class CMEExample {
static public void main(String[] _args) {
List<Integer> integers = new ArrayList<Integer>();
integers.add(42);
integers.add(16);
integers.add(1);
integers.add(2);
integers.add(3);
int loop = 0;
for (Integer integer : integers) {
System.out.println(loop++);
integers.add(1);
}
}
}
Now you'd think this would infinite loop while printing out an ever increasing number, but instead it prints out:
╰─➤ javac CMEExample.java
╰─➤ java CMEExample
0
Exception in thread "main" java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
at java.util.ArrayList$Itr.next(ArrayList.java:859)
at CMEExample.main(CMEExample.java:12)
It prints out a 0
then it CME's, exact same reason as this issue.
In java (well in any language) if you MUTATE a MUTABLE collection (arraylist, hashmap, whatever) while there is an active ITERATOR then the iterator becomes invalid. In C++ this invalid iterator will crash at debug time with an appropriate message but will silently corrupt data at runtime. In Rust this code won't even compile. In java you get a ConcurrentModificationException.
Now 'Why' this happens is that mutating a mutable collection can incur memory allocation and movement, the iterator is thus pointing to the 'old' memory, which has since been reclaimed (not as in a GC way but as in the content has been destructively moved to the new location), the iterator handles this in whatever language way it wants, in Java this is by throwing a CME.
'How' this happens in Chromaticraft is a few different ways, but it all comes down to a collection being modified while there is an active iterator for it somewhere either deeper in the call tree (the at Reika.ChromatiCraft.Magic.Network.CrystalNetworker.getAllSourcesFor(CrystalNetworker.java:590)
exception location) or via receiving a packet and the information being updated concurrently (like with the t Reika.ChromatiCraft.World.IWG.PylonGenerator.getNearestPylonSpawn(PylonGenerator.java:298)
exception). In all cases this is because the invariant Rule of:
DO NOT MUTATE A MUTABLE COLLECTION WITH AN ACTIVE ITERATOR
Is violated. Sadly the codebase for chromaticraft is structured in such a way so as to make it difficult to follow the calltree instead of using more appropriate patterns.
There are a variety of fixes for this issue:
The Easy Fix: Make all of the collections that have ever appeared in such a report Immutable instead of mutable. Now this does not mean you won't be able to change them, rather it will just mean that you have to replace the collection with an updated one every time you change it, this will keep iterators working on the old instance still valid until they complete and close. This is not a costly method as adding/removing pylons is a very rare task and it will in fact speed up iteration as well.
The Harder Fix: Rewrite or Adjust a good chunk of all of the pylon handling code to make sure that mutations happens to all relevant collections out of band of any tick handler that could potentially access it, such as a followup transaction commit pass. This would be quite fast and will still keep the mutable collections as it seems you are used to them (I've not seen a single sign of an immutable collection in the entire codebase to date, this in itself is a very very bad sign).
The Proper Fix is of course to make those collections immutable as they should have been from the beginning. I'm not sure why mutable collections were chosen to begin with considering how rarely added/removed the pylons are...
I know what a CME is; I simply am not familair with jargon terms like "re-entrant".
Also, I have no idea where the conflicting modification comes in; so many things can modify the network, most of which ultimately get called from vanilla code (eg chunk loading from disk or TileEntity creation and mapping to a position).
Also, I have no idea where the conflicting modification comes in; so many things can modify the network, most of which ultimately get called from vanilla code (eg chunk loading from disk or TileEntity creation and mapping to a position).
Exactly, that's why changing it to be an immutable structure would fix the issue. I.E. to update it (add/remove an element) will involve replacing the whole structure with the updated version, but as stated, pylon counts don't change all that often, and considering iteration would be slightly faster it more than offsets it anyway. :-)
If you are curious as to the cause though, on the next version (perhaps behind a debug config?) you can gate all methods that access it in a flag that you set when any iteration on it is being done and unset when it's not being done, and then throw if that flag is set at that point, that would give you the full stacktrace of where the cause is for that specific instance (could be multiple). Though to fix that would still require either switching to an immutable container or to run a finalization pass by registering a 'task' to perform at the end of the iterations or full tick.
I am not convinced of your claim that the network does not change often enough to make immutable collections expensive. The fact that these CMEs occur regularly - at least for the people who have them - clearly indicates otherwise. And pylons and other repeaters can be loaded in massive numbers as a world is still "fresh" off loading from disk.
I am not convinced of your claim that the network does not change often enough to make immutable collections expensive.
Even if the immutable collection was changed every frame, and considering that java uses either small primitives or pointers to objects in arrays, then adding/removing something every-single-frame accounts for 2 memcpy's, which if it is in the middle of 1024 elements takes... (benchmarking...) 61.016 microseconds on my 9 year old machine with a 761.68 nanosecond error bar. I.E. it's not going to matter even if you do it every frame.
The fact that these CMEs occur regularly - at least for the people who have them - clearly indicates otherwise.
Regularly != every frame, rather it equals once during a gaming session during some of the random time that a pylons loads out of the very very few times that pylons actually load (1024 I'd bet is still a rather high pylon count in the world, but even a million would be fine with immutable collections).
And pylons and other repeaters can be loaded in massive numbers as a world is still "fresh" off loading from disk.
All trivial, even if ten thousand get loaded at a rate of once per frame it would still never be noticed.
The algorithms used are significantly more important in a program than a little memory copying, and immutable collections are O(N) on insertion and removal (because of the memcpy) and O(1) on everything else if it is an array backend (other things like O(N) on insertion and removal and O(N log N) on immutable hash lookups, but still O(1) on iteration).
Wait wait...O(N log N) on immutable hash lookups
as in the hashmap lookups go from O1 (in the current implementation) to that?
Er, my mistype, O(1) on lookups (though slower than arrays). I was thinking trees because of other code I was writing and my brain was on that. ^.^;
In that case I can try immutable collections, but I have no idea how.
In that case I can try immutable collections, but I have no idea how.
If you have any questions feel free to ask, I'm often on IRC as well and can respond as often as work allows me to.
Do you have anything else that is not both incredibly obscure and not amenable to visual aids?
Hmm, IRC is one of the most used communication networks out still so it's not really obscure. As for visual aids, I use google hangouts for personal things, can message me via my expected gmail with my username here.
Interesting info on this error report,
I was away from my computer when the game suddenly just crashed.
I do not have a network setup at all I just unlocked Crystal Repeaters and tore down the last few wooden repeaters I still had up after finally getting one to pop. (I know this is a old version, but error happened a few days before the newer releases)
Interesting thing is the ASM listener trace.(These only show up in MultiMC some strange reason the messages don't appear in the normal minecraft crash log.)
If you want to be lazy you could make "player" pylons always chunkloaded.
I am just moving conversation to this thread so that information is easier to be accessed.
It never happens on dedicated server, for reasons none has been able to determine
@ReikaKalseki
As discussed on issue #2439 , this will not occur on a server hosting Chromaticraft in the modpack. Also the time I first hit the bug in 24c to current version 26b, playing in a singleplayer world, I have at least 108 duplicate CME crash report.
That also implies a potential cross-thread ownership violation as well...
@OvermindDL1
https://pastebin.com/jtM5qKGB Maybe it doesn't crash servers because it is a client sided render thingy
That particular call is, but in general the network is prone to CMEs, and these can be triggered by any access to it.
I have noticed that in sp, unloading chunks with (crc) stuff working in them seems to dramatically increase the chance of a cme
or it happens when a server is being "overloaded" in sp. . . (not sure what load flying at full speed with a maxed out jetpack with jetfuel has, but I think it would be a lot, and it consistently cme's if I just bolt out 2k blocks with it lol)
Well it's been 6 years since I last posted in this thread, but it's definitely still an issue. I've tried loading all the chunks with pylons and it still isn't really helping, but for some reason if something using the repeater network fails and causes a crash it usually works as soon as I launch the game again. Here's one of the logs: crash-2023-04-29_21.58.54-server.txt
Yes, the issue is still an issue, hence why this is still open.
Started happening during world in a custom pack, even when I fly out and check the repeaters(directly next to the pylons), they give the "not connected" sound (The repeaters have connected to the pylons before) I get this error in the logs when this occurs http://pastebin.com/a1cE3MUQ (edit)
A few minutes of faffing around attempting to see if I could get the repeater network to work, and many of the pylons seemed to work now, with the exception of a few(with no changes to any block placement), when I attempted to use the ritual table (via a right click with the elemental manipulator) I was greeted with this crash http://pastebin.com/ESJam5Ws
(edit) Upon world reload, no pylons except the closest are connected to the network, and whenever I click on the repeaters at my base (and I assume all of the others that aren't connected) I get the first error again.
also, modlist http://pastebin.com/PmXbZhVJ