Closed by Quillraven 1 year ago
Btw the publish workflow seems to get triggered even for PRs. Need to have a look at that but I think I already found a solution for it.
@Quillraven yeah that sounds like a sensible rename!
@Quillraven Yes, I will. :)
@aSemy : any idea why "Publish" workflow gets triggered for PRs? I thought the if condition in the job should prevent that? Obviously, I am wrong :D
Ooh that's not very good! I don't know why that might be, I thought the condition would work too! I'll take a look...
EDIT: see #108
Ok, from my side: the new `onAdd` and `onRemove` component functions work for me :)
Currently in KorGE-Fleks I cannot use the `data object`s from Kotlin 1.9 since kproject sets up the KorGE project with 1.8.x. But that should not be a problem for Fleks. We just need to update KorGE's kproject to also support 1.9.
Great, thanks for checking! It is interesting to me that you have problems with `data object`s. I will check that myself on the weekend in one of my projects. If this is not backwards compatible, shall we stick with the 1.8.xx version for now?
`data object` was lately introduced with Kotlin 1.9. Shouldn't be a big deal to update KorGE to 1.9; Soywiz usually adapts new Kotlin versions quickly in KorGE.
But it would be nice to be backwards compatible with Kotlin 1.8.xx just for some time. Maybe you can do a branch `2.4-kotlin1.8.xx` after 2.4 is released which does not use `data object`. The rest you do not need to change, since in KorGE-Fleks we use a local Gradle config to build Fleks.
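For anyone stuck on 1.8.x in the meantime: a `data object` can usually be replaced by a plain `object` with a hand-written `toString()`, since the generated `toString`/`equals`/`hashCode` are essentially all that `data object` adds. A minimal sketch (the `Component` interface and `Jump` name below are stand-ins, not the actual Fleks types):

```kotlin
// Stand-in for a component marker interface (NOT the real Fleks API).
interface Component<T>

// Kotlin 1.9 style would be:  data object Jump : Component<Jump>
// Kotlin 1.8-compatible fallback: plain object + manual toString().
object Jump : Component<Jump> {
    override fun toString(): String = "Jump"
}

fun main() {
    // Behaves like the data object for printing/logging purposes.
    check(Jump.toString() == "Jump")
    println(Jump) // prints "Jump"
}
```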
@jobe-m: What kind of error are you actually getting? I just tried it in a project with Kotlin 1.8.22 and I can run it just fine.
I am getting `The feature "data objects" is only available since language version 1.9`.
Are you sure that you built Fleks also with Kotlin 1.8, or did you only build the project with 1.8 while Fleks is built with 1.9?
I don't build Fleks at all. I just add it as a dependency to my projects; I use 2.4-SNAPSHOT from Maven.
Building Fleks of course won't work with 1.8, but why do you have to build it?
With Soywiz's kproject we can add external modules as source dependencies to a KorGE project. The benefit is that Fleks can be used on all the fancy (homebrew) platforms which KorGE is available for. Thus Fleks is built with the build configuration of KorGE.
Please don't wait with 2.4 for me. I am unfortunately heavily loaded with private family stuff at the moment while also trying to survive at work. I won't have much time left in the next months, so my progress will be rather slow.
Ah understood - didn't know about the KorGE setup but then it makes sense. In this case, as you suggested, I will make a 2.4_8 branch with Kotlin 1.8 after the release.
And I totally understand the family and work stuff - I have similar problems at the moment ;)
Anyway, thanks for the info and updates! I will then release 2.4 this weekend.
Thanks for recommending gdx-liftoff! Would you like Fleks added to the third-party extensions in Liftoff? I don't think you need to do anything.
I decided to run the benchmark with lower entity count and update count (each 1/10 of what is now in the repo), and I got somewhat different results. Not having "fractionally complete" results (less than 1.0 throughput) for the complex benchmark helps both Artemis and, at least a little, Ashley. I ran this with Java 20 on a slightly newer laptop processor than the one used for the listed benchmarks, and Fleks' add and remove performance here is vastly better than any competitor (225% more throughput than artemis-odb); I don't know why exactly. It's a tiny bit slower on both simple and complex with the smaller entity count, which suggests there might be an algorithmic complexity difference that makes Fleks relatively faster on larger entity collections. Artemis is quite a bit faster on the complex benchmark with the smaller entity count (68% more throughput than Fleks).
$ C:\d\jvm\jdk20-hotspot\bin\java -jar Fleks-jvmBenchmarks-jmh-2.5-SNAPSHOT-JMH.jar -w 10 -r 10 -wi 10 -i 10 -t 1 -f 1 -gc true
...
Benchmark                    Mode  Cnt      Score     Error  Units
ArtemisBenchmark.addRemove  thrpt   10   4071.356 ±  337.725  ops/s
ArtemisBenchmark.complex    thrpt   10    178.795 ±    3.415  ops/s
ArtemisBenchmark.simple     thrpt   10   9047.583 ±   62.516  ops/s
AshleyBenchmark.addRemove   thrpt   10   2269.431 ±   42.238  ops/s
AshleyBenchmark.complex     thrpt   10     16.279 ±    0.199  ops/s
AshleyBenchmark.simple      thrpt   10   2086.183 ±   68.274  ops/s
FleksBenchmark.addRemove    thrpt   10  13264.218 ±   34.791  ops/s
FleksBenchmark.complex      thrpt   10    106.155 ±    2.879  ops/s
FleksBenchmark.simple       thrpt   10   7433.124 ±   64.739  ops/s
This only shows that Artemis is faster than Fleks for the simple and complex cases with 100 entities; it isn't ever faster on addRemove that I can see. Whatever you're doing for addRemove is great, and if there's a way to do something similar with what simple and complex use, they would probably outpace Artemis in more cases.
I initially ran the benchmark because I wanted to make sure the methodology was sound, but the benchmark code looks solid. Benchmarking is such a tricky thing to get right because there are so many factors involved...
EDIT: I also ran the benchmarks with Shenandoah GC enabled, to see if garbage collection might affect things.
$ C:\d\jvm\jdk20-hotspot\bin\java -XX:+UseShenandoahGC -jar Fleks-jvmBenchmarks-jmh-2.5-SNAPSHOT-JMH.jar -w 10 -r 10 -wi 10 -i 10 -t 1 -f 1 -gc true
...
Benchmark                    Mode  Cnt      Score     Error  Units
ArtemisBenchmark.addRemove  thrpt   10   3660.630 ±  596.098  ops/s
ArtemisBenchmark.complex    thrpt   10    124.183 ±    1.215  ops/s
ArtemisBenchmark.simple     thrpt   10   8220.601 ±  240.047  ops/s
AshleyBenchmark.addRemove   thrpt   10   1770.910 ±    8.321  ops/s
AshleyBenchmark.complex     thrpt   10     14.689 ±    0.122  ops/s
AshleyBenchmark.simple      thrpt   10   2154.914 ±   61.261  ops/s
FleksBenchmark.addRemove    thrpt   10  11708.724 ±  107.808  ops/s
FleksBenchmark.complex      thrpt   10    101.074 ±    0.912  ops/s
FleksBenchmark.simple       thrpt   10   6607.135 ±   48.420  ops/s
Shenandoah GC seems to hurt performance here, which surprised me.
@tommyettinger: funny coincidence, because I wanted to open an issue for gdx-liftoff soon and ask if you could add this library. To answer your first question: yes please, it would be great if it can be added directly :)
Regarding performance: I am personally confused by the performance measurement of JVM languages. On my notebook (AMD processor) Fleks is always faster in the addRemove and simple benchmarks, and in the complex one it is close. On my PC (old Intel CPU), which was used for the benchmarks in the README.md, Fleks is still faster with addRemove, but the simple benchmark is inconsistent: sometimes faster, sometimes slower.
I don't get it 100%, but in the end I didn't care that much to investigate and understand it. Important for me was that it is faster than Ashley and relatively close to Artemis.
To give a little bit of insight:
- There is still a `1.6-JVM` branch available.
- A `bag` is used for faster iteration.
- The `contains` function check: since that data is based on components, which have a fixed size, it might be possible to speed that up.
- The `Archetype` approach in ECS, where you can basically skip the entire `contains` function of the family. In that approach, if I understood it correctly, you use more memory because a component gets added multiple times to the different archetypes it belongs to, but it has the performance advantage that the expensive `contains` checking is no longer necessary, and the conversion from BitArray to Bag is also no longer needed. This approach however requires a bigger change in the library, I guess, BUT it will be a lot faster with the downside of using more memory.
- `Unsafe` calls, which are not possible in Kotlin multiplatform afaik. I think it also uses the `Archetype` approach. I personally don't like the syntax of that library, but it is for sure worth it to check why it is so fast (if the claims are true ;) ).

I can run some more benchmarks later tonight, including ones that take hours if needed. I think the random-seeming aspect of the benchmarks has a lot to do with the OS, any existing processes running on the machine, how much time the JIT compiler has had to get things ready, etc. It's best to reduce the amount of error by running more iterations and for longer (I used 10 seconds; ideally I would try a minute per iteration when using many entities). I suspect GC may be a factor here because switching to Shenandoah had a (negative) performance effect, though it seemed to hurt all the benchmarks.
BitArrays are amazing data structures and I love them. 😄 If you aren't currently using bitwise operations to do steps in bulk, that would be something to consider, though I imagine you probably are doing things like masking heavily already. I suspect the conversion to a bag would be the main slow point there, or any GC that needs to be done to clean up temporary bags or bit arrays. It might be worth it to consider iterating while still in a BitArray -- if each block of bits is stored as a `Long`, at least on the JVM there are some very fast intrinsic candidates to work with that `Long` and its bits. If you want to get the lowest bit set in `num`, that doesn't even need a method call; it's just `num and -num` (I'm not 100% sure of the Kotlin syntax here; `and` is the bitwise AND operator).
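For reference, the trick mentioned above does work in Kotlin as written; a small sketch (the function names are just illustrative), with the stdlib's `countTrailingZeroBits()` added since it maps to a JVM intrinsic and gives the bit index directly:

```kotlin
// Isolate the lowest set bit of a Long: only that bit remains set.
// `and` is Kotlin's infix bitwise AND operator.
fun lowestSetBit(num: Long): Long = num and -num

// 0-based index of the lowest set bit, via the stdlib function
// (compiles down to Long.numberOfTrailingZeros on the JVM).
fun lowestSetBitIndex(num: Long): Int = num.countTrailingZeroBits()

fun main() {
    val x = 0b101_1000L          // bits 3, 4 and 6 are set
    check(lowestSetBit(x) == 0b000_1000L)  // isolates bit 3, i.e. value 8
    check(lowestSetBitIndex(x) == 3)
    println("ok")
}
```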
Would be interesting to find out the code segments that take up the most time in simple/complex benchmarks and go from there. Not sure if your way of benchmarking can do that. The latest IntelliJ version contains benchmark inlay infos but I don't have experience with them yet. But might be useful as well ;)
My assumption is that the following methods are the bottlenecks (family.kt and bitArray.kt):
/**
* Returns true if the specified [compMask] matches the family's component configuration.
*
* @param compMask the component configuration of an [entity][Entity].
*/
internal operator fun contains(compMask: BitArray): Boolean {
return (allOf == null || compMask.contains(allOf))
&& (noneOf == null || !compMask.intersects(noneOf))
&& (anyOf == null || compMask.intersects(anyOf))
}
/**
* Updates this family if needed and runs the given [action] for all [entities][Entity].
*
* **Important note**: There is a potential risk when iterating over entities and one of those entities
* gets removed. Removing the entity immediately and cleaning up its components could
* cause problems because if you access a component which is mandatory for the family, you will get
* a FleksNoSuchComponentException. To avoid that you could check if an entity really has the component
* before accessing it but that is redundant in context of a family.
*
* To avoid these kinds of issues, entity removals are delayed until the end of the iteration. This also means
* that a removed entity of this family will still be part of the [action] for the current iteration.
*/
inline fun forEach(crossinline action: Family.(Entity) -> Unit) {
// Access entities before 'forEach' call to properly update them.
// Check mutableEntities getter for more details.
val entitiesForIteration = mutableEntities
if (!entityService.delayRemoval) {
entityService.delayRemoval = true
isIterating = true
entitiesForIteration.forEach { action(it) }
isIterating = false
entityService.cleanupDelays()
} else {
isIterating = true
entitiesForIteration.forEach { this.action(it) }
isIterating = false
}
}
// --> which also includes a call to the following mutableEntities getter
// This bag is added in addition to the BitArray for better iteration performance.
@PublishedApi
internal val mutableEntities = MutableEntityBag()
get() {
if (isDirty && !isIterating) {
// no iteration in process -> update entities if necessary
isDirty = false
entityBits.toEntityBag(field)
}
return field
}
// bitArray.kt
fun intersects(other: BitArray): Boolean {
val otherBits = other.bits
val start = min(bits.size, otherBits.size) - 1
for (i in start downTo 0) {
if ((bits[i] and otherBits[i]) != 0L) {
return true
}
}
return false
}
fun contains(other: BitArray): Boolean {
val otherBits = other.bits
// check if other BitArray is larger and if there is any of those bits set
for (i in bits.size until otherBits.size) {
if (otherBits[i] != 0L) {
return false
}
}
// check overlapping bits
val start = min(bits.size, otherBits.size) - 1
for (i in start downTo 0) {
if ((bits[i] and otherBits[i]) != otherBits[i]) {
return false
}
}
return true
}
/**
* Returns the logical size of the [BitArray] which is equal to the highest index of the
* bit that is set.
*
* Returns zero if the [BitArray] is empty.
*/
fun length(): Int {
for (word in bits.size - 1 downTo 0) {
val bitsAtWord = bits[word]
if (bitsAtWord != 0L) {
val w = word shl 6
for (bit in 63 downTo 0) {
if ((bitsAtWord and (1L shl bit)) != 0L) {
return w + bit + 1
}
}
}
}
return 0
}
/**
* Returns number of bits that are set in the [BitArray].
*/
fun numBits(): Int {
var sum = 0
for (word in bits.size - 1 downTo 0) {
val bitsAtWord = bits[word]
if (bitsAtWord != 0L) {
for (bit in 63 downTo 0) {
if ((bitsAtWord and (1L shl bit)) != 0L) {
sum++
}
}
}
}
return sum
}
inline fun forEachSetBit(action: (Int) -> Unit) {
// it is important that we go from right to left because
// otherwise some code in toEntityBag will fail with ensureCapacity.
for (word in bits.size - 1 downTo 0) {
val bitsAtWord = bits[word]
if (bitsAtWord != 0L) {
val w = word shl 6
for (bit in 63 downTo 0) {
if ((bitsAtWord and (1L shl bit)) != 0L) {
action(w + bit)
}
}
}
}
}
fun toEntityBag(bag: MutableEntityBag) {
var checkSize = true
bag.clear()
forEachSetBit { idx ->
if (checkSize) {
checkSize = false
// this is working because forEachSetBit goes
// from right to left, so idx is the highest index here
bag.ensureCapacity(idx)
}
bag += Entity(idx)
}
}
I had an idea of caching the result of the `contains` call inside a family, but that requires a fast `hashCode` function of the `BitArray`. The current version is pretty slow and the cache then slows down the performance :D At least in a quick&dirty try-out that I had.
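If the caching idea ever gets revisited: a word-wise hash over the backing `LongArray` is about as cheap as it gets. One subtlety is skipping trailing zero words, so that logically equal BitArrays of different capacities hash the same. A sketch under those assumptions (`bitArrayHash` is a hypothetical helper; `bits` stands for the field from the snippets above):

```kotlin
// Hash only up to the last non-zero word so that capacity (array length)
// does not influence the result -- equal bit sets must hash equally.
fun bitArrayHash(bits: LongArray): Int {
    var lastWord = bits.size - 1
    while (lastWord >= 0 && bits[lastWord] == 0L) lastWord--
    var h = 1
    for (i in 0..lastWord) {
        val w = bits[i]
        // Fold each 64-bit word into 32 bits, java.util.Arrays.hashCode-style.
        h = 31 * h + (w xor (w ushr 32)).toInt()
    }
    return h
}

fun main() {
    // Same logical bits, different capacities -> same hash.
    check(bitArrayHash(longArrayOf(0b1011L, 0L, 0L)) == bitArrayHash(longArrayOf(0b1011L)))
    // Different bits -> (here) different hashes.
    check(bitArrayHash(longArrayOf(0b1L)) != bitArrayHash(longArrayOf(0b10L)))
    println("ok")
}
```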
There's probably a few places in BitArray that could be sped up. The bit counting method should probably use https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/count-one-bits.html , since that is more likely to use the JVM intrinsic when available, or the `popcnt` instruction in some way from native code. Most of the per-bit loops currently do 64 iterations regardless of how many bits are set, so for sparse BitArrays they may check their inner conditional far more times than necessary. I'm not especially familiar with idiomatic Kotlin style, but I'd suggest looking for existing high-performance BitSet/BitArray implementations in Kotlin and seeing how they do it in the source code. The `hashCode()` question is an interesting one; `contains()` seems to be implemented quite well already, so optimizing it could be a challenge (as you found).
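To illustrate the two suggestions above, here is a rough sketch of `numBits` using the stdlib builtin, plus a `forEachSetBit` variant whose cost scales with the number of set bits instead of 64 checks per word. `bits` stands for the backing `LongArray` from the snippets; note that the original iterates from the highest bit down because `toEntityBag` relies on that order for `ensureCapacity`, so this ascending version would need that call adjusted:

```kotlin
// Population count per word: Long.countOneBits() maps to
// java.lang.Long.bitCount, which JITs to popcnt where available.
fun numBits(bits: LongArray): Int {
    var sum = 0
    for (word in bits) sum += word.countOneBits()
    return sum
}

// Visit set bits (ascending) by repeatedly clearing the lowest set bit,
// so work is proportional to the number of set bits, not 64 per word.
inline fun forEachSetBit(bits: LongArray, action: (Int) -> Unit) {
    for (w in bits.indices) {
        var word = bits[w]
        val base = w shl 6 // bit index of this word's lowest slot
        while (word != 0L) {
            action(base + word.countTrailingZeroBits())
            word = word and (word - 1) // clear the lowest set bit
        }
    }
}

fun main() {
    val bits = longArrayOf(0b1001L, 1L shl 3) // bits 0, 3 and 67
    check(numBits(bits) == 3)
    val seen = mutableListOf<Int>()
    forEachSetBit(bits) { seen += it }
    check(seen == listOf(0, 3, 67))
    println("ok")
}
```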
I think I'll tinker with bitArray.kt some, and submit a PR if I see any performance gains.
This version contains some breaking changes which add more flexibility to Fleks and also keep it up to date with other technologies. The most important change is the requirement of Java 11 instead of 8. This seems to be safe nowadays, even for Android. For LibGDX users, please use Gdx-Liftoff as your setup tool, which already creates proper Java 11 projects.
Changelog:
- The `world` function got renamed to `configureWorld`. I got several comments on my YT videos that people got confused between `world` and `World` (constructor vs DSL function). To make it clear, it is now renamed to `configureWorld`.
- The order of the `configureWorld` functions no longer matters. Before, it was mandatory to call `systems` last. Now you are free to call `injectables`, `systems`, `family` in any order.
- The `@Serializable` annotation is now part of the entity and makes certain serialization topics easier. This was suggested by the KorGE community.
- An `EntityProvider` interface got added which allows you to provide your own entity ID logic. Per default it will still use the Fleks built-in recycle-ID logic. However, you can now define an `EntityProvider` inside the `configureWorld` function to use your own implementation. This should improve the usability for network gaming.
- `ComponentHook`s got replaced by normal component functions. The `Component` interface now has an `onAddComponent` and `onRemoveComponent` function which can be overridden to add custom functionality. Both functions run in the context of a `World`. This makes it easier to write such logic because you no longer need to add the `ComponentHook` to your world configuration. Thanks to @geist-2501 for the contribution!
- `Entity.configure` function (@aSemy)

For more information about the new functionality, please check out the updated wiki.
@jobe-m: as always, please let me know if everything is working in your KorGE example :)
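To illustrate the hook change (lifecycle functions on the component itself instead of separate `ComponentHook`s registered in the world configuration), here is a self-contained mock of the pattern. The types below (`Comp`, `ToyWorld`, `Sprite`) are toy stand-ins, not the actual Fleks API:

```kotlin
// Mock of the idea: add/remove hooks live on the component itself,
// so no separate hook registration in the world configuration is needed.
// NOT the real Fleks API -- just a toy model of the pattern.
interface Comp {
    fun onAddComponent() {}
    fun onRemoveComponent() {}
}

class ToyWorld {
    private val components = mutableListOf<Comp>()
    fun add(c: Comp) { components += c; c.onAddComponent() }
    fun remove(c: Comp) { components -= c; c.onRemoveComponent() }
}

val log = mutableListOf<String>()

class Sprite : Comp {
    override fun onAddComponent() { log += "sprite added" }
    override fun onRemoveComponent() { log += "sprite removed" }
}

fun main() {
    val world = ToyWorld()
    val sprite = Sprite()
    world.add(sprite)     // hook fires automatically on add
    world.remove(sprite)  // and on remove
    check(log == listOf("sprite added", "sprite removed"))
    println("ok")
}
```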
@geist-2501: Shall we rename `onAddComponent` and `onRemoveComponent` to `onAdd` and `onRemove`? Since they are functions of a component, I think the `Component` part of the name is redundant. What do you think? I can add the change to this PR.

And btw, the wiki is not yet updated ;) Will do that when the version goes live.