Quillraven / Fleks

Fast, lightweight, multi-platform entity component system in Kotlin
MIT License

Version 2.4 #106

Closed · Quillraven closed this 1 year ago

Quillraven commented 1 year ago

This version contains some breaking changes which add more flexibility to Fleks and also keep it up to date with other technologies. The most important change is the requirement of Java 11 instead of Java 8. This seems to be safe nowadays, even for Android. LibGDX users, please use Gdx-Liftoff as your setup tool; it already creates proper Java 11 projects.

Changelog:

For more information about the new functionality, please check out the updated wiki.


@jobe-m: as always, please let me know if everything is working in your KorGE example :)

@geist-2501: Shall we rename onAddComponent and onRemoveComponent to onAdd and onRemove? Since they are functions of a component, I think the Component part of the name is redundant. What do you think? I can add the change to this PR.
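For illustration, a rough sketch of how the renamed hooks could look on a component. The World receiver, the Entity parameter, and the Health component here are assumptions for this example, not confirmed 2.4 API; check the updated wiki for the authoritative signatures.

import com.github.quillraven.fleks.Component
import com.github.quillraven.fleks.ComponentType
import com.github.quillraven.fleks.Entity
import com.github.quillraven.fleks.World

// Hypothetical component using the proposed shorter hook names.
// The exact signatures are assumed here and may differ from the final 2.4 API.
data class Health(var amount: Int = 100) : Component<Health> {
    override fun type() = Health

    companion object : ComponentType<Health>()

    // runs when the component gets added to an entity (previously onAddComponent)
    override fun World.onAdd(entity: Entity) {
        println("Health added to $entity")
    }

    // runs when the component gets removed from an entity (previously onRemoveComponent)
    override fun World.onRemove(entity: Entity) {
        println("Health removed from $entity")
    }
}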

And btw, the wiki is not yet updated ;) Will do that when the version goes live.

Quillraven commented 1 year ago

Btw, the publish workflow seems to get triggered even for PRs. I need to have a look at that, but I think I already found a solution for it.

geist-2501 commented 1 year ago

@Quillraven yeah that sounds like a sensible rename!

jobe-m commented 1 year ago

@Quillraven Yes, I will. :)

Quillraven commented 1 year ago

@aSemy: any idea why the "Publish" workflow gets triggered for PRs? I thought the if condition in the job should prevent that? Obviously, I am wrong :D

aSemy commented 1 year ago

> @aSemy: any idea why the "Publish" workflow gets triggered for PRs? I thought the if condition in the job should prevent that? Obviously, I am wrong :D

Ooh that's not very good! I don't know why that might be, I thought the condition would work too! I'll take a look...

EDIT: see #108

jobe-m commented 1 year ago

OK from my side: the new onAdd and onRemove component functions work for me :)

Currently in KorGE-Fleks I cannot use data objects from Kotlin 1.9 since kproject sets up the KorGE project with 1.8.x. But that should not be a problem for Fleks. We just need to update KorGE's kproject to also support 1.9.

Quillraven commented 1 year ago

> OK from my side: the new onAdd and onRemove component functions work for me :)
>
> Currently in KorGE-Fleks I cannot use data objects from Kotlin 1.9 since kproject sets up the KorGE project with 1.8.x. But that should not be a problem for Fleks. We just need to update KorGE's kproject to also support 1.9.

Great, thanks for checking! It is interesting to me that you have problems with data objects. I will check that myself on the weekend in one of my projects. If this is not backwards compatible, shall we stick with the 1.8.x version for now?

jobe-m commented 1 year ago

data object was recently introduced with Kotlin 1.9. It shouldn't be a big deal to update KorGE to 1.9; Soywiz usually adopts new Kotlin versions quickly in KorGE.
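For context, a minimal standalone illustration of the feature in question (plain Kotlin, not Fleks-specific code):

// Kotlin 1.9+: a data object gets a readable toString() ("Paused") plus
// generated equals()/hashCode(), which is handy for tag-like singletons.
data object Paused

// On Kotlin 1.8.x the data modifier is rejected by the compiler; the closest
// equivalent is a plain object, whose default toString() prints something
// like "PausedLegacy@1b6d3586".
object PausedLegacy

fun main() {
    println(Paused)        // Paused
    println(PausedLegacy)  // PausedLegacy@<hash>
}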

But it would be nice to stay backwards compatible with Kotlin 1.8.x for some time. Maybe you can create a branch 2.4-kotlin1.8.xx after 2.4 is released which does not use data object. You don't need to change anything else, since in KorGE-Fleks we use a local Gradle config to build Fleks.

Quillraven commented 1 year ago

@jobe-m: What kind of error are you actually getting? I just tried it in a project with Kotlin 1.8.22 and I can run it just fine.

jobe-m commented 1 year ago

I am getting the error "The feature 'data objects' is only available since language version 1.9". Are you sure that you also built Fleks with Kotlin 1.8, or did you only build the project with 1.8 while Fleks was built with 1.9?

Quillraven commented 1 year ago

> I am getting the error "The feature 'data objects' is only available since language version 1.9". Are you sure that you also built Fleks with Kotlin 1.8, or did you only build the project with 1.8 while Fleks was built with 1.9?

I don't build Fleks at all. I just add it as a dependency to my projects. I use 2.4-SNAPSHOT from Maven.

Building Fleks itself of course won't work with 1.8, but why do you have to build it?

jobe-m commented 1 year ago

With Soywiz's kproject we can add external modules as sources to a KorGE project. The benefit is that Fleks can be used on all the fancy (homebrew) platforms that KorGE is available for. Thus Fleks is built with the build configuration of KorGE.

jobe-m commented 1 year ago

Please don't wait with 2.4 for me. Unfortunately I am currently heavily loaded with private family stuff while also trying to survive at work. I won't have much time in the next months, so my progress will be rather slow.

Quillraven commented 1 year ago

Ah, understood - I didn't know about the KorGE setup, but then it makes sense. In this case, as you suggested, I will make a 2.4_8 branch with Kotlin 1.8 after the release.

And I totally understand the family and work stuff - I have similar problems at the moment ;)

Anyway, thanks for the info and updates! I will then release 2.4 this weekend.

tommyettinger commented 1 year ago

Thanks for recommending gdx-liftoff! Would you like Fleks added to the third-party extensions in Liftoff? I don't think you need to do anything.

I decided to run the benchmark with lower entity and update counts (each 1/10 of what they are now in the repo), and I got somewhat different results. Not having "fractionally complete" results (less than 1.0 throughput) for the complex benchmark helps both artemis and, at least a little, ashley.

I ran this with Java 20 on a slightly newer laptop processor than the one used for the listed benchmarks, and Fleks' add and remove performance here is vastly better than any competitor (225% more throughput than artemis-odb); I don't know why exactly. It's a tiny bit slower on both simple and complex with the smaller entity count, which suggests there might be an algorithmic complexity difference that makes fleks relatively faster on larger entity collections. Artemis is quite a bit faster on the complex benchmark with the smaller entity count (68% more throughput than fleks).

$ C:\d\jvm\jdk20-hotspot\bin\java -jar Fleks-jvmBenchmarks-jmh-2.5-SNAPSHOT-JMH.jar -w 10 -r 10 -wi 10 -i 10 -t 1 -f 1 -gc true
...
Benchmark                    Mode  Cnt      Score     Error  Units
ArtemisBenchmark.addRemove  thrpt   10   4071.356 ± 337.725  ops/s
ArtemisBenchmark.complex    thrpt   10    178.795 ±   3.415  ops/s
ArtemisBenchmark.simple     thrpt   10   9047.583 ±  62.516  ops/s
AshleyBenchmark.addRemove   thrpt   10   2269.431 ±  42.238  ops/s
AshleyBenchmark.complex     thrpt   10     16.279 ±   0.199  ops/s
AshleyBenchmark.simple      thrpt   10   2086.183 ±  68.274  ops/s
FleksBenchmark.addRemove    thrpt   10  13264.218 ±  34.791  ops/s
FleksBenchmark.complex      thrpt   10    106.155 ±   2.879  ops/s
FleksBenchmark.simple       thrpt   10   7433.124 ±  64.739  ops/s

This only shows that Artemis is faster than fleks for the simple and complex cases with 100 entities; it isn't ever faster on addRemove that I can see. Whatever you're doing for addRemove is great, and if there's a way to do something similar with what simple and complex use, they would probably outpace Artemis in more cases.

I initially ran the benchmark because I wanted to make sure the methodology was sound, but the benchmark code looks solid. Benchmarking is such a tricky thing to get right because there are so many factors involved...

EDIT: I also ran the benchmarks with Shenandoah GC enabled, to see if garbage collection might affect things.

$ C:\d\jvm\jdk20-hotspot\bin\java -XX:+UseShenandoahGC -jar Fleks-jvmBenchmarks-jmh-2.5-SNAPSHOT-JMH.jar -w 10 -r 10 -wi 10 -i 10 -t 1 -f 1 -gc true
...
Benchmark                    Mode  Cnt      Score     Error  Units
ArtemisBenchmark.addRemove  thrpt   10   3660.630 ± 596.098  ops/s
ArtemisBenchmark.complex    thrpt   10    124.183 ±   1.215  ops/s
ArtemisBenchmark.simple     thrpt   10   8220.601 ± 240.047  ops/s
AshleyBenchmark.addRemove   thrpt   10   1770.910 ±   8.321  ops/s
AshleyBenchmark.complex     thrpt   10     14.689 ±   0.122  ops/s
AshleyBenchmark.simple      thrpt   10   2154.914 ±  61.261  ops/s
FleksBenchmark.addRemove    thrpt   10  11708.724 ± 107.808  ops/s
FleksBenchmark.complex      thrpt   10    101.074 ±   0.912  ops/s
FleksBenchmark.simple       thrpt   10   6607.135 ±  48.420  ops/s

Shenandoah GC seems to hurt performance here, which surprised me.

Quillraven commented 1 year ago

@tommyettinger: funny coincidence, because I wanted to open an issue for gdx-liftoff soon and ask if you could add this library. To answer your first question: yes please, it would be great if it can be added directly :)

Regarding performance: I am personally confused by the performance measurements of JVM languages. On my notebook (AMD processor) Fleks is always faster in the addRemove and simple benchmarks, and in the complex one it is close. On my PC (old Intel CPU), which was used for the benchmarks in the README.md, Fleks is still faster with addRemove, but in the simple benchmark it is random: sometimes faster, sometimes slower.

I don't understand it 100%, but in the end I didn't care enough to investigate it further. What was important for me is that it is faster than Ashley and relatively close to Artemis.

To give a little bit of insight:

tommyettinger commented 1 year ago

I can run some more benchmarks later tonight, including ones that take hours if needed. I think the seemingly random aspect of the benchmarks has a lot to do with the OS, any existing processes running on the machine, how much time the JIT compiler has had to get things ready, etc. It's best to reduce the amount of error by running more iterations and for longer (I used 10 seconds; ideally I would try a minute per iteration when using many entities). I suspect GC may be a factor here because switching to Shenandoah had a (negative) performance effect, though it seemed to hurt all the benchmarks.

BitArrays are amazing data structures and I love them. 😄 If you aren't currently using bitwise operations to do steps in bulk, that would be something to consider, though I imagine you are probably doing things like masking heavily already. I suspect the conversion to a bag is the main slow point there, or any GC needed to clean up temporary bags or bit arrays. It might be worth considering iterating while still in a BitArray -- if each block of bits is stored as a Long, at least on the JVM there are some very fast intrinsic candidates for working with that Long and its bits. If you want to get the lowest set bit in num, that doesn't even need a method call; it's just num and -num (I'm not 100% sure of the Kotlin syntax here; and is the bitwise AND operator).
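For reference, the Kotlin spelling of those tricks as a small standalone snippet (not Fleks code):

fun main() {
    var word = 0b1011_0100L

    // `and` is Kotlin's bitwise AND operator, so the two's-complement trick
    // mentioned above is literally `word and -word`: it isolates the lowest set bit.
    val lowestBit = word and -word                  // 0b100 = 4
    val lowestIndex = word.countTrailingZeroBits()  // 2

    // Visit only the bits that are actually set (in ascending order) instead of
    // checking all 64 positions: clear the lowest set bit on every iteration.
    while (word != 0L) {
        println("set bit at index ${word.countTrailingZeroBits()}")
        word = word and (word - 1)
    }
    println("lowest bit value=$lowestBit, index=$lowestIndex")
}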

Quillraven commented 1 year ago

It would be interesting to find out which code segments take up the most time in the simple/complex benchmarks and go from there. Not sure if your way of benchmarking can do that. The latest IntelliJ version contains benchmark inlay infos, but I don't have experience with them yet. Might be useful as well ;)

My assumption is that the following methods are the bottlenecks (family.kt and bitArray.kt):

    /**
     * Returns true if the specified [compMask] matches the family's component configuration.
     *
     * @param compMask the component configuration of an [entity][Entity].
     */
    internal operator fun contains(compMask: BitArray): Boolean {
        return (allOf == null || compMask.contains(allOf))
            && (noneOf == null || !compMask.intersects(noneOf))
            && (anyOf == null || compMask.intersects(anyOf))
    }
    /**
     * Updates this family if needed and runs the given [action] for all [entities][Entity].
     *
     * **Important note**: There is a potential risk when iterating over entities and one of those entities
     * gets removed. Removing the entity immediately and cleaning up its components could
     * cause problems because if you access a component which is mandatory for the family, you will get
     * a FleksNoSuchComponentException. To avoid that you could check if an entity really has the component
     * before accessing it but that is redundant in context of a family.
     *
     * To avoid these kinds of issues, entity removals are delayed until the end of the iteration. This also means
     * that a removed entity of this family will still be part of the [action] for the current iteration.
     */
    inline fun forEach(crossinline action: Family.(Entity) -> Unit) {
        // Access entities before 'forEach' call to properly update them.
        // Check mutableEntities getter for more details.
        val entitiesForIteration = mutableEntities

        if (!entityService.delayRemoval) {
            entityService.delayRemoval = true
            isIterating = true
            entitiesForIteration.forEach { action(it) }
            isIterating = false
            entityService.cleanupDelays()
        } else {
            isIterating = true
            entitiesForIteration.forEach { this.action(it) }
            isIterating = false
        }
    }

// --> which also includes a call to

    // This bag is added in addition to the BitArray for better iteration performance.
    @PublishedApi
    internal val mutableEntities = MutableEntityBag()
        get() {
            if (isDirty && !isIterating) {
                // no iteration in process -> update entities if necessary
                isDirty = false
                entityBits.toEntityBag(field)
            }
            return field
        }
// bitArray.kt

    fun intersects(other: BitArray): Boolean {
        val otherBits = other.bits
        val start = min(bits.size, otherBits.size) - 1
        for (i in start downTo 0) {
            if ((bits[i] and otherBits[i]) != 0L) {
                return true
            }
        }
        return false
    }

    fun contains(other: BitArray): Boolean {
        val otherBits = other.bits

        // check if other BitArray is larger and if there is any of those bits set
        for (i in bits.size until otherBits.size) {
            if (otherBits[i] != 0L) {
                return false
            }
        }

        // check overlapping bits
        val start = min(bits.size, otherBits.size) - 1
        for (i in start downTo 0) {
            if ((bits[i] and otherBits[i]) != otherBits[i]) {
                return false
            }
        }

        return true
    }

    /**
     * Returns the logical size of the [BitArray] which is equal to the highest index of the
     * bit that is set.
     *
     * Returns zero if the [BitArray] is empty.
     */
    fun length(): Int {
        for (word in bits.size - 1 downTo 0) {
            val bitsAtWord = bits[word]
            if (bitsAtWord != 0L) {
                val w = word shl 6

                for (bit in 63 downTo 0) {
                    if ((bitsAtWord and (1L shl bit)) != 0L) {
                        return w + bit + 1
                    }
                }
            }
        }
        return 0
    }

    /**
     * Returns number of bits that are set in the [BitArray].
     */
    fun numBits(): Int {
        var sum = 0
        for (word in bits.size - 1 downTo 0) {
            val bitsAtWord = bits[word]
            if (bitsAtWord != 0L) {
                for (bit in 63 downTo 0) {
                    if ((bitsAtWord and (1L shl bit)) != 0L) {
                        sum++
                    }
                }
            }
        }
        return sum
    }

    inline fun forEachSetBit(action: (Int) -> Unit) {
        // it is important that we go from right to left because
        // otherwise some code in toEntityBag will fail with ensureCapacity.
        for (word in bits.size - 1 downTo 0) {
            val bitsAtWord = bits[word]
            if (bitsAtWord != 0L) {
                val w = word shl 6

                for (bit in 63 downTo 0) {
                    if ((bitsAtWord and (1L shl bit)) != 0L) {
                        action(w + bit)
                    }
                }
            }
        }
    }

    fun toEntityBag(bag: MutableEntityBag) {
        var checkSize = true
        bag.clear()
        forEachSetBit { idx ->
            if (checkSize) {
                checkSize = false
                // this is working because forEachSetBit goes
                // from right to left, so idx is the highest index here
                bag.ensureCapacity(idx)
            }
            bag += Entity(idx)
        }
    }

I had the idea of caching the result of the contains call inside a family, but that requires a fast hashCode function for the BitArray. The current version is pretty slow, and the cache then slows down performance :D At least in a quick & dirty try-out that I did.
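As one possible direction, purely a sketch against the LongArray shown above (not the current Fleks code): a word-based hashCode that ignores trailing zero words, so logically equal BitArrays with different backing-array capacities produce the same hash:

// Sketch only: hash the significant words of the backing LongArray.
// Trailing zero words are skipped so that capacity differences do not
// change the hash, mirroring how equals() would have to behave.
fun hashCodeOf(bits: LongArray): Int {
    var lastWord = bits.size - 1
    while (lastWord >= 0 && bits[lastWord] == 0L) {
        lastWord--
    }

    var hash = 1
    for (i in 0..lastWord) {
        val word = bits[i]
        hash = 31 * hash + (word xor (word ushr 32)).toInt()
    }
    return hash
}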

tommyettinger commented 1 year ago

There are probably a few places in BitArray that could be sped up. The bit counting method should probably use https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/count-one-bits.html, since that is more likely to use the JVM intrinsic when available or the popcnt instruction in some way from native code. Most of the per-bit loops currently do 64 iterations regardless of how many bits are set, so for sparse BitArrays they may check their inner conditional far more times than necessary. I'm not especially familiar with idiomatic Kotlin style, but I'd suggest looking for existing high-performance BitSet/BitArray implementations in Kotlin and seeing how they do it in the source code. The hashCode() question is an interesting one; contains() seems to be implemented quite well already, so optimizing it could be a challenge (as you found).
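To make that concrete, a rough standalone sketch against a plain LongArray (the helper names are made up; note the set-bit loop visits bits in ascending order, so toEntityBag's ensureCapacity trick would need a prior length() call or a similar adjustment):

// Sketch: popcount-based bit counting and a set-bit loop that only touches
// bits that are actually set, instead of checking all 64 positions per word.
fun numBitsOf(bits: LongArray): Int {
    var sum = 0
    for (word in bits) {
        // countOneBits() maps to Long.bitCount / popcnt on the JVM
        sum += word.countOneBits()
    }
    return sum
}

inline fun forEachSetBitOf(bits: LongArray, action: (Int) -> Unit) {
    for (wordIdx in bits.indices) {
        var word = bits[wordIdx]
        val offset = wordIdx shl 6
        while (word != 0L) {
            action(offset + word.countTrailingZeroBits())
            word = word and (word - 1) // clear the lowest set bit
        }
    }
}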

I think I'll tinker with bitArray.kt some, and submit a PR if I see any performance gains.