ARK-Builders / ARK-Navigator

Android app for navigation through your data
MIT License
15 stars 15 forks source link

#179: Create Rust lib crc32 compute_id #180

Closed sisco0 closed 2 years ago

sisco0 commented 2 years ago

Solves https://github.com/ARK-Builders/ARK-Navigator/issues/179 through Gitcoin.

A Rust-based solution has been developed to perform the computeId function as described in app/src/main/java/space/taran/arknavigator/mvp/model/repo/index/ResourceId.kt . This Rust library defined function (compute_id following Rust naming conventions) takes into account the same assertions and print the same messages as the Kotlin previous one.

Added to the aforementioned, the library build and testing processes have been added as a Github Actions workflow by using the latest Rust stable release. This workflow has been tested locally by using Act but cache functionality should be tested online.

Please, could you check the functionality is implemented as expected?

sisco0 commented 2 years ago

This PR should not be merged yet, as there is the requirement of going further with the integration and using JNI as stated in Rust Docs. I had an issue where the library is not found in the java library paths for some reason, so I would try to clarify this and hopefully, we could have the native fast .so implementation up and running.

kirillt commented 2 years ago

Thank you @sisco0! Yes, we need integration into the app itself, but this looks like a good start. I'll wait till it's embedded into APK and runs together and will review the code afterwards.

mdrlzy commented 2 years ago

Maybe this article will be useful.

sisco0 commented 2 years ago

Thanks for the base article @mdrlzy . The solution was updated for the current native function being integrated into the application. Also added the required dependencies under the Github Actions workflow.

Instrumented test ResourceMetaInstrumentedTest.kt was added as well, where you could run the crc32_iscorrect function for our purpose. Could you review the proposal @kirillt?

https://github.com/sisco0/ARK-Navigator/actions/runs/1674628493

kirillt commented 2 years ago

@sisco0 I ran complete reindexing of one of my huge folders (6181 images) using main version and using version from this PR, both 2 times. It revealed some performance degradation:

(before)
D/resources-index: resources metadata retrieved in 12496 milliseconds
...
D/resources-index: resources metadata retrieved in 9593 milliseconds
(now)
D/resources-index: resources metadata retrieved in 16812 milliseconds
...
D/resources-index: resources metadata retrieved in 13434 milliseconds

I guess, this is related to this line https://github.com/ARK-Builders/ARK-Navigator/pull/180/files#diff-6c1c14453a44202172b4eb08a1300781f34f4db2406cc83288cecf2c7511cc1eR20 1) loading of the library is (accidentally?) done twice; 2) if possible to load the lib once during the app startup, or lazily on first call to computeId, that should be done in order to achieve optimal performance.

Another minor issue is that NDK installation for local build should be either well documented or automated (maybe using Docker?), so the rest of the team would be capable of continuing development without dealing with Rust part. If this is not quick, then just documenting is fine.

Couple of other comments may come tomorrow regarding code itself, but the work is certainly going in right direction.

sisco0 commented 2 years ago

After some profiling the next results have been obtained. First, the application was ported to terminal as a normal Rust application to know about Debug and Release timings, with a lower bound of 0.07s for the Release configuration and 2.78s for Debug configuration, taking into account a ~880MB file.

❯ time target/release/cool
calculating hash of assets/remnux-v7-focal-virtualbox.ova.crdownload (size is 882 megabytes)
925253378 bytes has been read
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `2925612855`,
 right: `454892935`', src/lib.rs:50:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
target/release/cool  0.07s user 0.10s system 99% cpu 0.174 total
❯ time target/debug/cool
calculating hash of assets/remnux-v7-focal-virtualbox.ova.crdownload (size is 882 megabytes)
925253378 bytes has been read
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `2925612855`,
 right: `454892935`', src/lib.rs:50:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
target/debug/cool  2.78s user 0.10s system 99% cpu 2.882 total

After that, we got into the Android Studio and ran profiling of the App while recording CPU traces adding a folder with the same file, the next screenshot was taken showing 3.46s timing. android_profiling

The next step would be trying to go on with Callgrind somehow to understand which are native source code lines that are taking more time to execute under the Android JVM.

kirillt commented 2 years ago

Btw 0.7 sec in Release vs 2.8 sec in Debug is amazing. I think it's good idea to move arklib into separate repo, so Kotlin project would just pull Release build as dependency even when building the app in Debug. I also want to have CLI tool for miscellaneous tasks with arklib. These issues can be reserved for you, I would set up new contract for them.

sisco0 commented 2 years ago

Btw 0.7 sec in Release vs 2.8 sec in Debug is amazing. I think it's good idea to move arklib into separate repo, so Kotlin project would just pull Release build as dependency even when building the app in Debug. I also want to have CLI tool for miscellaneous tasks with arklib. These issues can be reserved for you, I would set up new contract for them.

Please, take into account that 0.07s is the timing for executing the Rust CRC32 function under the Desktop x86_64 Linux Environment where the library is built as release; Meanwhile, 3.46s (which approximately ~1.7s are idle time because of the JVM Task Scheduler) is the timing for executing the Rust CRC32 function under the JVM Android project where the library is built as release and the Android Project is built as debug. Both tests have been executed with a folder that only contains one ~880MB file. I would like to know if the behaviour is different when the Android project is built as a Signed Release, as I could not test that because of signing configurations.

That is the reason why I specify that we should have to profile in a deeper way the shared library functions that are executed so we could know about what bottlenecks we are having. We could move on to a newer NDK version and try to fix the code for that if required only when we are able to compile the project under the Release configuration (the library is already compiled under Release configuration but the Android App is not), and optionally a solid native shared library profiling setup is had by the use of SimplePerf.

Also note that we are not managing exceptions under the Rust library in a way that the application does not crash if any of these occur at the native (Rust code related) side.

kirillt commented 2 years ago

I would like to know if the behaviour is different when the Android project is built as a Signed Release, as I could not test that because of signing configurations.

To be honest, I've no idea about Signed Releases. Maybe @mdrlzy might be able to comment. I guess we haven't worked on it yet, since we aren't distributing via Play store yet. For now, it's alright to track plain APK performance.

Please, take into account that 0.07s is the timing for executing the Rust CRC32 function under the Desktop x86_64 Linux Environment where the library is built as release; Meanwhile, 3.46s (which approximately ~1.7s are idle time because of the JVM Task Scheduler) is the timing for executing the Rust CRC32 function under the JVM Android project where the library is built as release and the Android Project is built as debug.

Oh, I see. We will have chance to tune performance both on desktop and mobile. Profiling is always good.

kirillt commented 2 years ago

This is not run under the CI as instrumented tests are required, we could work on running Instrumented Tests remotely under the CI but we should ensure that our environment uses an approach such as the one in https://github.com/marketplace/actions/android-emulator-runner. Hopefully, we could a new workflow for this or modify our already present one. What do you think?

I got it, let's add such a workflow in next PR related to arklib.

sisco0 commented 2 years ago

@kirillt stated: To be honest, I've no idea about Signed Releases. Maybe @mdrlzy might be able to comment. I guess we haven't worked on it yet, since we aren't distributing via Play store yet. For now, it's alright to track plain APK performance.

What I mean with my statement about the benchmarks is that we should be able to compile our Android Application under a Release Configuration, as we are currently building it under a Debug configuration by default (The native library is built under a Release configuration). That process should be done by generating some PGP keys that could be set for each developer so he could sign the generated bundle and create a Release version of the application for measuring times, which we hope that we could obtain better performance of the Rust-based native library implementation under the Android Virtual Device (AVD). Summarizing in the next table:

Tested configuration Configuration to test
App build config Debug Release
Rust-based native library build config Release Release

@sisco0 stated: This is not run under the CI as instrumented tests are required, we could work on running Instrumented Tests remotely under the CI but we should ensure that our environment uses an approach such as the one in https://github.com/marketplace/actions/android-emulator-runner. Hopefully, we could a new workflow for this or modify our already present one. What do you think?

@kirillt stated: I got it, let's add such a workflow in next PR related to arklib.

We should open a new issue for this enhancement that contains the given URL as reference. Take into account that adding these instrumentation tests and macOS-based runners could induce to changes in the .yml Github Actions commands.

kirillt commented 2 years ago

Understood, thank you for the explanation.