Open sergeymorkovkin opened 1 year ago
Can you confirm the crashes still happen after #3852 – this crash was something we also notices during testing and #3852 is resolving that problem for us. We've actually been getting that crash much sooner than after 5 mins and with the increase number of circles it'd happen within few seconds, and after that patch it'd no longer occur.
Regarding BokehExample what needs to be mentioned is that it isn't really implemented optimally, it's never meant to be and in its current form it serves well as a benchmark. It is important for us to know what type of devices can run at what framerate. The most time consuming part in that example is due to the fact that circles manipulate layout properties. Because of that we need to recalculate layout on every single frame and in addition to that, this actually happen as many times as there are circles – reanimated isn't optimized for this type of scenario (yet?) and we don't do any batching of updates that require layouting (note that with other type of updates batching isn't really necessary)
The below diff does a relatively simple optimization that changes that example to use transforms instead of layout props:
diff --git a/FabricExample/src/BokehExample.tsx b/FabricExample/src/BokehExample.tsx
index b544f65cc..18d2385f0 100644
--- a/FabricExample/src/BokehExample.tsx
+++ b/FabricExample/src/BokehExample.tsx
@@ -51,19 +51,28 @@ function Circle() {
return () => clearInterval(id);
});
+ const size = 100 + power * 250;
+
const animatedStyle = useAnimatedStyle(() => {
- const size = 100 + power * 250;
return {
backgroundColor: `hsl(${hue.value}, 100%, 50%)`,
- width: size,
- height: size,
- top: top.value - size / 2,
- left: left.value - size / 2,
+ transform: [
+ { translateY: top.value - size / 2 },
+ { translateX: left.value - size / 2 },
+ ],
opacity: 0.1 + (1 - power) * 0.1,
};
}, []);
- return <Animated.View style={[styles.bokeh, animatedStyle]} />;
+ return (
+ <Animated.View
+ style={[
+ styles.bokeh,
+ { width: size, height: size, top: 0, left: 0 },
+ animatedStyle,
+ ]}
+ />
+ );
}
interface BokehProps {
@kmagiera Kindly find my replies below.
Can you confirm the crashes still happen after https://github.com/software-mansion/react-native-reanimated/pull/3852
Cannot confirm your patch is working. There is one thing regarding this. Looks like calling requestanimationFrame
doesn't call our new function and we need to call it like global.requestAnimationFrame
. Looks like monkey-patching requestAnimationFrame
doesn't fix all captured references to an original JSI-based requestAnimationFrame
. Have you checked this? I'll still try to make it work, though.
The below diff does a relatively simple optimization that changes that example to use transforms instead of layout props.
I've tried this here. Rendering performance is not an issue. Reanimated is. We should optimize better.
It is important for us to know what type of devices can run at what framerate.
I've got some information for OPPO A16. These are the measurements of how long updateProps executes on average (in ms) + FPS. As you can see, I've done 5 measurements in a row after a fresh restart for every build type.
NewArch + Fabric + Debug = 2.14 2.05 2.35 2.20 2.35 (9.0 FPS, 1..6)
NewArch + Fabric + Release = 0.49 0.35 0.43 0.43 0.44 (50+ FPS, 0..2)
Release is (2.14+2.05+2.35+2.20+2.35)/(0.49+0.35+0.43+0.43+0.44) = 5.19 times faster
OldArch + Debug = 2.72 2.75 2.28 2.33 2.34 (15.4 FPS, 1..5)
OldArch + Release = 0.48 0.43 0.40 0.42 0.41 (50 FPS, 0..2)
Release is (2.72+2.75+2.28+2.33+2.34)/(0.48+0.43+0.40+0.42+0.41) = 5.80 times faster
All these measurements were done with 25 elements updating translateX/Y
only.
Regarding the crash we've identified the actual root cause and the fix is here https://github.com/software-mansion/react-native-reanimated/pull/3859
As for monkey patching rAF I did check this but forgot to mention that it requires https://github.com/software-mansion/react-native-reanimated/pull/3838 to work that was merged recently and you perhaps don't have that patch included
Confirm it's no longer crashing. But batching requestAnimationFrame
didn't help to increase FPS.
@kmagiera Do you think batching JSI calls to updateProps could be efficient? Or maybe it's about data conversion?
I think we could batch all prop updates in JS and call JSI-based function _updateProps
once (analogously to the requestAnimationFrame
optimization). This would eliminate JSI overhead (between JS and C++).
Another idea for performance improvement on Android might be doing exactly the same but in the native code. When non-layout props like color
or opacity
are updated, we synchronously call mUIImplementation.synchronouslyUpdateViewOnUIThread
on Paper or fabricUIManager.synchronouslyUpdateViewOnUIThread
on Fabric, separately for each component. Note that each of this call originates from the C++ code so it uses JNI which might be the bottleneck.
Good news is that on Fabric we already apply all ShadowTree updates at once (done in #3369) in performOperations
which is called once per frame (thus layout is calculated also once per frame), and I think the same guarantee holds for Paper as well.
At this point I think it might consider adding SystraceSection
in our codebase so that we can use Android Profiler to see what's going on.
@tomekzaw Can we discuss this on a Zoom call, possibly?
I'm not sure about batching updateProps calls, this isn't as straightforward as with rAF calls, where we can save a number of JSI roundtrips. With updateProps we still need to pass the additional data so we'd have fewer calls but still more data to be unmarshalled using JSI. I wouldn't consider adding this unless we have a clear indication that it improves performance.
With updateProps we still need to pass the additional data so we'd have fewer calls but still more data to be unmarshalled using JSI.
Same concern here. Our gains might be insignificant, since marshaling will have to process roughly same amount of data. I'll try this anyways. Keep digging.
Hey @sergeymorkovkin, there's one more thing which @kmagiera has reminded me of. When @j-piasecki was profiling flash-list on Android, he noticed that the performance depends on the NDK version used. In particular, Hermes built from sources with NDK 24 is way slower than the one build using NDK 21. If you use RN 0.70 app, NDK 24 is used by default on M1 Macs, see here. I know this is not really related to poor performance of updateProps
but just wanted to let you know.
Edit: I added some systraces and here are the first results. I'm using a Samsung M23 (not really low-end), app in release build, using profileable instead of debuggable, 100% battery, not plugged in. Using BokehExample with 100 views with the transforms patch applied (i.e. no layout props). First, onAnimationFrame
takes ~24 ms which is unacceptable. Light green represents NodesManager#updateProps
, light blue below is mUIImplementation.synchronouslyUpdateViewOnUIThread
. We suspect that the empty room between each updateProps
is calling the useAnimatedStyle
worklet itself.
Here's my update on profiling Reanimated on Android. I added some more systraces in Java, C++ and JS code (as JSI binding). First, we noticed that one chunk of code related to color processing was left after shareable rewrite is no longer necessary, see #3872. Further ideas include optimizing JSI/JNI conversions and prop filtering (JNIHelper
, init
, forLoop
) and perhaps batching prop updates on JS side.
I added some more systraces in Java, C++ and JS code (as JSI binding)
@tomekzaw Can you please confirm you're purely using Java-based Systrace and exposing same to both C++ and JS?
Can you please confirm you're purely using Java-based Systrace and exposing same to both C++ and JS?
@sergeymorkovkin Here's how I added systraces:
import com.facebook.systrace.Systrace;
Systrace.beginSection(Systrace.TRACE_TAG_REACT_JAVA_BRIDGE, "performOperations");
// ...
Systrace.endSection(Systrace.TRACE_TAG_REACT_JAVA_BRIDGE);
First I've filled the implementation of DummySystraceSection
in node_modules/react-native/ReactCommon/react/renderer/debug/SystraceSection.h
:
#include <android/trace.h>
struct DummySystraceSection {
public:
template <typename... ConvertsToStringPiece>
explicit DummySystraceSection(
const char *name,
ConvertsToStringPiece &&...args) {
ATrace_beginSection(name);
}
~DummySystraceSection() {
ATrace_endSection();
}
};
using SystraceSection = DummySystraceSection;
Since ATrace_beginSection
and ATrace_endSection
were introduced in API 23, I also needed to bump minSdkVersion
to 23 (both in android/build.gradle
and Example/android/build.gradle
)
They are meant to be used in RAII-style:
{
SystraceSection s("JNIHelper::ConvertToPropsMap");
// ...
} // ~DummySystraceSection() is automatically called when the scope ends
In RuntimeDecorator::decorateUIRuntime
I've implemented two JSI bindings:
auto clb10 = [](
jsi::Runtime &rt,
const jsi::Value &thisValue,
const jsi::Value *args,
const size_t count) -> jsi::Value {
ATrace_beginSection(args[0].asString(rt).utf8(rt).c_str());
return jsi::Value::undefined();
};
jsi::Value fun10 = jsi::Function::createFromHostFunction(
rt, jsi::PropNameID::forAscii(rt, "_beginSection"), 1, clb10);
rt.global().setProperty(rt, "_beginSection", fun10);
auto clb11 = [](
jsi::Runtime &rt,
const jsi::Value &thisValue,
const jsi::Value *args,
const size_t count) -> jsi::Value {
ATrace_endSection();
return jsi::Value::undefined();
};
jsi::Value fun11 = jsi::Function::createFromHostFunction(
rt, jsi::PropNameID::forAscii(rt, "_endSection"), 0, clb11);
rt.global().setProperty(rt, "_endSection", fun11);
I've also needed to whitelist these two names in plugin.js
so these functions work properly in worklets:
const globals = new Set([
// ...
'_beginSection',
'_endSection',
]);
Because some of the worklets are also run on the main JS context, here's how to use them:
if (_WORKLET) _beginSection('UpdateProps processColor');
// ...
if (_WORKLET) _endSection();
I did some initial benchmarks and I think I have one idea how to optimize execution time of single _updateProps
call for non-layout-only props as well as a mix of layout and non-layout props without JS props.
I've pushed my changes to @tomekzaw/android-systrace
. It looks like 2b54074f1e5d8673f6ded19e9745981dfe4ca180 optimizes the execution time of a single styleUpdater
from ~250 μs down to ~150 μs. cc @sergeymorkovkin
I'm back with the numbers. TL;DR updateProps
is 1.5-2x faster on 2b54074f1e5d8673f6ded19e9745981dfe4ca180 (tested on Samsung M23 as described above, 200 circles in BokehExample)
Version | Test case | styleUpdater avg |
animationFrame avg |
styleUpdater speedup |
animationFrame speedup |
---|---|---|---|---|---|
before | only UI props | 270 μs | 61,97 ms | - | - |
after | only UI props | 147 μs | 34,68 ms | 1.84x | 1.79x |
before | UI & native props | 296 μs | 67,97 ms | - | |
after | UI & native props | 181 μs | 46,31 ms | 1.64x | 1.47x |
Note: I'm still using the old terminology for consistency with the code, "UI props" means non-layout props while "native props" are props that require layout recalculation when changed.
@tomekzaw I'll try to replicate these results on my device. Will also look into weird mappers behavior with react-navigation.
@sergeymorkovkin The number of requestAnimationFrame
calls is 3x the number of circles in BokehExample because there are three independent timing animations running. This is the expected behaviour.
const update = () => {
left.value = withTiming(left.value + getRandomPositionDiff(), config);
top.value = withTiming(top.value + getRandomPositionDiff(), config);
hue.value = withTiming(hue.value + getRandomHueDiff(), config);
};
When it comes to the performance of updateProps, one problem with using translateX
/translateY
over left
/top
that we found with @kmagiera is the way how React Native handles transforms on Android. First, React Native converts the list of transforms (in this case only transformX
and transformY
) into a transform matrix, which is then decomposed into single values of x-axis translation, y-axis translation, rotation etc. (see here). We are thinking of exposing additional props like translateX
which directly call view.setTranslationX
and avoid expensive calculations.
Finally, I couldn't find a place in our code where we explicitly invoke layout recalculation after we update layout props in performOperations
. It turns out that this is done in mUIImplementation.dispatchViewUpdates
(see here). We are calling this function conditionally because if React Native is currently processing a batch (i.e. queue isn't empty) then it will call dispatchViewUpdates
on its own and Reanimated shouldn't do it. However, if there is no batch being processed, we need to call dispatchViewUpdates
with an artificial batch id of -1.
@tomekzaw
You're totally right, I confirm your observation. The thing is in my other example I'm not using style-nested animations, and that's why I've missed it completely.
As for the transformation matrices, I suppose, those should normally be sent to the GPU, right? Why RN does this double calculation, is not very clear to me. It's clearly calling MatrixMathHelper.decomposeMatrix(). Looks like the bottleneck we were looking for.
As for calling dispatchViewUpdates(), I suppose, it's good, right?
I suppose, those should normally be sent to the GPU, right?
Well, currently React Native runs these matrix calculations on the CPU because it needs to call setTranslationX
, setScaleX
, setRotationX
etc. individually. I think that Android uses GPU at some later stage when drawing these views onto the screen. (I also found some Android API for composing transforms here but it's under android.view.animation
so not sure if RN could use it instead of setTranslationX
etc.)
Looks like the bottleneck we were looking for.
Yeah, synchronouslyUpdateViewOnUIThread
itself takes almost half of the execution time of the whole _updateProps
so this is totally worth optimizing. I think we can implement another variant of _updateUiPropsPaper
specifically for View
component that supports simple props like translateX
, scaleX
, rotateX
individually and directly, i.e. without transform composition. One thing to remember is that other third-party libraries (such as react-native-svg) might whitelist its own non-layout props (like fill
) so we still need to support those.
As for calling dispatchViewUpdates(), I suppose, it's good, right?
Yes, it's correct. The only problem with this chunk of code is that sometimes, in a specific scenario, it can deadlock, see #3879, but this is another story.
My another observation is that working with WritableMap's inside updateProps is also expensive. Possibly, this is why your approach of splitting it into three separate methods works. Let me try it...
Yeah, basically ReadableMap
and WritableMap
are implemented in C++ so each read or write is a separate JNI call. Also, it's expensive to allocate them, so that's exactly what I'm trying to avoid.
Edit: I think that the average execution time of _updateProps
also depends on the number of animated circles, so please keep this constant throughout the experiments. This might be also related to the slow warm-up issue you've mentioned on a call.
@tomekzaw Confirm that it works notably better with splitting updateProps on the JS side. It's animating 100 circles at roughly 45-50 FPS. We're close to resolving it. My further steps are:
Also, there is a reason they calculate transformation matrix. The thing is it's allowed to perform same transformation many times, e.g.:
transform: [ {translateX: 10}, {translateX: -5}, {translateX: 15}, ]
Confirm that it works notably better with splitting updateProps on the JS side. It's animating 100 circles at roughly 45-50 FPS. We're close to resolving it.
That's great! I will submit a PR with 2b54074f1e5d8673f6ded19e9745981dfe4ca180 then. These changes also need to be adapted for iOS but the speedup probably won't be this significant.
My further steps are:
- Bypass transform matrices where possible (I plan implementing this in a separate method)
I was actually thinking of optimizing transforms on my own but honestly I don't have a clear idea yet on how to handle this (should we introduce another group of props, like let's say "direct props"?) so feel free to share your ideas. For the reference, here's a complete list of all non-layout props. I noticed that the list already contains translateX
, translateY
, scaleX
and scaleY
so it might be worth checking to use these like regular props instead of nesting them inside translate
prop.
- Debug/improve redundant reactivity in mappers when used with react-navigation
- Implement batching updateProps on a per-frame basis
- Combine everything via a custom useAnimatedStyle() supporting multiple elements
These are all valid ideas for further research. In particular, it would be great if you could debug mappers when used with react-navigation or even come up with some test cases that we can use for unit tests for mappers.ts
.
The thing is it's allowed to perform same transformation many times
Yes also the order of transforms matters (i.e. translate
and then rotate
gives another transformation matrix than if rotation is first).
@tomekzaw You can surely submit a PR, but did you think about batching calls to updateProps? Now we need to batch 3 of them separately. I'd possibly wait. What do you think?
I was actually thinking of optimizing transforms on my own but honestly I don't have a clear idea yet on how to handle this
What we could do is allow passing an object for a transform prop:
transform: { translateX: 123, scaleY: 1.2, }
Of course, in such a case developer code will need to work around specifying multiple same-type transforms. Given this feature is more a convenience mechanism, I think this might be a fair compromise (to avoid matrix calculation). Still, the order of transforms is a concern...
@sergeymorkovkin After some time spent on profiling, at this point I no longer think that batching _updateProps
will speed things up significantly. Obviously, we could save some time on JSI calls but there would be more JNI calls. We know that 2b54074f1e5d8673f6ded19e9745981dfe4ca180 gives us ~1.6x speedup and we can save another ~40% on optimizing transforms. I think we can try out batching prop updates at some later stage.
Still, the order of transforms is a concern...
Yeah, first I would check if using transformX
and scaleX
props in top-level object works at least on Android.
@tomekzaw I've tested with direct props and we only get 10% of an improvement, not 40%, unfortunately. Looks, like all the CPU losses we experience are coming from JSI/JNI. And yes, I'm using NDK 21.4.7075529, not 24.
Using transform prop:
Using translateX/Y props:
At least, now we know how profitable avoiding matrix calculation is.
@tomekzaw Looks like our memory is leaking like crazy. Will look into this tomorrow. It's too early for a PR.
https://user-images.githubusercontent.com/1918742/208756897-efc24cb7-bdd6-4c9d-bc68-8bdf80c409c5.mov
Also, after observing an app for ~10 minutes, we start seeing stutters and frame drops again.
Looks like our memory is leaking like crazy.
Might be, but eventually memory usage substantially drops down. For the record, there were some memory leaks in the old implementation of shared values but after shareable rewrite they all should be gone. It all depends on when GC decides to kick in. Note that you can trigger GC maunally by calling gc();
function on both runtimes (although we don't recommend doing so on each frame). Also, by default Hermes only cleans up the last generation of objects. So theoretically the app can consume as much memory as possible until the OS starts to complain.
Also, after observing an app for ~10 minutes, we start seeing stutters and frame drops again.
There might be some problem with long-running animations (which I also observed on BokehExample left running while I was making tea) but @piaskowyk is doing his best fixing it.
@tomekzaw There is no clarity on what exactly is leaking (JS? Java?). GC is running every 10 seconds, but not clearing much. Likely, this is happening because of...
... by default Hermes only cleans up the last generation of objects.
App starts at around 72 Mb RAM and gradually reaches 512 Mb and even more. Yes, there is a significant GC drop in the end of this video I attached. The problem is we do not control this and this is not guaranteed.
As for stutters and frame drops after certain period, this might or might not be related to GC.
There might be some problem with long-running animations (which I also observed on BokehExample left running while I was making tea) but @piaskowyk is doing his best fixing it.
Can you please provide a reference? Is there an issue?
Can you please provide a reference? Is there an issue?
I don't have any code pointers or commits for you but @piaskowyk says that it might be related to the GC.
As for stutters and frame drops after certain period, this might or might not be related to GC.
Exactly. Can you please check if calling gc();
on both runtimes every 10 or 15 seconds resolves the issue with lagging long-running animations? We suspect these two issues might be related. We are also considering adding a system event listener that triggers GC on each memory warning (React Native already does it).
@piaskowyk @tomekzaw I think it'll be better to split this into a separate issue, so I created it here. I don't think this is related to GC, since triggering GC manually doesn't remove stutters. Also, if I trigger GC manually, it drops RAM consumption down to 300 Mb only, while application has started at around 70 Mb. Looks like we're not cleaning some lookup tables...
Turns out that translateX
and translateY
props (as well as scaleX
and scaleY
) are still supported on Android (although they don't work on iOS), here's the example code:
It looks like MatrixMathHelper.decomposeMatrix
is no longer called:
So as a workaround for your case, you can have two worklets (one returning an object with translateX
and translateY
props, for Android, and the other one with transform
property, for iOS) and pass one of them to useAnimatedStyle
depending on Platform.OS
.
A more general solution would be to add support for translateX
and translateY
props on iOS. Luckily, this is pretty straight-forward and shouldn't degrade the performance too much.
@tomekzaw I've made this test a couple of days ago. Kindly find my message below:
@tomekzaw I've tested with direct props and we only get 10% of an improvement, not 40%, unfortunately. Looks, like all the CPU losses we experience are coming from JSI/JNI. And yes, I'm using NDK 21.4.7075529, not 24.
Using transform prop:
Using translateX/Y props:
At least, now we know how profitable avoiding matrix calculation is.
@tomekzaw I'm wondering: do we really need to use ReadableMap/WritableMap and depend on JSI? Why not simply use JavaOnlyMap, for instance?
As for this solution you're offering, unfortunately, we cannot replace this in all places. E.g. we also use element rotation and anchor point shifting. So, there are multiple trabslateXY used inside a transform prop.
@sergeymorkovkin Here's the commit I talked about yesterday: a6a13678ec42cdc9af5b0cced8fc6970b9e94e35. It adds batching UI prop updates on the JS side and consequently optimizes the number of JNI calls thanks to using only a single ReadableMap for all component updates. Please check if it helps in your case (note that the patch is still a PoC so probably it needs some polishing before submitting a PR).
Thanks to batching, the average execution time of animationFrame
went down from 20,74 ms to 14,22 ms (both measurements taken on OPPO A16, release mode + profileable, 50 bokeh circles, using translateX
/translateY
instead of transform
, without JS systraces in useAnimatedStyle
etc.).
Before:
After:
When it comes to MapBuffer, I still don't think this is a viable alternative. Obviously, it saves some CPU time and memory as it uses jni::JByteBuffer::wrapBytes
to create a direct byte buffer without making a copy of the underlying data (see here). However, the JNI documentation states that:
It is therefore recommended that direct buffers be allocated primarily for large, long-lived buffers that are subject to the underlying system's native I/O operations.
In our case, the buffer is neither large nor long-lived (perhaps a better alternative would be to allocate a single DirectByteBuffer, re-use it between animation frames and use BSON instead). There are also some major limitations. First, MapBuffer only supports integer keys. React Native keeps a hard-coded list of constants that represent each property, see here. In our use-case, we need more flexibility. Finally, when we update props, we call methods from React Native that accepts ReadableMap
/WritableMap
, so even if we used MapBuffer, we would still need to convert it again to e.g. JavaOnlyMap
which also takes some time.
With the batching optimization, the conversion itself takes only about 15% of the execution time which is approximately two times faster than what we've started with. I've discussed it with @lukmccall (who is an Android & JNI expert) and honestly at this point we don't think it's worth trying to optimize it further, especially taking into account that the other 50% of the execution time originates from the JS part of Reanimated (useAnimatedStyle updater, valueSetter step) and most likely this is what we should focus on in the first place. We also observed an increased activity of GC due to a high number of allocations and deallocations which also needs to be taken care of.
My next step is to profile the JS part of Reanimated using JSC or Hermes profiler. I really hope to find something interesting there that can be optimized.
Here are the results of profiling Reanimated runtime using Hermes profiler via Chrome DevTools. Here are the results are from OPPO A16 running Example app in debug mode:
Thanks to this, I was able to identify the following bottlenecks:
_updatePropsPaper
– as mentioned above, at this point I don't think this can be easily optimized further, so let's skip it as for now and focus on the JS-only codeisAnimated
– this simple utility function takes almost 7% of the execution time, most likely because it involves array operations, this definitely needs to be investigatedGC Young Gen
– it looks like we perform a lot of allocations and deallocations in a single frame, we need to find some way to re-use existing instances of shareablesarrayPrototypeIndexOf
– in some places we use Array.indexOf
where key in obj
would also suffice, optimized in 6ce102d8bf839cdfdc51bd925fa99e53707f6275 which leads to ~5% speedup in debug mode (98 ms vs. 93 ms on average), however it shows no real improvement in release modestyleDiff
– still not sure what's the purpose of calculating difference_makeShareableClone
and _updateDataSynchronously
– these two calls can be avoided by using one-way-reads-only shared values. However, I was strongly discouraged from using one-way-reads-only shared value at this stage since they have not been thoroughly tested yet as well as they are a subject to change in near future due to different ideas for optimizations (that's why the flag is not a part of the public API yet)@tomekzaw styleDiff is required to avoid calling _updateProps when there are no actual changes.
Hey @sergeymorkovkin, just out of curiosity I've implemented another optimization, see a13f33dd6e3dea8db11aef856db5a306d47d32d2. This time, along with batching prop updates and calling _updateProps
only once, I'm also flattening the whole map of updated props. Previously, there was a separate ReadableMap of updated props for each component, for instance:
{
"3": {
"translateX": 100,
"translateY": 200,
},
"5": {
"translateX": 300,
"translateY": 400,
},
}
The optimization is to have just one map where the keys consist of the view tag and the prop name (like 3_translateX
).
{
"3_translateX": 100,
"3_translateY": 200,
"5_translateX": 300,
"5_translateY": 400,
}
This significantly optimizes the number of JNI calls since previously there were 2 calls (importKeys
and importValues
called within getLocalMap
in ReadableNativeMap.java
) per each nested ReadableMap (this makes 2n+2) and after the change there's just 2 (regardless of the number of animated components).
Note that currently the PoC implementation only supports values of double type (adding support for integers or strings should be easy) and also doesn't optimize for nested fields (like transform
) since we flatten only one level of the map (although I'm thinking on how to generalize this idea).
The results are quite promising, see the attached table. Also, the more animated components, the better speedup. | Implementation | Number of circles | onAnimationFrame avg time |
---|---|---|---|
Batching | 50 | 13,7 ms | |
Batching + flattening | 50 | 12,8 ms (~6.5% faster) | |
Batching | 100 | 26,2 ms | |
Batching + flattening | 100 | 24,3 ms (~7.3% faster) |
Before:
After:
Any update for this ?
@bolan9999 Yep, first we need to finish #4503 which adds JSI-based batching and then we'll be able to go deeper with JNI-level batching to make Android faster.
@bolan9999 Yep, first we need to finish #4503 which adds JSI-based batching and then we'll be able to go deeper with JNI-level batching to make Android faster.
Do you have a Benchmarks for this update totally? In my scene,I'm trying to rewrite a ScrollView and Recyclable list with Reanimated and GestureHandler. But when everything is done, the Benchmarks of it is a little bad. What's worse more is frame drops randomly on Android when scrolling 3 min or longer. But it works well when I stoped scrolling and wait for a few seconds.
@bolan9999 Do you memoize gestures?
@tomekzaw What do you mean memoize gestures
?
it is experimental. And I will open source and take some Benchmarks latter. But muti-items animations with Reanimated causes frame drops is real. The only thing I can do is reducing the count of items and worklet calls.
@bolan9999 If you use Gesture Handler v2 API (GestureDetector
), you can memoize gestures by wrapping the Gesture
objects inside useMemo
hooks, e.g.:
const tap = useMemo(() => {
return Gesture.Tap.onStart(/* ... */);
}, []);
<GestureDetector gesture={tap}>
@tomekzaw It is not re-rendered in my scene at all. And I don't use GestureDetector
cause of runOnJS
will crash in worklets. I use useAnimatedGestureHandler
instead.
I'm pretty sure worklet calls or transform
performance cause frame drops .
Getting the same issue animating transform
using withRepeat
, frame drops to as low as 10 fps
on UI and JS side on Android (Pixel 4a), but runs at buttery smooth 60 fps
on iOS.
const itemSize = 560
const offset = itemSize / 2
const confettiTranslateY = useSharedValue(0);
const imageTranslateY = useSharedValue(0);
...
useEffect(() => {
confettiTranslateY.value = withRepeat(
withSpring(-50, {
damping: 8,
}),
-1,
true,
);
imageTranslateY.value = withRepeat(
withTiming(-10 * itemSize + offset, {
duration: 10_000,
easing: Easing.inOut(Easing.exp),
}),
-1,
true,
);
}, [])
...
const confettiStyle = useAnimatedStyle(
() => ({ transform: [{ translateY: confettiTranslateY.value }] }),
[confettiTranslateY],
);
const imageStyle = useAnimatedStyle(
() => ({ transform: [{ translateY: imageTranslateY.value }] }),
[imageTranslateY],
);
...
I'm experiencing the same issue. Getting 60 fps on all iOS devices (see this post), but it's constant 16 fps on both JS and UI thread on my low end Android device (Samsung A11) even when there are no re-renders and nothing happening on the JS thread.
Description
On the lower-end Android devices (worth $100-150) Reanimated 2/3 can only handle maximum 30 simultaneous elements per frame. Animating more than that results in serious frame drops. In my tests I was using OPPO A16.
It's been tested with or without new architecture and Fabric. It's been tested with or without @3.0.0-rc8 shared variables performance update. It's been tested with or without requestAnimationFrame optimization.
I'm still researching this issue and it boils down to a few problems:
Inside JSI runtime the requestAnimationFrame function calls are done synchronously via JSI, while every useAnimatedStyle instance does it's own call to rAF. It's been addressed by @kmagiera here.
Every updateProps call is too heavy and takes ~2ms in debug build and ~0.3-0.5ms in release build (fabric/paper respectively). The larger generated style object - the longer it takes, since JSI does argument conversion on a CPU.
THIS IS AN ASSUMPTION. I've noticed react-navigation mappers somehow get involved into a user-defined animation. Need more time to dig into this.
VIDEO CRASHES.md
Steps to reproduce
Actual result: FPS is somewhere between 30-40 FPS for 100 elements and app crashes quickly.
Expected result: should be able to animate 100-150 elements at 60 FPS with no crashes.
Snack or a link to a repository
https://github.com/sergeymorkovkin/test-reanimated
Reanimated version
3.0.0-rc8
React Native version
0.70.6
Platforms
Android
JavaScript runtime
Hermes
Workflow
React Native (without Expo)
Architecture
None
Build type
Release mode
Device
Real device
Device model
OPPO A16
Acknowledgements
Yes