facebookresearch / playtorch

PlayTorch is a framework for rapidly creating mobile AI experiences.
https://playtorch.dev/
MIT License
829 stars 103 forks

App crashes with memory leak when running for a long time #158

Open SomaKishimoto opened 2 years ago

SomaKishimoto commented 2 years ago

Version

0.2.2

Problem Area

react-native-pytorch-core (core package)

Steps to Reproduce

Following the steps below, the memory heap grows at a pace of about 100 MB per hour. We are developing an application that needs to run for long periods of time. Could you please tell me how to avoid this issue?

$ npx react-native init AwesomeTSProject --template react-native-template-typescript
$ cd AwesomeTSProject
$ yarn add react-native-pytorch-core
$ npx pod-install

import * as React from 'react';
import {useCallback} from 'react';
import {LayoutRectangle, StyleSheet} from 'react-native';
import {
  Camera,
  CameraFacing,
  Canvas,
  CanvasRenderingContext2D,
  Image,
  ImageUtil,
} from 'react-native-pytorch-core';

const App = () => {
  const contextRef = React.useRef<CanvasRenderingContext2D>();
  const [layout, setLayout] = React.useState<LayoutRectangle>();

  const handleCapture = useCallback(
    async (image: Image) => {
      const context = contextRef.current;
      if (context != null && layout != null) {
        context.clear();
        const imageWidth = image.getWidth();
        const imageHeight = image.getHeight();
        const scale = Math.min(
          layout.width / imageWidth,
          layout.height / imageHeight,
        );
        context.drawImage(image, 0, 0, imageWidth * scale, imageHeight * scale);
        const wholeImageData = await context.getImageData(
          0,
          0,
          imageWidth * scale,
          imageHeight * scale,
        );
        const wholeImage = await ImageUtil.fromImageData(wholeImageData);
        await wholeImageData.release();
        await wholeImage.release();
        await context.invalidate();
      }
      image.release();
    },
    [contextRef, layout],
  );

  return (
    <>
      <Camera
        onFrame={handleCapture}
        hideCaptureButton={true}
        style={styles.camera}
        facing={CameraFacing.BACK}
      />
      <Canvas
        style={styles.canvas}
        onContext2D={context => {
          contextRef.current = context;
        }}
        onLayout={event => {
          setLayout(event.nativeEvent.layout);
        }}
      />
    </>
  );
};

const styles = StyleSheet.create({
  camera: {
    display: 'none',
  },
  canvas: {
    flex: 1,
  },
});
export default App;

However, the code below is necessary for our application and cannot be removed.

        const wholeImage = await ImageUtil.fromImageData(wholeImageData);
        await wholeImageData.release();
        await wholeImage.release();
        await context.invalidate();

$ vim ./ios/AwesomeTSProject/Info.plist
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) needs access to your Camera.</string>
<key>NSLocationWhenInUseUsageDescription</key>
<string></string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) needs access to your Microphone.</string>
$ npx react-native start
$ open /Applications/Xcode.app ./ios/AwesomeTSProject.xcworkspace

Expected Results

No response

Code example, screenshot, or link to repository

Related to this issue

raedle commented 2 years ago

@SomaKishimoto, I can't repro the issue with the provided code and an iPhone Xs (see screenshots at the start of the memory measurement and after 1 hour continuously running the code).

Start

Screen Shot 2022-11-26 at 8 13 56 PM

End

Screen Shot 2022-11-26 at 9 14 18 PM

SomaKishimoto commented 2 years ago

@raedle, thanks for trying to repro the issue. My answers are below.

What iOS device was used in the experiment?

  • device : iPhone SE 2nd
  • OS : 16.11

What JavaScript runtime is enabled in the app (e.g., Hermes or JSC)?

setting

The memory heap seems to increase slightly once every 10 minutes. Do you use Expo? I created an empty project with bare React Native. If possible, could you try to repro this issue with bare React Native?

Start

image_1

30min

image_2

55min

image_3 (Stopped with an error unrelated to this issue [error.txt] )

nh9k commented 1 year ago

Hi @raedle, does failing to call image.release(), or to release tensors (not currently implemented), cause a memory leak?

raedle commented 1 year ago

@nh9k, each image has to be released from memory manually. For context, the current implementation holds a reference in the JSContext:

Not releasing the image will keep the reference around.
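One way to make sure the release always happens is a try/finally wrapper. The sketch below is only an illustration: `Releasable` and `withRelease` are hypothetical names, not part of react-native-pytorch-core, and it assumes `release()` returns a promise, as the `await image.release()` calls earlier in this thread suggest.

```typescript
// Hypothetical helper, NOT part of react-native-pytorch-core.
// Assumption: the resource exposes an async release(), like the Image
// objects used earlier in this thread.
interface Releasable {
  release(): Promise<void>;
}

// Runs `use` with the resource and releases it afterwards,
// whether `use` returns normally or throws.
async function withRelease<T extends Releasable, R>(
  resource: T,
  use: (resource: T) => Promise<R>,
): Promise<R> {
  try {
    return await use(resource);
  } finally {
    // The finally block runs on success and on error, so the
    // reference held for this resource is dropped either way.
    await resource.release();
  }
}
```

In a frame handler this would wrap the body, for example `await withRelease(image, async img => { /* draw img */ })`, so an exception while drawing cannot skip the release.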

Tensors are implemented differently. They use jsi::HostObject from the JSI API. This will eventually clean up references when the JavaScript Runtime GC finds unused references (see TensorHostObject.h).

Have you profiled your app to check what leaks memory?

nh9k commented 1 year ago

@raedle, thank you for your kindness! I haven't profiled my app yet. My app also crashes, so I need a memory check. I'll report back!

nh9k commented 1 year ago

Hi @raedle, I found out what leaks memory!

This code crashes the app on Android after fewer than about 100 captures. With a large image size, memory leaks rapidly.

For example, the image size is 3000x4000. This is the size of an image captured with the react-native-camera package.

A for-loop was used instead of pressing the capture button 100 times. The code below crashes the app on the 37th iteration of the loop when tested on a Galaxy A33. A Galaxy S9+ crashes on the 80th iteration.

for (let i = 0; i < 100; i++) {
  const width = image.getWidth();
  const height = image.getHeight();
  const blob = media.toBlob(image);
  let tensor = torch.fromBlob(blob, [height, width, 3]);
  tensor = tensor.permute([2, 0, 1]);
  const newTensor = tensor.div(255);
}
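For a sense of scale, here is a rough estimate of how much memory one iteration of this loop could allocate for a 3000x4000 image. It is a sketch under assumptions that are not confirmed in this thread: the blob holds one byte per channel value, permute returns a view that shares storage, and div(255) produces a new float32 tensor. `tensorBytes` is a hypothetical helper.

```typescript
// Bytes per element for the two dtypes ASSUMED in this estimate.
const BYTES_UINT8 = 1;
const BYTES_FLOAT32 = 4;

// Hypothetical helper: number of elements in the shape times element size.
function tensorBytes(shape: number[], bytesPerElement: number): number {
  return shape.reduce((count, dim) => count * dim, 1) * bytesPerElement;
}

const height = 4000;
const width = 3000;

// Blob / uint8 tensor in HWC layout (assumed to share one buffer).
const blobBytes = tensorBytes([height, width, 3], BYTES_UINT8);

// Result of div(255), assumed to be a fresh float32 tensor in CHW layout.
const floatBytes = tensorBytes([3, height, width], BYTES_FLOAT32);

// Total per loop iteration, in megabytes (10^6 bytes).
const perIterationMB = (blobBytes + floatBytes) / 1e6;

console.log({blobBytes, floatBytes, perIterationMB});
```

Under these assumptions each iteration allocates about 180 MB, so a few dozen iterations add up to several gigabytes if nothing is reclaimed in between.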

Here is a Snack for testing. :cry:

raedle commented 1 year ago

Thanks for investigating the issue, @nh9k.

My hunch is that it is not a memory leak per se. Rather, the Hermes/JSC garbage collector does not run frequently enough to free the memory of variables that have gone out of scope (i.e., the tensor variables inside the for-loop aren't GC'ed fast enough and the system eventually runs out of memory).

This seems like a valid case for introducing manual memory management options to PlayTorch (although this will break with idiomatic JavaScript).
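To illustrate what such an option might look like, here is a minimal sketch of a scope that frees everything registered with it. It is purely hypothetical: `ManuallyFreed`, `free()`, `Scope`, and `withScope` are made-up names, and PlayTorch tensors as discussed in this thread have no such method. The idea is similar in spirit to tf.tidy in TensorFlow.js.

```typescript
// HYPOTHETICAL interface: PlayTorch tensors in this thread do not expose
// a manual free()/release() method. This only sketches the pattern.
interface ManuallyFreed {
  free(): void;
}

// Collects objects created inside a block so they can be freed together.
class Scope {
  private tracked: ManuallyFreed[] = [];

  // Registers a value with the scope and returns it unchanged.
  track<T extends ManuallyFreed>(value: T): T {
    this.tracked.push(value);
    return value;
  }

  // Frees tracked values in reverse order of registration.
  close(): void {
    while (this.tracked.length > 0) {
      this.tracked.pop()!.free();
    }
  }
}

// Runs `body` and closes the scope afterwards, even if `body` throws.
function withScope<R>(body: (scope: Scope) => R): R {
  const scope = new Scope();
  try {
    return body(scope);
  } finally {
    scope.close();
  }
}
```

With something like this, each loop iteration could wrap its tensor operations in `withScope`, so intermediates are freed at the end of the iteration instead of waiting for the garbage collector.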

cc @chrisklaiber @justinhaaheim

nh9k commented 1 year ago

Thanks @raedle. My app also crashes at around the 27th capture on a Galaxy A33, which is similar to the for-loop test.