tauri-apps / tauri

Build smaller, faster, and more secure desktop and mobile applications with a web frontend.
https://tauri.app
Apache License 2.0

[bug] Memory leaks when reading files #9190

Open · MagicMajky opened this issue 7 months ago

MagicMajky commented 7 months ago

Describe the bug

When reading larger files (140 MB+), the code never gets past the read call. Tauri tries to eat all of my memory and then freezes/crashes. I am on Windows and my system has 16 GB of RAM; when trying to read a 140 MB file, Tauri consumes nearly all of it, going up to 11.5 GB of RAM usage (I use Task Manager for measuring). Then it probably goes out of memory and crashes.

Even for smaller files of 50-80 MB it uses around 2-5 GB of RAM on my machine.

Tested on v1 (readBinaryFile) and also on beta v2 (readFile).

Reproduction

I use readBinaryFile() to read the contents of a file. To reproduce, create a new Tauri app with sh <(curl https://create.tauri.app/sh) and use this minimal code to try to read a file.

import { readBinaryFile } from '@tauri-apps/api/fs';
import { open } from '@tauri-apps/api/dialog';

async function attemptToRead() {
  const selected = await open({ multiple: false });
  if (typeof selected !== "string") {
    console.log(typeof selected + " is not a string");
    return;
  }

  console.log("Start reading");
  const content = await readBinaryFile(selected);
  console.log("Read successful"); // never logged
}

I also tried it in the v2 beta version (sh <(curl https://create.tauri.app/sh) --beta) with essentially the same code and got the same result: exceptionally high memory usage followed by a crash.

import { readFile } from '@tauri-apps/plugin-fs';
import { open } from '@tauri-apps/plugin-dialog';

async function attemptToRead() {
  const selected = await open({
    multiple: false,
    directory: false,
  });
  if (!selected?.path) {
    console.log("bad select");
    return;
  }
  console.log("Started reading");
  const result = await readFile(selected.path);
  console.log("Finished reading");
}

Expected behavior

Not using 11+ GB of RAM to read a 140 MB file.

Full tauri info output

V1:
[✔] Environment
    - OS: Windows 10.0.22631 X64
    ✔ WebView2: 122.0.2365.80
    ✔ MSVC: Visual Studio Build Tools 2022
    ✔ rustc: 1.76.0 (07dca489a 2024-02-04)
    ✔ cargo: 1.76.0 (c84b36747 2024-01-18)
    ✔ rustup: 1.26.0 (5af9b9484 2023-04-05)
    ✔ Rust toolchain: stable-x86_64-pc-windows-msvc (default)
    - node: 20.11.0
    - yarn: 1.22.22
    - npm: 10.2.4

[-] Packages
    - tauri [RUST]: 1.6.1
    - tauri-build [RUST]: 1.5.1
    - wry [RUST]: 0.24.7
    - tao [RUST]: 0.16.7
    - @tauri-apps/api [NPM]: 1.5.3
    - @tauri-apps/cli [NPM]: 1.5.11

[-] App
    - build-type: bundle
    - CSP: unset
    - distDir: ../dist
    - devPath: http://localhost:1420/
    - framework: Svelte
    - bundler: Vite

V2:
[✔] Environment
    - OS: Windows 10.0.22631 X64
    ✔ WebView2: 122.0.2365.80
    ✔ MSVC: Visual Studio Build Tools 2022
    ✔ rustc: 1.76.0 (07dca489a 2024-02-04)
    ✔ cargo: 1.76.0 (c84b36747 2024-01-18)
    ✔ rustup: 1.26.0 (5af9b9484 2023-04-05)
    ✔ Rust toolchain: stable-x86_64-pc-windows-msvc (default)
    - node: 20.11.0
    - yarn: 1.22.22
    - npm: 10.2.4

[-] Packages
    - tauri [RUST]: 2.0.0-beta.11
    - tauri-build [RUST]: 2.0.0-beta.9
    - wry [RUST]: 0.37.0
    - tao [RUST]: 0.26.1
    - @tauri-apps/api [NPM]: 2.0.0-beta.5
    - @tauri-apps/cli [NPM]: 2.0.0-beta.9

[-] App
    - build-type: bundle
    - CSP: unset
    - frontendDist: ../dist
    - devUrl: http://localhost:1420/
    - bundler: Vite

Stack trace

No response

Additional context

I only tested this with PDF files, but I don't think the issue is specific to them. Here are the PDFs I used for testing; they are just the PDF specification merged with itself a bunch of times:

141 MB: https://mega.nz/file/WmwCTZqL#Jt30rUsjd1F5QG5prnKN5wsRm1J4MesurpmQ7XnAi5g
215 MB: https://mega.nz/file/ivQlWDrR#EdscaHf0YiBkPaR5oqZJPNuLb84D4VppEmUAsL9W76g
350 MB: https://mega.nz/file/3nwmUJqD#hPqPRuqTlvnahbINfWZsV2BibiWCuMkulGLU4_Pn8eY

Am I doing something wrong when reading the files? Is there a workaround?

MagicMajky commented 7 months ago

The leak/bloat seems to happen when sending the response from the Rust side to the frontend. The Rust code reads the file fine, but when returning the contents to the frontend, memory usage goes crazy.

use std::fs;

#[tauri::command]
fn custom_read(path: &str) -> Result<Vec<u8>, String> {
    match fs::read(path) {
        Ok(data) => {
            println!("File read SUCCESS");
            Ok(data) // before the return memory is fine, after return it gets really bad
        }
        Err(_) => {
            println!("File read FAILED");
            Err("Failed reading the file".into())
        }
    }
}

I would love to help more to solve this bug, and I even tried to debug my way through what's happening, but since my knowledge of Rust and of how Tauri works internally is very limited, my efforts have been unsuccessful so far.

Is someone looking into this, or could somebody at least point me in the direction of the issue (like where the serialization of command responses happens), please?

This bug affects my application, where I work with PDFs, and sometimes my users need to work with larger PDFs too. Right now I have simply hard-limited the file size to 50 MB, but this solution is really not ideal. Even with this hard limit it might still crash for some users with less available RAM.

Because this bug significantly limits my production application, I am looking for any solution that will stop the memory leak/bloat ASAP.

FabianLars commented 7 months ago

I will discuss this here instead of #4026 to not spam notifications for all those users until we have something presentable.

Just to confirm that a) we're sending Vec<u8> as InvokeBody::Json and not InvokeBody::Raw (therefore not actually using the new IPC as intended) and b) converting Vec<u8> to serde_json::Value is a really, really bad idea, I wrote the most disgusting code I've ever written: https://github.com/tauri-apps/tauri/compare/dev...triage/ram-usage

This branch reduced the memory usage for the 141 MB pdf file from ~5 GB (plus a webview OOM warning) to 300-700 MB (Windows Task Manager trying its best, I guess).
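That ~5 GB starting point is roughly what you'd expect if every byte of the file becomes its own serde_json::Value. A standalone back-of-the-envelope sketch (my own illustration assuming serde_json's default features, not Tauri code):

use serde_json::Value;
use std::mem::size_of;

fn main() {
    // On 64-bit targets a serde_json::Value is 32 bytes.
    println!("size_of::<Value>() = {} bytes", size_of::<Value>());

    // Converting a Vec<u8> to a Value turns every byte into its own
    // Value::Number, multiplying the payload by ~32x before the JSON
    // string is even produced: 141 MB of bytes -> ~4.5 GB of Values.
    let bytes = vec![0u8; 1_000_000]; // 1 MB stand-in for the 141 MB PDF
    let value = serde_json::to_value(&bytes).unwrap();
    if let Value::Array(arr) = &value {
        println!(
            "1 MB of bytes -> {} Values (~{} MB for the array alone)",
            arr.len(),
            arr.len() * size_of::<Value>() / 1_000_000
        );
    }
}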

To be precise, the delay + memory usage happens here: https://github.com/tauri-apps/tauri/compare/dev...triage/ram-usage#diff-f5b23dca1c1951b8661b16b558998059ed8f2029ec6806392dcd430ddec82f02R472. Because Vec<u8> implements Serialize, this path will always be used. I believe we can't solve this with stable Rust in that code location, but hopefully I'm just being stupid rn. Otherwise I hope @lucasfernog remembers enough about the IPC to solve this before it gets to that function, because I don't 😅

Afterwards we should ideally look into #5641, mainly because of https://github.com/serde-rs/json/issues/635. I didn't look into that part yet, so I don't know if we actually need serde_json::Value for anything, but at first glance it sounds weird.

lucasfernog commented 7 months ago

I'm trying something (90% there, let's see if the futures won't bite me).

lucasfernog commented 7 months ago

I wanted to use the specialization stuff that @chippers wrote (https://github.com/tauri-apps/tauri/compare/refactor/ipc-response?expand=1). It's easy to use autoref specialization for sync commands that return Vec<u8>, but it gets tricky for Result<Vec<u8>> and async commands. One thing we could do is parse the command return type at the macro level and correctly determine which async_kind to use, but I don't know if there's a better way.
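For reference, the basic autoref trick looks roughly like this (a toy sketch with made-up names, not the actual macro code). The catch is that the specialized impl matches Vec<u8> exactly, so Result<Vec<u8>> and async return types fall through to the generic path:

struct Wrap<T>(T);

// Specialized path: picked by method resolution when the wrapped value
// is exactly Vec<u8>, because it matches without any autoref.
trait RawBody {
    fn respond(self) -> &'static str;
}
impl RawBody for Wrap<Vec<u8>> {
    fn respond(self) -> &'static str {
        "raw body"
    }
}

// Fallback path: implemented on &Wrap<T>, so it only matches after one
// autoref and loses to the impl above for Vec<u8>.
trait JsonBody {
    fn respond(self) -> &'static str;
}
impl<T: serde::Serialize> JsonBody for &Wrap<T> {
    fn respond(self) -> &'static str {
        "json body"
    }
}

fn main() {
    assert_eq!(Wrap(vec![0u8; 4]).respond(), "raw body");
    assert_eq!(Wrap("hello").respond(), "json body");
    // Result<Vec<u8>, String> is not Vec<u8>, so it takes the json path
    // unless yet another specialized impl is added:
    assert_eq!(Wrap(Ok::<Vec<u8>, String>(vec![])).respond(), "json body");
}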

The idea from @FabianLars is good, but it still has the overhead of going through serde.

lucasfernog commented 7 months ago

The easy solution: document that people should use the tauri::ipc::Response type when returning a Vec<u8>.
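i.e. roughly this (a minimal sketch against the current beta API, error handling omitted; MagicMajky posts a fuller version below):

use tauri::ipc::{InvokeBody, Response};

#[tauri::command]
fn read_file_raw(path: &str) -> Response {
    // Response ships the bytes over the raw IPC body instead of
    // serializing each byte through serde_json.
    let data = std::fs::read(path).unwrap(); // sketch: error handling omitted
    Response::new(InvokeBody::Raw(data))
}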

tance77 commented 6 months ago

@lucasfernog @MagicMajky I have a similar issue, and I do not know if it is the same, but I was able to use TestLimit with the -d flag to cause a memory leak, then have my Tauri app run a resource-intensive event over the IPC channel 10-15 times (ballparking it, could have been fewer) just for debugging purposes. Then the Windows OS crashed the WebView2.

The app didn't panic or anything; it just crashed.

WalrusSoup commented 6 months ago

To add to this: we have a handful of users experiencing crashes, which present as a wild variety of signatures in Windows Error Reporting:

Sig[6].Name=Exception Code
Sig[6].Value=c000001d
FriendlyEventName=Stopped working
ConsentKey=APPCRASH

Sig[6].Name=Exception Code
Sig[6].Value=c0000374
Sig[7].Name=Exception Offset
Sig[7].Value=PCH_E4_FROM_ntdll+0x000000000009FEB4

Sig[6].Name=Exception Code
Sig[6].Value=c0000005
Sig[7].Name=Exception Offset
Sig[7].Value=000000000000faf4

However, the user was able to produce a Webview2 crash minidump:

Key  : Failure.Bucket
Value: APPLICATION_FAULT_e0000008_msedgewebview2.exe!partition_alloc::internal::OnNoMemoryInternal
000000f4`cadfe2f0 00007ff7`4d464d42     : 00000000`00000048 00004eb4`0024dbd0 00000000`00000108 00007ff7`4d25b2a8 : KERNELBASE!RaiseException+0x6c
000000f4`cadfe3d0 00000000`00000048     : 00004eb4`0024dbd0 00000000`00000108 00007ff7`4d25b2a8 00000000`00070000 : msedgewebview2!partition_alloc::internal::OnNoMemoryInternal+0x22
000000f4`cadfe3d8 00004eb4`0024dbd0     : 00000000`00000108 00007ff7`4d25b2a8 00000000`00070000 00007ff7`4d464d59 : 0x48
000000f4`cadfe3e0 00000000`00000108     : 00007ff7`4d25b2a8 00000000`00070000 00007ff7`4d464d59 000000f4`cadfe410 : 0x00004eb4`0024dbd0
000000f4`cadfe3e8 00007ff7`4d25b2a8     : 00000000`00070000 00007ff7`4d464d59 000000f4`cadfe410 00000213`8ba60140 : 0x108
000000f4`cadfe3f0 00000000`00070000     : 00007ff7`4d464d59 000000f4`cadfe410 00000213`8ba60140 000000f4`cadfe558 : msedgewebview2!crashpad::ProcessMemoryWin::ReadUpTo+0x78
000000f4`cadfe3f8 00007ff7`4d464d59     : 000000f4`cadfe410 00000213`8ba60140 000000f4`cadfe558 00000000`00000000 : 0x70000
000000f4`cadfe400 000000f4`cadfe410     : 00000213`8ba60140 000000f4`cadfe558 00000000`00000000 00000213`8d700e00 : msedgewebview2!partition_alloc::TerminateBecauseOutOfMemory+0x9
000000f4`cadfe408 00000213`8ba60140     : 000000f4`cadfe558 00000000`00000000 00000213`8d700e00 00007ff7`4d464d75 : 0x000000f4`cadfe410
000000f4`cadfe410 000000f4`cadfe558     : 00000000`00000000 00000213`8d700e00 00007ff7`4d464d75 00000000`01000000 : 0x00000213`8ba60140
000000f4`cadfe418 00000000`00000000     : 00000213`8d700e00 00007ff7`4d464d75 00000000`01000000 00000213`8d7000bf : 0x000000f4`cadfe558

Which contains:

000000f4`cadfe400 000000f4`cadfe410     : 00000213`8ba60140 000000f4`cadfe558 00000000`00000000 00000213`8d700e00 : msedgewebview2!partition_alloc::TerminateBecauseOutOfMemory+0x9

As @tance77 mentioned above, we were able to simulate crashes when we eat up too much memory and then try to send IPC events. It's possible WebView2 is trying to allocate while the user does not have enough memory due to some leaky app, or that we are the leaky app ourselves.

We're trying to figure out exactly what our footprint is, but we haven't been able to find any memory leaks in our application. We may need to ship a small binary to monitor performance and memory usage system-wide so we can produce logs that help us figure out whether we're the problem or not. Either way, Tauri itself simply goes poof and it's gone.

MagicMajky commented 6 months ago

The easy solution: document that people should use the tauri::ipc::Response type when returning a Vec<u8>.

For v2 this was indeed a significant improvement; reading larger files no longer crashes. I tried to push it a little and read a ~600 MB file, which used ~4 GB of RAM on my machine. That is significantly better than before, when I couldn't even dream of reading a file this big, but I am still not sure 4 GB of RAM is necessary (maybe it is). Can this be further improved? Although for files this big it would probably be better to read them in a streaming manner rather than all at once.

If I wanted to read large files, let's say 1 GB+, and read them with streams, how could I achieve this from the frontend side?
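One way this could presumably work is pulling the file in pieces and looping from the frontend. A rough sketch with a hypothetical read_chunk command (not a built-in API; error handling elided):

use std::io::{Read, Seek, SeekFrom};
use tauri::ipc::{InvokeBody, Response};

// Hypothetical command: returns up to `len` bytes starting at `offset`.
// The frontend invokes it in a loop, advancing `offset` by the number of
// bytes it got back, until an empty chunk signals end-of-file.
#[tauri::command]
fn read_chunk(path: &str, offset: u64, len: usize) -> Response {
    let mut buf = vec![0u8; len];
    let n = std::fs::File::open(path)
        .and_then(|mut file| {
            file.seek(SeekFrom::Start(offset))?;
            file.read(&mut buf)
        })
        .unwrap_or(0); // error handling elided, as in the snippet below
    buf.truncate(n); // a chunk may come back shorter than `len`
    Response::new(InvokeBody::Raw(buf))
}

Calling something like that in a loop would keep peak memory at roughly one chunk instead of the whole file.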

For me, the solution so far is to use custom read functions instead of the provided Tauri functions. Here is the code I used when testing the 600 MB file.

use std::fs;
use tauri::ipc::{InvokeBody, Response};

#[tauri::command]
fn custom_read(path: &str) -> Response {
    match fs::read(path) {
        Ok(data) => {
            println!("File read successfully");
            Response::new(InvokeBody::Raw(data))
        }
        Err(_e) => {
            // not sure how to handle the error here
            Response::new(InvokeBody::Raw(vec![]))
        }
    }
}

And the corresponding frontend code:
async function customOpenFile() {
  const selected = await open({ multiple: false, directory: false });
  if (!selected?.path) {
    console.log("select cancelled");
    return;
  }
  console.log("Start reading");
  const content = await invoke("custom_read", { path: selected.path });
  console.log("Read successful");
  // do stuff with the file
}

This doesn't work on v1 since there is no tauri::ipc::Response, right @lucasfernog? Is there anything that can be done on v1 (other than migrating to v2)?

FabianLars commented 6 months ago

On v1, really the only option is to use a separate HTTP server.

The v1 IPC forces a JSON string even for binary data. The alternative (which is almost what we use in v2) is custom protocols, but in v1 they are sync, which means they will freeze the app while loading the data.


I do think the 4 GB usage is still concerning though, btw. I didn't try IpcResponse myself yet, but it's still weird that it would take more RAM than my super ugly solution that duplicates the data.
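For anyone stuck on v1, the separate-server approach can look roughly like this (a sketch using the tiny_http crate; the crate choice and port are assumptions, any HTTP server works). The webview's fetch then streams the response instead of funneling everything through the JSON IPC:

use std::{fs::File, thread};

// Spawn this once during app setup (e.g. from tauri::Builder's setup hook).
fn spawn_file_server() {
    thread::spawn(|| {
        // Port is arbitrary for this sketch.
        let server = tiny_http::Server::http("127.0.0.1:8543").unwrap();
        for request in server.incoming_requests() {
            // Mapping the URL straight to a path is for illustration only;
            // real code must validate it so arbitrary files aren't exposed.
            let path = request.url().trim_start_matches('/').to_string();
            match File::open(&path) {
                Ok(file) => {
                    // tiny_http streams any Read impl in chunks, so the
                    // whole file never sits in memory at once.
                    let _ = request.respond(tiny_http::Response::from_file(file));
                }
                Err(_) => {
                    let _ = request.respond(tiny_http::Response::empty(tiny_http::StatusCode(404)));
                }
            }
        }
    });
}

On the frontend, fetch("http://127.0.0.1:8543/...") then yields a response whose body can be consumed as a stream rather than one giant IPC payload.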