apache / cordova-plugin-file

Apache Cordova File Plugin
https://cordova.apache.org/
Apache License 2.0

Large file save freeze app for seconds #364

Open Rigwarl opened 4 years ago

Rigwarl commented 4 years ago

Bug Report

Problem

What is expected to happen?

No freeze or lag in rendering when saving a file.

What does actually happen?

On saving a large file (> 5 MB), the app freezes for a few seconds. This pauses rendering in my app, and animations start to lag and jump.

Information

Command or Code

const save = (data, filename) => {
  const onErrorHandler = e => console.error('file save error ', filename, e);

  window.resolveLocalFileSystemURL(cordova.file.externalDataDirectory, dirEntry => {
    dirEntry.getFile(filename, { create: true, exclusive: false }, fileEntry => {
      fileEntry.createWriter(fileWriter => {
        fileWriter.onwriteend = () =>
          console.log('Successful file write ', filename);
        fileWriter.onerror = onErrorHandler;
        fileWriter.write(data);
      }, onErrorHandler);
    }, onErrorHandler);
  }, onErrorHandler);
};


breautek commented 4 years ago

Where is data coming from and how are you serialising it?

I have an app where I write a rather large amount of data to a file, upwards of 50 MB in size, and I don't really see a huge freeze. FileWriter is asynchronous, so it should not be pausing the webview. In my app, the most expensive operation was JSON.stringify/JSON.parse.

Rigwarl commented 4 years ago

I download PNG files with XHR and save them without any processing. Yeah, I know it's asynchronous, so it's very strange.

breautek commented 4 years ago

When you download your png files, are you using blobs or are you referencing that data via strings?

Rigwarl commented 4 years ago

I download them as blobs.

breautek commented 4 years ago

Very strange indeed. If you can create a sample app that reproduces the issue, it might help move this ticket along, though this may very well be device-dependent.

Rigwarl commented 4 years ago

Hello again! I made a sample reproduction app: it downloads a 9 MB PNG file and saves it to disk, which causes around a 4-second lag on all devices I tested.

Click the load button to load the file, then click the save button to save it. =)

breautek commented 4 years ago

Thanks, I'll try to take a quick look at this this weekend and report what I find on my devices.

Rigwarl commented 4 years ago

Hi, any update on this?

breautek commented 4 years ago

My apologies, I've been really tied up with work trying to meet a deadline and I completely forgot about this.

I'll take a look now just to at least confirm that I can reproduce the issue, but I'm not sure when I'll be able to take a deep look.

Edit: Definitely can see the ~4 second pause when I click the "Save Image" button. I'll have to investigate further to see exactly where the bottleneck is, but I'm not sure when I'll have the time to do so since I've been doing some overtime at my workplace.

breautek commented 4 years ago

The line I found to block the UI thread for a significant amount of time is:

https://github.com/apache/cordova-android/blob/c35a990c09f1ab278873754eb7f81e52250893a3/cordova-js-src/exec.js#L87

args[i] = base64.fromArrayBuffer(args[i]);

When blobs are used, it reads the blob as an ArrayBuffer, then converts the array buffer to a base64-encoded string. I assume it does this because the JavascriptInterface, I believe, can only accept strings. Unfortunately, based on some brief googling around, I don't think there is any way around that.

A weak workaround, since the plugin in question is deprecated: you can try using the file transfer plugin, which lets you download a file and save it to the file system entirely on the native side. I still use the file transfer plugin in my own apps and it still works, but it is a plugin that Cordova no longer provides updates or support for.

Possible ideas for solutions:

Cordova doesn't support web workers, but I think this would be a good case where they would be beneficial, at least to move the bridge exec calls off the UI thread. It won't improve performance, but at least it would make exec non-blocking.

Another idea is to bring the concept of file transfers into the file plugin: downloading and saving to the filesystem could be handled completely on the native side, avoiding binary data in the JavaScript environment.

Rigwarl commented 4 years ago

My knowledge of web workers is very limited.

I think I can't make Cordova calls from a worker, so I need to pass my file to the worker, convert it to base64 there, pass it back to the main thread, and call save()?

breautek commented 4 years ago

> I think I can't make Cordova calls from a worker

You are correct, this is something that cordova doesn't currently support.

> so I need to pass my file to the worker, convert it to base64 there, pass it back to the main thread, and call save()

You can try this, but I'm not certain it will help, as there is a bottleneck in copying the data to and from the worker.

If you're downloading a file and saving it to disk, your best/quickest workaround is probably the File Transfer plugin, despite it being deprecated, because it downloads the file and saves it to the filesystem entirely on the native side.

The primary reason the transfer plugin is deprecated, to my understanding, is that it is possible to download binary files using XMLHttpRequest (though then you can run into performance/efficiency issues like the ones you are currently experiencing). Despite the plugin being deprecated, I still use it to upload large files to my server, and that part at least is still working for me.

digaus commented 4 years ago

@breautek

We can split the blob into chunks and write the file in parts:

    private async convertToBase64Chunks(blob: Blob, size: 3 | 6 | 9 | 12 | 15 | 18, writeChunk: (value: string, first?: boolean) => Promise<void>): Promise<void> {
        const chunkSize: number = 1024 * 1024 * size;
        const blobSize: number = blob.size;
        while (blob.size > chunkSize) {
            const value: string = await this.convertToBase64(blob.slice(0, chunkSize));
            await writeChunk(blobSize === blob.size ? value : value.split(',')[1], blobSize === blob.size);
            blob = blob.slice(chunkSize);
        }
        const lastValue: string = await this.convertToBase64(blob.slice(0, blob.size));
        await writeChunk(lastValue.split(',')[1], blobSize === blob.size);
        blob = blob.slice(blob.size);
    }

    private convertToBase64(blob: Blob): Promise<string> {
        return new Promise((resolve: (base64: string) => void, reject: () => void): void =>  {
            let reader: FileReader = new FileReader();
            const realFileReader: FileReader = reader['_realReader'];
            if (realFileReader) {
                reader = realFileReader;
            }
            reader.onerror = (err: any): void => {
                console.log(err);
                reject();
            };
            reader.onload = (): void => {
                resolve(reader.result as string);
            };
            reader.readAsDataURL(blob);
        });
    }

writeChunk is the asynchronous callback where we can call the write method of the plugin. first indicates the first part, which also carries the data-URL encoding prefix (but I think that prefix is dropped anyway).

I implemented this in a Capacitor project and it works very nicely:

            await this.convertToBase64Chunks(blob, chunkSize, async (value: string, first: boolean): Promise<void> => {
                if (first) {
                    await Filesystem.writeFile({
                        path: this.folder + '/' + fileName,
                        directory: FilesystemDirectory.Data,
                        data: value,
                    });
                } else {
                    await Filesystem.appendFile({
                        path: this.folder + '/' + fileName,
                        directory: FilesystemDirectory.Data,
                        data: value,
                    });
                }
            });

Edit: 20 MB took around 2 seconds to write.

maksymov commented 4 years ago

I have an <input type='file'> in my form. Now I want to save the selected file in LocalFileSystem.PERSISTENT.

My code for that:

self.requestFileSystem(LocalFileSystem.PERSISTENT, 0, function (fs) {
  fs.root.getFile(file_name, { create: true, exclusive: false }, function (fileEntry) {
    var reader = new FileReader();
    reader.onloadend = function () {
      fileEntry.createWriter(function (fileWriter) {
        fileWriter.onwriteend = function () {};
        fileWriter.onerror = function (e) {};
        dataObj = new Blob([reader.result], { type: 'text/plain' });
        // freeze here
        fileWriter.write(dataObj);
        postMessage('ok');
      });
    };
    reader.readAsDataURL($(file_id)[0].files[0]);
  }, onErrorCreateFile);
}, onErrorLoadFs);

But this code freezes my UI on the device (iOS, Android) for up to 10 seconds when the file is 30 MB. I discovered the world of Web Workers, but there is no access to LocalFileSystem from them.

How can I save the selected file (<input type='file'>) to my LocalFileSystem without UI freezes?

HarelM commented 3 years ago

I have a similar issue when downloading a large file (100 MB) on iOS, but in my case the app completely freezes and goes to a white screen of death. Unfortunately this doesn't happen on all devices, just some, which makes it very hard to narrow down the root cause. I'll try the File Transfer plugin and see if it helps. If anyone has found a workaround besides the unsupported plugin, let me know...

ssmithereens commented 3 years ago

We are also hitting this issue when downloading large video files (> 25 MB) on iOS. Even worse, WKWebView decides that the page is unresponsive and reloads it, terminating our download task before it completes successfully.

ssmithereens commented 3 years ago

We tried the chunking method described above, but unfortunately WKWebView still eventually decides that the page is unresponsive. We experimented with chunk sizes from 512 KB to 15 MB.

HarelM commented 3 years ago

The file transfer plugin was somewhat "undeprecated": it was upgraded to work with the latest Cordova and shows good performance and results. It solved most of the issues my app was experiencing, and it can also download while the app is not in the foreground, which is a nice plus. I recommend using it.

ssmithereens commented 3 years ago

> The file transfer plugin was somewhat "undeprecated": it was upgraded to work with the latest Cordova and shows good performance and results. It solved most of the issues my app was experiencing, and it can also download while the app is not in the foreground, which is a nice plus. I recommend using it.

We just tested the file transfer plugin and it works perfectly for us. Thank you.

LightMind commented 3 years ago

We were running into some of the same issues. We download files via XMLHttpRequest and write the contents via a FileWriter. Writing files larger than a few megabytes would consistently crash the whole application. We switched to writing the files in chunks, which alleviated the problem, but I was still confused why writing files smaller than 50 MiB would result in crashes or white screens.

I profiled with Chrome's debugging tools and found that the garbage collector is running like crazy and the function uses a lot of memory. You can see in the attached screenshot that, while invoking .writeFile, the memory usage goes from 11 MiB to about 40 MiB, even though we are writing 1 MiB chunks of binary data.

As already mentioned by @breautek, this part in cordova is really slow: args[i] = base64.fromArrayBuffer(args[i]);

I wanted to understand what is happening here, so I took a look at the base64 module in cordova-common:

function uint8ToBase64 (rawData) {
    var numBytes = rawData.byteLength;
    var output = '';
    var segment;
    var table = b64_12bitTable();
    for (var i = 0; i < numBytes - 2; i += 3) {
        segment = (rawData[i] << 16) + (rawData[i + 1] << 8) + rawData[i + 2];
        output += table[segment >> 12];
        output += table[segment & 0xfff];
    }
    if (numBytes - i === 2) {
        segment = (rawData[i] << 16) + (rawData[i + 1] << 8);
        output += table[segment >> 12];
        output += b64_6bit[(segment & 0xfff) >> 6];
        output += '=';
    } else if (numBytes - i === 1) {
        segment = (rawData[i] << 16);
        output += table[segment >> 12];
        output += '==';
    }
    return output;
}

Ouch. JavaScript has immutable strings, so using += to concatenate the output creates new strings in memory all the time. This means that to convert n bytes to base64 we create at least 2n/3 new strings, each bigger than the last ("AA", "AABB", "AABBCC", ... all the way to the full string), which ends up costing O(n^2) bytes of allocation in total.

Note that to write just 1 MiB, you would cumulatively allocate several gigabytes of memory!

This explains the memory usage and why writing binary files is slow: the garbage collector needs to clean up millions of strings, and if it can't keep up, we run out of memory.
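For illustration (this is not the plugin's actual code): the same encoding loop stops being quadratic if the pieces are collected and joined once, e.g. by encoding fixed-size slices with btoa:

```javascript
// Sketch of a non-quadratic encoder: encode fixed-size slices and join once.
// 0x6000 (24576) is a multiple of 3, so no '=' padding appears mid-stream.
function uint8ToBase64Joined(rawData) {
    const CHUNK = 0x6000;
    const parts = [];
    for (let i = 0; i < rawData.length; i += CHUNK) {
        const slice = rawData.subarray(i, i + CHUNK);
        // btoa expects a latin1 string; each byte maps to one char code
        parts.push(btoa(String.fromCharCode.apply(null, slice)));
    }
    return parts.join('');
}
```

Total allocation is now proportional to the input size instead of its square, because each slice's string is built once and the final join happens a single time.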

This not only affects writing files, but ANY cordova plugin that wants to pass binary data from JS to the native layer.

The conversion via FileReader.readAsDataURL that @digaus used is infinitely better than the cordova-common algorithm.

I thought it would be reasonable to support FileWriter.write(buffer: ArrayBuffer), and let it do the chunking and conversion to base64, to bypass the cordova algorithm. Then users of this plugin would not need to implement custom chunking code.

So I started to change the FileWriter.js code. The files are written to disk, but I must have made a mistake somewhere, because the files seem corrupted ( I have removed the prefix from the readAsDataURL result ).

I am also not sure why @digaus's solution works. When I pass a string to FileWriter.write(), shouldn't it write the string as-is to disk, instead of interpreting it as binary data encoded in base64? In www/FileWriter.js, we need to pass isBinary = true here:

}, 'File', 'write', [this.localURL, data, this.position, isBinary]);

In order to get LocalFileSystem.java writeToFileAtURL() to decode it to binary data.

Does FileSystem.writeFile behave differently?

I will spend some more time on this. I would really like to solve this issue and make a PR, once I get this to work.

LightMind commented 3 years ago

I have managed to change FileWriter, such that it can convert ArrayBuffers to base64 encoded strings itself.

To get the correct base64-encoded string, it is important to call FileReader.readAsDataURL in the right way:

fileReader.readAsDataURL(
    new Blob([arrayBuffer], {
        type: 'application/octet-binary'
    })
);

Otherwise you would encode the string representation of the ArrayBuffer, which is not at all what you want. Link to the changed file

I have used this to write files of 1 MiB and smaller, and the performance looks better so far. In particular, the memory usage for each file or chunk is only about twice the file/chunk size, and there is less garbage collection happening.

I will test this with bigger files and measure the timing in comparison to the original version, to see if it actually performs better, and implement chunking, if that seems necessary.

timbru31 commented 3 years ago

Thanks for the detailed analysis and potential fix(es) @LightMind - looking forward to hearing more from you.

LightMind commented 3 years ago

I have implemented the chunking behaviour in FileWriter.js. Chunking seems pretty necessary, because even though we skip the args[i] = base64.fromArrayBuffer(args[i]); part, cordova.js still JSON-stringifies all the params we want to pass to the native layer.

This does not go so well for several megabyte of data.

I tried with chunks of 128KiB, 1MiB, and 5MiB.

Larger chunks are faster to write, with less downtime between writes, but the memory spikes are worse. The memory usage is mostly due to the JSON encoding, as far as I can debug. When writing one chunk of 5 MiB, my Android phone would use about 30 MiB. Speeds were around 4.5 to 4.8 MiB/second (measured from calling FileWriter.write to the success callback). Writing 5 MiB chunks is shown in the attached screenshot.

When writing 1 MiB chunks, there is more downtime but much less memory pressure: spikes of less than 10 MiB of memory usage, and a write speed between 4.2 and 4.5 MiB/second. So, slightly slower, but more stable.

Writing smaller chunks does not make sense, as the downtime between writes costs too much; at 128 KiB chunks the write speed is halved.
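The chunking loop being discussed can be sketched independently of the plugin. Here writeChunk is a hypothetical callback that would wrap FileWriter.write (creating/truncating the file on the first chunk and appending afterwards):

```javascript
// Sketch of a generic chunking driver: slice a Blob into fixed-size pieces
// and hand each to an async writeChunk(slice, isFirst) callback in order.
async function writeInChunks(blob, chunkSize, writeChunk) {
    for (let offset = 0; offset < blob.size; offset += chunkSize) {
        const slice = blob.slice(offset, offset + chunkSize);
        await writeChunk(slice, offset === 0); // first chunk creates the file
    }
}
```

Awaiting each chunk keeps only one slice's worth of data in flight at a time, which is what bounds the memory spikes described above.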

I have some questions, but I'll take that when I make a PR, which I will do soon.

kputh commented 2 years ago

If this is an issue with cordova-common, did you report it? I'd like to see this issue's status in the repo that owns it.

erisu commented 2 years ago

@kputh This can't be an issue in cordova-common; cordova-common is not bundled with any app.

From the content of this thread, it sounds like you might mean cordova-js. This file specifically: https://github.com/apache/cordova-js/blob/master/src/common/base64.js

raphinesse commented 2 years ago

I filed a PR in cordova-js that tries to improve the performance for big files as much as is possible with the current architecture. However, I think the only solid solution to this problem is to not process big files in the browser's main thread.

It seems the file-transfer-plugin caters well to the common use case of simply downloading and storing files. Chunked reading and writing seems to be a reasonable alternative if you really need to transfer a lot of bytes over the web/native bridge. However, I'm not quite sure if we should promote that by including it into this plugin.

One thing that we should definitely consider for this plugin is the move from the synchronous base64.fromArrayBuffer to using FileReader.readAsDataURL to convert files to Base64 strings.

@LightMind Thanks so much for your detailed analysis of the problem. You really pushed this issue forward!

LightMind commented 2 years ago

@raphinesse Thank you so much, I really appreciate it :)

I am also a little bit worried about changing the behaviour of FileWriter to always use chunks, because it can potentially cause problems with partial writes. Writing to a file feels to me like something that should be atomic, if possible.

This makes me unsure about if or how to continue https://github.com/apache/cordova-plugin-file/pull/461

Would it be reasonable to provide a ChunkedFileWriter ( And maybe a ChunkedFileReader )? It could implement the chunked writing behaviour on top of FileWriter.

Something like

fileEntry.createWriter(async (fileWriter) => {
    var chunkedWriter = new ChunkedFileWriter(fileWriter);
    chunkedWriter.setChunkSize(2 ** 15);

    chunkedWriter.onwriteend = function () {
        // ...
    };
    chunkedWriter.write(blob);
});

raphinesse commented 2 years ago

@LightMind First off: sorry for the late reply.

> Writing to a file feels to me like something that should be atomic, if possible.

Agreed. Maybe a ChunkedFileWriter would delete the whole file if a chunk write fails?

Would it be reasonable to provide a ChunkedFileWriter ( And maybe a ChunkedFileReader )? It could implement the chunked writing behaviour on top of FileWriter.

If we want a chunked writer in this plugin, I think that would be the way to go.

Another interesting approach can be seen in capacitor-blob-writer. It streams the data out of the WebView by means of network requests. Seems more reasonable than pushing everything over the bridge in base64 form. Moreover, this approach seems to grant great performance. A pity that we actually need a local server running on the device for that.