pionl / laravel-chunk-upload

The basic implementation for chunked uploads with support for multiple providers, such as jQuery-File-Upload, Plupload, DropZone and resumable.js
MIT License

Server CPU High Usage on file upload. #72

Closed guigoebel closed 2 years ago

guigoebel commented 5 years ago

I'm using this lib in my project and I see high CPU usage when uploading a file (70% usage for a single file). Could this be a problem in production, or am I using the library incorrectly?

nerg4l commented 5 years ago

I would suggest trying the following solutions:

  1. Increase the size of the uploaded chunks. Maybe they are too small.
  2. In Pion\Laravel\ChunkUpload\FileMerger, try increasing the buffer size from 4096 bytes.
  3. In Pion\Laravel\ChunkUpload\FileMerger, try using stream_copy_to_stream instead of the while loop (a sketch follows below).
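
A minimal sketch of suggestion 3, assuming the merger appends each chunk to the partial file via streams (the variable names here are illustrative, not the library's exact internals):

// Open the partial upload for appending and the new chunk for reading.
$destination = fopen($finalFilePath, 'ab');
$chunk = fopen($chunkFilePath, 'rb');

if ($destination === false || $chunk === false) {
    throw new \RuntimeException('Unable to open the chunk or destination file.');
}

// Let PHP copy the stream in large internal blocks instead of a 4096-byte fread/fwrite loop.
stream_copy_to_stream($chunk, $destination);

fclose($chunk);
fclose($destination);
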
pionl commented 4 years ago

Hi @guigoebel, did the buffer size or stream_copy_to_stream help? We can make a change.

DanielApodaca commented 4 years ago

Hi, I think my server has similar issues. On my local system everything works fine, but when I deploy to my shared hosting on GoDaddy (2-core CPU and 1 GB of RAM) it crashes when it gets to the last chunk of a big file: a 500 MB file hangs and throws a 503 error, while a 20 MB file does work but takes 30 seconds or so to merge all the chunks, and the bigger file crashes after 6 minutes (with 50 MB chunks). I've played with the chunk size for the last 2 days (1 MB, 2 MB, 5, 10, up to 50 and 100 MB) but it still crashes. Do you have an example of using stream_copy_to_stream?

I'm going to try using a bigger buffer size first (8192 didn't work).

Last chunk sent:

dzuuid: 831cca30-cb90-4c49-bbdb-40023ad5a728
dzchunkindex: 469
dztotalfilesize: 469581690
dzchunksize: 1000000
dztotalchunkcount: 470
dzchunkbyteoffset: 469000000
file: (binary)

Controller:

public function upload(Request $request)
{
    // Create the file receiver for the "file" field, resolving the handler from the request.
    $receiver = new FileReceiver('file', $request, HandlerFactory::classFromRequest($request));

    if ($receiver->isUploaded() === false) {
        throw new UploadMissingFileException();
    }

    // Receive the current chunk (or the finished file).
    $save = $receiver->receive();

    if ($save->isFinished()) {
        return $this->saveFileToS3($save->getFile());
    }

    // Not finished yet: report the upload progress back to the client.
    /** @var AbstractHandler $handler */
    $handler = $save->handler();

    return response()->json([
        'done' => $handler->getPercentageDone(),
        'status' => true
    ]);
}

protected function saveFileToS3($file)
{
    $videoName = strtolower(str_replace(' ', '-', uniqid('video_test_') . '.mp4'));

    // Store a record pointing at the merged file on local disk.
    $video = new Video;
    $video->status = 2;
    $video->filename = $videoName;
    $video->type = 'Indoor';
    $video->chunk_name = $file->getFilename();
    $video->chunk_path = $file->getPathname();
    $video->instructor_name = 'Speed';
    $video->save();

    // Move the file to S3 asynchronously via a queued job.
    UploadVideoToCloud::dispatch($video->id)->onQueue('jobs');

    return response()->json($video, 200, [], JSON_NUMERIC_CHECK);
}

Queued job to move the file from Laravel to S3:

public function handle()
{
    $video = Video::find($this->id);

    // Stream the merged file to S3 instead of loading it into memory.
    Storage::disk('s3')->put($video->filename, \fopen($video->chunk_path, 'r'), 'public');

    // Remove the local copy once the upload succeeds.
    unlink($video->chunk_path);

    $video->status = 0;
    $video->save();
}
shaunclark5649 commented 4 years ago

I am having the exact same issue with our platform. It works fine with 50 MB files, but 200 MB video files seem to cause the issue. We did a manual merge of the file chunks with cat file1.chunk file2.chunk file3.chunk > finalFile.mp4 and this merged the file with almost zero memory and CPU footprint.

Without having looked at the actual file merger, I'm not sure where we would apply this or similar logic, as the merge seems to be done automatically within the library.

@pionl any help would be greatly appreciated.

Cheers

DanielApodaca commented 4 years ago

Hi there, here is the file that manages the merge, I think: https://github.com/pionl/laravel-chunk-upload/blob/master/src/FileMerger.php

> Without looking at the actual file merger, I'm not sure where we would apply this or similar logic as it seems to be done automatically within the library.

I honestly ended up moving my client project to AWS, but even there large files (2 GB) take about 30 seconds to merge, and the server has a default execution limit of 60 seconds. I'm not sure if it's possible to optimize the merge; it looks like just a simple while loop.

shaunclark5649 commented 4 years ago

I am currently testing something that might be of use to you; I will post an update with the result very soon. Hopefully it should let you reduce your stack and perform much leaner merges, without such a large memory footprint and a lot faster than 30 seconds.

@DanielApodaca

Cheers

shaunclark5649 commented 4 years ago

So after hours and hours of testing and headaches, I came to a better solution, for our use case at least.

We are using resumable.js for our uploading, just FYI.

We set the chunk size at 75 MB per chunk for desktop and 25 MB per chunk for mobile, fire our uploads as normal, and the chunks upload respectively.

Once our video files have uploaded, we send the user a process_id, store it in the database, and display a message showing the user that the video is being processed.

Merging the files literally takes no time at all and, to quote our log, [2020-08-11 14:35:17]: [INFO]: Memory peak usage: 22 MB to merge a 2.5 GB file from chunks.

$command = "cat"; for ($i = 1; $i <= $process->total_chunk_amount; $i++) { $fileLocation = '/chunks/'. $process->filename . '/'. $process->filename. '.' . $i . '.chunk'; if(!Storage::disk('local')->exists($fileLocation)) { Storage::deleteDirectory('/chunks/'. $process->filename); } $command = $command . ' ' . storage_path('app') . $fileLocation; } $finishedFilePath = '/chunks/'. $process->filename . '/' . $process->filename . '.' . $process->extension; $command = $command . ' > ' . storage_path('app') . $finishedFilePath; $exec = shell_exec($command); $this->logAlert($exec); $file = $this->pathToUploadedFile(storage_path('app') . $finishedFilePath); $saveFile = (object)$this->saveFile($process, $file, '/chunks/'. $process->filename);

I am not 100% sure of the security risks (if any) in doing it this way, and it is still very much a work in progress, but it works and it works well.

Sorry for the delay; it's been a right headache doing this, but we have made some ground.

Just to finish the story: when the video is merged we upload it to S3 using a stream, and once complete we update the front end with a web-socket (already implemented for other things) and re-fetch the video from the correct location passed back in our response.

hope this helps

nerg4l commented 4 years ago

I can see a few problems with this approach.

A possible solution would be to have different merge drivers (or strategies). The default would be the currently implemented one, which uses native PHP functions to merge the files, with an additional driver that uses cat or type depending on the operating system and their availability.
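
A rough sketch of what such merge drivers could look like (these interface and class names are hypothetical, not part of laravel-chunk-upload):

interface MergeDriver
{
    // Append the chunk at $chunkPath to the end of the file at $targetPath.
    public function append(string $targetPath, string $chunkPath): void;
}

class NativeMergeDriver implements MergeDriver
{
    public function append(string $targetPath, string $chunkPath): void
    {
        // Pure PHP: copy the chunk onto the target using streams.
        $target = fopen($targetPath, 'ab');
        $chunk = fopen($chunkPath, 'rb');
        stream_copy_to_stream($chunk, $target);
        fclose($chunk);
        fclose($target);
    }
}

class CatMergeDriver implements MergeDriver
{
    public function append(string $targetPath, string $chunkPath): void
    {
        // Relies on cat being available on the host; escapeshellarg() guards the paths.
        shell_exec(sprintf('cat %s >> %s', escapeshellarg($chunkPath), escapeshellarg($targetPath)));
    }
}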

pionl commented 2 years ago

Honestly for "security" issues I would not use cat and focus on native solution. I've not experienced issues on larger files but the code is not running in production so I can't say it 100% true. hope it works.

Thanks for letting everybody. Closing