OSC / ood_core

Open OnDemand core library
https://osc.github.io/ood_core/
MIT License

FileExplorer: Large File/Directory Download Max Size Limits #680

Closed sync-by-unito[bot] closed 2 years ago

sync-by-unito[bot] commented 2 years ago

Similar to https://github.com/AweSim-OSC/osc-fileexplorer/issues/2 we should place a limit on the directory or file sizes we let the user download. Past a certain point, they should be using sftp.osc.edu.

Downloading via file explorer was done via this pull request: https://github.com/AweSim-OSC/osc-fileexplorer/pull/102

┆Issue is synchronized with this Asana task by Unito

sync-by-unito[bot] commented 2 years ago

➤ Eric Franz commented:

In Job Constructor we first try to determine the size of the directory using this command:

https://github.com/AweSim-OSC/osc-jobconstructor/blob/0dfce4b6db5cae188983b5a8cc477cf2f5e80b09/app/models/filesystem.rb#L57-L59
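A minimal sketch of that size check, along the lines of the linked `filesystem.rb`: shell out to `du` to estimate a directory's total size before offering it for download. The method name `directory_size_bytes` is hypothetical, and `du -sk` (portable 1 KiB blocks) is assumed rather than copied from the linked code.

```ruby
require "open3"

# Hypothetical helper: approximate the total on-disk size of `path` in
# bytes using `du -sk` (which reports 1 KiB blocks). Returns nil if `du`
# fails, e.g. when the path does not exist.
def directory_size_bytes(path)
  out, _err, status = Open3.capture3("du", "-sk", path.to_s)
  return nil unless status.success?

  # `du -sk` output looks like "8\t/some/path"; take the leading number.
  out.split(/\s+/).first.to_i * 1024
end
```

Shelling out to `du` keeps the traversal in a single fast native process instead of walking the tree in Ruby, at the cost of reporting disk usage rather than apparent file size.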

sync-by-unito[bot] commented 2 years ago

➤ Brian L. McMichael commented:

I've successfully downloaded a folder of approximately 5.5 GB using the streaming zip.

Currently downloading a folder of approximately 60 GB. Will update on progress.

sync-by-unito[bot] commented 2 years ago

➤ Eric Franz commented:

Please add the time it took you to download both! Thanks for the tests.

sync-by-unito[bot] commented 2 years ago

➤ Brian L. McMichael commented:

I didn't track the exact time, but I'm getting a consistent 4.5 Mb/s over Wi-Fi.

sync-by-unito[bot] commented 2 years ago

➤ Eric Franz commented:

Of course download time will largely depend on connection speed. I wonder how much CPU the app uses during the download of a directory versus the upload of a file.

sync-by-unito[bot] commented 2 years ago

➤ Brian L. McMichael commented:

This is going to take some more testing. I have successfully downloaded zip files up to 8 GB.

Resource usage during the process is minimal.

The problem is that the folder I am testing includes a ~50 GB zipped VM file, and the zip fails on this file. The rest of the folder's contents are accessible, but this large file isn't. So I'm not sure whether the server is choking on this particular file, or whether there's a limit to how much we can stream.

This needs to be determined so that we can decide whether the limits should be set based on file sizes or on folder sizes.
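If the limits end up being per-file rather than per-folder, the check would need to inspect individual files rather than just the directory total. A hypothetical helper (not part of the app; the name `largest_file_bytes` is invented for illustration) that finds the largest single file in a tree might look like:

```ruby
# Hypothetical helper: return the size in bytes of the largest single
# file under `root`, or 0 if the tree contains no regular files.
def largest_file_bytes(root)
  Dir.glob(File.join(root, "**", "*"))
     .select { |p| File.file?(p) }   # skip directories, symlink targets, etc.
     .map { |p| File.size(p) }
     .max || 0
end
```

A per-file limit based on this value and a per-folder limit based on the directory total could then be enforced independently, once testing shows which one the streaming zip actually trips over.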

sync-by-unito[bot] commented 2 years ago

➤ MorganRodgers commented:

We just had a user who managed to lock themselves out of OnDemand by triggering too many downloads on large directories (~12+ GB). I think setting a limit of, say, 10 GB is reasonable; for directories larger than that, refuse the download and direct the user to a dedicated SFTP client.
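The proposed policy could be sketched as a simple pre-download check. The 10 GiB cap, the `check_download` method, and the `DownloadDecision` struct are all assumptions for illustration, not part of the OnDemand codebase:

```ruby
# Hypothetical default cap from the suggestion above; in practice this
# should be configurable by the site administrator.
MAX_DOWNLOAD_BYTES = 10 * 1024**3 # 10 GiB

DownloadDecision = Struct.new(:allowed, :message)

# Decide whether a download of `size_bytes` should proceed, refusing
# anything over the limit and pointing the user at SFTP instead.
def check_download(size_bytes, limit: MAX_DOWNLOAD_BYTES)
  return DownloadDecision.new(true, nil) if size_bytes <= limit

  DownloadDecision.new(
    false,
    "This directory exceeds the #{limit / 1024**3} GB download limit; " \
    "please use a dedicated SFTP client instead."
  )
end
```

Running the check before starting the streaming zip also avoids the lock-out scenario above, since the refused request never ties up a download slot.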