avalonmediasystem / avalon

Avalon Media System – Samvera Application
http://www.avalonmediasystem.org/
Apache License 2.0

Improve or remove `lsof` file checking #6069

Open masaball opened 5 days ago

masaball commented 5 days ago

Description

In our batch process we call `lsof` to make sure the files are fully uploaded to the dropbox before being ingested. The same check is also used in the regular file upload flow so that only files that are fully processed will be shown in the dropbox listing.

The `lsof` check accounts for as much as 30-40% of the load time when rendering the file-upload-step page. Improving the efficiency of the check, deferring it until the dropbox is opened instead of running it at initial page load, or removing it entirely would significantly lower the load time.

Done Looks Like

joncameron commented 2 days ago

Could there be a check against modification times on files, and then run `lsof` only on the recently modified ones? This could be a way to speed things up.
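
A minimal sketch of that idea, to make the suggestion concrete. It is not Avalon's existing code: the module name `DropboxFileCheck`, the `ready_files` method, and the 60-second quiet period are all hypothetical, and it assumes a file whose modification time is older than the quiet period is no longer being written. Only files modified more recently are passed to `lsof`, which exits with status 0 when some process has the file open.

```ruby
require "open3"

# Hypothetical helper, not Avalon's actual implementation.
# Files untouched for at least `quiet_period` seconds are assumed complete;
# only recently modified files go through the slower open-handle check.
module DropboxFileCheck
  DEFAULT_QUIET_PERIOD = 60 # seconds; an assumed threshold, tune per deployment

  # Default open-handle check: shells out to lsof for a single file.
  # lsof exits with status 0 when at least one process has the file open.
  LSOF_CHECK = lambda do |path|
    _output, status = Open3.capture2e("lsof", "--", path)
    status.success?
  end

  # Returns the subset of `paths` that look safe to list or ingest.
  def self.ready_files(paths, quiet_period: DEFAULT_QUIET_PERIOD,
                       now: Time.now, open_check: LSOF_CHECK)
    paths.select do |path|
      recently_modified = (now - File.mtime(path)) < quiet_period
      # Skip the expensive check entirely for files that have been quiet.
      !recently_modified || !open_check.call(path)
    end
  end
end
```

Called as `DropboxFileCheck.ready_files(Dir.glob("/path/to/dropbox/*"))`, this would run `lsof` only for files changed in the last minute. The `open_check` parameter is injectable so the filter can be exercised without `lsof` installed. One caveat of this approach: an upload that stalls for longer than the quiet period would be treated as complete.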

elynema commented 2 days ago

If we remove the `lsof` check on batch ingest in general, files could be truncated when ingested or could error on encoding. If there is an error on encoding, then that will go out in a report to the user post-ingest. If the file is truncated, there's no error to report. However, we think that moving the original file to ArchiverSpool to be sent to SDA happens post-encoding, so by that time it is much more likely that the entire file has uploaded and the full version is sent to tape for safe storage.