In our FFprobe module it looks like we are accessing/downloading files multiple times. In the json_output method, we pass the file's location to the ffprobe command so that ffprobe can pull the file and analyze it. For internet-hosted materials (Google Drive, SharePoint, etc.) this results in a download.
Within that method we also call the valid_content_type? method, which uses the file's magic bytes to verify that it is audio/visual before feeding it to ffprobe. This runs through the reader and open_uri methods in our FileLocator service, so we download the file a second time as part of that flow.
I haven't verified this, but we are almost certainly downloading the file additional times to feed it into ActiveEncode or as part of the ActiveEncode process.
In order to minimize how much we are hitting external APIs and to improve efficiency within Avalon, we should be caching the file in some way so that we only download it from these external services once (or as few times as is practical) during upload processing.
The file location URI for a network resource is saved to the file location field in the masterfile model, and care should be taken not to overwrite this value with the temp file location.
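As a rough illustration of the idea, here is a minimal sketch of a download-once wrapper. Everything in it is hypothetical and not existing Avalon code: the CachedSource class, the fetcher callable (standing in for whatever actually pulls the bytes from the external service), and the fetch_count counter. It keeps the original URI separate from the temp file path so the value stored on the masterfile is never replaced by the temp location.

```ruby
require "tempfile"
require "uri"

# Hypothetical helper, not part of Avalon: fetches a remote file at most once
# and hands the same local copy to every consumer.
class CachedSource
  attr_reader :original_uri, :fetch_count

  # fetcher: any callable taking (uri, io) that writes the file's bytes to io.
  # It is an assumption standing in for the real download mechanism.
  def initialize(original_uri, fetcher)
    @original_uri = original_uri # stays untouched; this is what gets persisted
    @fetcher = fetcher
    @fetch_count = 0
    @tempfile = nil
  end

  # Path of the local copy. The download happens only on the first call.
  def local_path
    @tempfile ||= download
    @tempfile.path
  end

  # Close and delete the temp file once upload processing is finished.
  def cleanup
    @tempfile&.close!
    @tempfile = nil
  end

  private

  def download
    # Keep the original extension so downstream tools can still see it.
    file = Tempfile.new(["upload", File.extname(URI(original_uri).path)])
    file.binmode
    @fetcher.call(original_uri, file)
    file.flush
    @fetch_count += 1
    file
  end
end

# Usage with a stub fetcher in place of a real network download.
stub_fetcher = ->(_uri, io) { io.write("fake media bytes") }
source = CachedSource.new("https://example.com/media/sample.mp4", stub_fetcher)

header = File.binread(source.local_path, 4) # e.g. input to a magic-byte check
probe_target = source.local_path            # e.g. path handed to ffprobe
source.cleanup
```

Both consumers read from the same temp file, so the external service is hit once, while source.original_uri remains available for saving to the masterfile.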
Done Looks Like
[ ] Each file is downloaded via a Browse Everything request only once
[ ] Create a temp file and pass the location of the temp file into code flows