kernchen closed this issue 2 years ago
I never did extensive performance measurements on mediainfo.js, though I suspect it runs slower than the C program. When you took the measurements, what exactly did you measure? The pure processing time, or did you include the "warm-up", namely loading and instantiating the WASM file in the browser? I suspect the warm-up phase would take quite long.
Thank you for the prompt reply. I haven't checked the loading time for the WASM file. The test was probably not very scientific: I just measured from the point the file is opened until the mediainfo result was returned.
Happy to do some more measurements. Would you have a suggestion on how to measure just the "non warm-up" phase? We are not yet very familiar with the library, so apologies if I am not making full sense yet.
BTW, we've compared times on the mediainfo.js page, with our Angular implementation. In the worst case we might be loading the WASM file every single time when analysing files sequentially?
The MediaInfo object gives you an instance of mediainfo which is ready to be used. It asynchronously does the WASM loading in the background.
MediaInfo({ format: 'text' }, (mediainfo) => {
  // Here the mediainfo WASM file is loaded and ready to process data.
})
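To separate the warm-up from the actual analysis, each phase can be timed on its own. The following is a minimal sketch, not part of mediainfo.js: `timed` and `now` are hypothetical helper names, and the usage comment assumes the callback API shown above.

```javascript
// Hypothetical helper: returns a millisecond timestamp, preferring the
// high-resolution clock when the environment provides one.
const now = () =>
  typeof performance !== 'undefined' ? performance.now() : Date.now()

// Hypothetical helper: runs an async step, logs how long it took, and
// returns both the step's result and the elapsed milliseconds.
async function timed(label, fn) {
  const start = now()
  const result = await fn()
  const ms = now() - start
  console.log(`${label}: ${ms.toFixed(1)} ms`)
  return { result, ms }
}

// Usage sketch in the browser (assumes MediaInfo, getSize and readChunk exist):
// const { result: mediainfo } = await timed('warm-up', () =>
//   new Promise((resolve) => MediaInfo({ format: 'JSON' }, resolve)))
// await timed('analysis', () => mediainfo.analyzeData(getSize, readChunk))
```

Timing the two phases separately shows whether the WASM load or the file analysis dominates.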
BTW, we've compared times on the mediainfo.js page, with our Angular implementation. In the worst case we might be loading the WASM file every single time when analysing files sequentially?
Yes, you should probably only instantiate the WASM once and then re-use it from that point on.
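One way to make sure the instantiation happens only once is to cache the promise of the instance. This is a minimal sketch under assumptions: `once` is a hypothetical helper, and the commented usage wraps the callback API shown earlier in a Promise.

```javascript
// Hypothetical helper: wraps a factory so it runs at most once.
// Every later call returns the same cached value (here, a Promise).
function once(factory) {
  let cached = null
  return () => {
    if (cached === null) {
      cached = factory()
    }
    return cached
  }
}

// Usage sketch in the browser (assumes the MediaInfo function is loaded):
// const getMediaInfo = once(() =>
//   new Promise((resolve) => MediaInfo({ format: 'JSON' }, resolve)))
// const mediainfo = await getMediaInfo() // loads the WASM on first call only
```

In an Angular application the same effect could be had by holding the cached promise in a singleton service.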
Thanks @buzz. We'll double check this and try to make sure that we load the WASM file only once and report on any results we get after that. That might take a day or two.
I (the main developer of the MediaInfo library, used by mediainfo.js) tried with my own JS version, and JS-based analysis of a MOV/ProRes file is visually instantaneous.
Generally speaking, with up-to-date compilers and modern browsers we now expect a 2x-4x slowdown in JS versus native, not 100x :-p, which does not seem normal to us.
Thanks @JeromeMartinez and @buzz. We have done some more testing now. To that end, we used the existing 'browser-multiple' example from the repo and added some timing and log output. We used about 24GB of media files of different sizes from:
https://arriwebgate.com/en/directlink/e631dcb5b6ac8eb5
The updated code using momentjs for duration calculation is as follows:
const fileinput = document.getElementById('fileinput')
const output = document.getElementById('output')

// Reference start time, reset before each measurement
let lastDate = moment()

// Seconds elapsed since the reference time
const elapsedSeconds = () =>
  moment.duration(moment().diff(lastDate)).asSeconds()

function get_file_info(mediainfo, file) {
  const getSize = () => file.size
  const readChunk = (chunkSize, offset) =>
    new Promise((resolve, reject) => {
      const reader = new FileReader()
      reader.onload = (event) => {
        if (event.target.error) {
          // Return so that resolve() is not also called after a failed read
          return reject(event.target.error)
        }
        resolve(new Uint8Array(event.target.result))
      }
      reader.readAsArrayBuffer(file.slice(offset, offset + chunkSize))
    })

  // Reset reference time
  lastDate = moment()
  return mediainfo
    .analyzeData(getSize, readChunk)
    .then((result) => {
      // Log how long the analysis of this file took
      const fileSize = JSON.parse(result).media.track[0].FileSize
      console.log(elapsedSeconds() + ' for fileSize: ' + fileSize)
      // Display the outcome in the HTML page
      output.value = `${output.value}${result}`
    })
    .catch((error) => {
      output.value = `${output.value}\n\nAn error occurred:\n${error.stack}`
    })
}

async function onChangeFile(mediainfo) {
  // Reset reference time and print out the timestamp
  lastDate = moment()
  console.log(lastDate.format())
  output.value = ''
  // Analyse the selected files one after another with the same instance
  for (const file of fileinput.files) {
    await get_file_info(mediainfo, file)
  }
}

MediaInfo({ format: 'JSON' }, (mediainfo) => {
  // Log the time it takes for MediaInfo to initialise
  console.log(elapsedSeconds())
  fileinput.addEventListener('change', () => onChangeFile(mediainfo))
})
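As an aside, the chunk reader can be written more compactly with the standard `Blob.prototype.arrayBuffer()` method instead of `FileReader`. This is only a sketch of an alternative, not what was used for the measurements in this thread; `makeReadChunk` is a hypothetical name.

```javascript
// Hypothetical factory: builds a readChunk(chunkSize, offset) function for a
// File or Blob, using Blob.prototype.arrayBuffer() rather than FileReader.
function makeReadChunk(file) {
  return async (chunkSize, offset) => {
    // slice() does not copy data; arrayBuffer() reads only the sliced range
    const buffer = await file.slice(offset, offset + chunkSize).arrayBuffer()
    return new Uint8Array(buffer)
  }
}

// Usage sketch (assumes a mediainfo instance and a File from an <input>):
// mediainfo.analyzeData(() => file.size, makeReadChunk(file))
```

Read failures surface as a rejected promise, so no separate error check inside a callback is needed.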
The log output with timings for different file sizes in Chrome is as follows:
example.js:71 0.205 <--- MediaInfo loading duration, from Script execution
example.js:47 2021-06-10T10:08:10+01:00 <--- Test reference start time
example.js:33 3.221 for fileSize: 736916512
example.js:33 3.476 for fileSize: 800478024
example.js:33 11.794 for fileSize: 2693398212
example.js:33 12.384 for fileSize: 2985889981
example.js:33 12.861 for fileSize: 3186999999
example.js:33 12.835 for fileSize: 3044725381
example.js:33 13.318 for fileSize: 2891182248
example.js:33 22.501 for fileSize: 5113988900
example.js:33 13.408 for fileSize: 3124161135
We noticed that the speed depends a bit on how much is running on the CPU in the background. But this was a fairly clean run and looks somewhat consistent with several timed runs. This was run on a 2019 MacBook Pro.
@JeromeMartinez, we did also use MediaInfo Website for testing the 5GB file and we had about the same results in terms of times.
@buzz, the WASM loading time seems negligible, staying consistently under 0.5s
Would you have any more thoughts on this? It must have something to do with reading the file, as the time seems to grow linearly with the file size.
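The linear relationship can be checked by dividing each file size by its duration. The sketch below uses only the numbers from the Chrome log above; the variable names are illustrative.

```javascript
// [seconds, bytes] pairs copied from the Chrome log above
const runs = [
  [3.221, 736916512],
  [3.476, 800478024],
  [11.794, 2693398212],
  [12.384, 2985889981],
  [12.861, 3186999999],
  [12.835, 3044725381],
  [13.318, 2891182248],
  [22.501, 5113988900],
  [13.408, 3124161135],
]

// Effective throughput per run, in megabytes (10^6 bytes) per second
const throughputs = runs.map(([seconds, bytes]) => bytes / seconds / 1e6)

runs.forEach(([seconds, bytes], i) => {
  console.log(`${bytes} bytes in ${seconds} s: ${throughputs[i].toFixed(1)} MB/s`)
})

const minThroughput = Math.min(...throughputs)
const maxThroughput = Math.max(...throughputs)
console.log(`range: ${minThroughput.toFixed(1)} to ${maxThroughput.toFixed(1)} MB/s`)
```

All nine runs land between about 215 and 250 MB/s, which is what one would expect if the time is dominated by reading every byte of the file rather than by parsing a small header.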
I think I have spotted the issue. It looks like a seek request (for going to the end of the file and reading the "header" of the file) isn't handled (it is in the desktop version), so the whole file is read (and the data discarded) before reaching the expected file position. I can reproduce the issue, so the issue is in the MediaInfo library, not the mediainfo.js binding. I will try to fix that, but there is no ETA (I am late with tons of other issues).
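Whether the whole file is being read can be observed from the calling side by counting what the library requests through `readChunk`. This is a minimal sketch for diagnosis only; `instrumentReadChunk` and the shape of `stats` are hypothetical and not part of mediainfo.js.

```javascript
// Hypothetical wrapper: records every chunk request made through readChunk
// so the total bytes read can be compared with the file size afterwards.
function instrumentReadChunk(readChunk) {
  const stats = { calls: 0, bytes: 0, offsets: [] }
  const wrapped = async (chunkSize, offset) => {
    const chunk = await readChunk(chunkSize, offset)
    stats.calls += 1
    stats.bytes += chunk.length
    stats.offsets.push(offset)
    return chunk
  }
  return { wrapped, stats }
}

// Usage sketch (assumes mediainfo, getSize and readChunk from the example):
// const { wrapped, stats } = instrumentReadChunk(readChunk)
// await mediainfo.analyzeData(getSize, wrapped)
// console.log(stats.bytes, 'of', getSize(), 'bytes read in', stats.calls, 'calls')
```

If `stats.bytes` approaches the file size and the offsets increase without gaps, the file was read sequentially from start to end; a handled seek would instead show a jump in the offsets and far fewer bytes read.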
Thanks for your prompt reply @JeromeMartinez, once again. That is very helpful.
Just out of interest, would that be more of a general issue or just in the latest version? I.e., would a different version help temporarily resolve the issue?
Certainly interested to hear once you have had time to look into a fix.
would that be more of a general issue or just in the latest version? I.e., would a different version help temporarily resolve the issue?
I cannot be sure, but I doubt it; this is not something I modified recently.
Thanks once again @JeromeMartinez for now.
Hi @JeromeMartinez. I just wanted to check in to see whether there has been any progress on this. I certainly understand your previous statement that you have your hands full.
For planning purposes (we are a bit stuck at the moment with this one), is there any indication you could give us of when this could be looked at? Are there any alternatives that could help to expedite this issue? Happy to discuss offline, if there is anything we could do or help with. Please just bear in mind we are not familiar with the inner workings of the library at this point.
Any response appreciated.
Happy to discuss offline
Please contact us at info@mediaarea.net.
Fixed. Latest snapshots have the fix.
Hi @JeromeMartinez. I can now report that we built mediainfo.js from the latest snapshot and tested the fix: in a quick test, scanning is now near instant compared to the previous version.
We'll do some further performance testing as part of our project.
Guess next step for the mediainfo.js library will be to integrate with latest mediainfo version release that will include this fix.
Thanks a lot for your uncomplicated help on this one!
Could I ask which tag I should download to get the fixed version?
I downloaded the latest version (v0.1.6). It doesn't seem to have changed much.
I downloaded the latest version (v0.1.6). It doesn't seem to have changed much.
This is expected.
mediainfo.js uses the latest release version of libmediainfo (Version 21.03, March 26) but the fix is from Jul 14. The fix is not yet included in any release version.
The fix is not yet included in any release version.
Expected in a few days :-p.
Very much looking forward to it, thanks
mediainfo.js Version 21.09, 2021-09-17 looks to be released. @buzz can this be incorporated?
mediainfo v0.1.7 released.
Hi,
I've been doing some testing with regard to analysing media files, specifically larger ones, and compared times between the macOS CLI version and mediainfo.js in the browser on the same machine (Safari on macOS 10.15.7, but a colleague also tested on Chrome).
It seems to me that execution times for mediainfo.js are significantly longer, e.g., a ~763MB ProRes file takes ~3.5s to process in mediainfo.js vs. 0.036s in the CLI version.
Maybe I am not understanding something, but what would be the reason for this, and is there a way to speed this up in mediainfo.js?
We are basically trying to build a web tool to read metadata for multiple files in a browser, and our use case might not be viable if we do not manage to reduce the time it takes to analyse files for metadata.
Thank you for any thoughts or help in advance.