acdha closed this 9 years ago
It's actually still just creating one per bag ( https://github.com/loc-rdc/bagger-js/blob/introduce-react/assets/jsx/contents.jsx#L45 ). It's only hashing the first file at the moment since it's only being sent one message.
Ah, yes, when I was looking at the diff I thought that was in FileRow. It's good that it's already at the BagContents level since that's probably where it'll need to go to manage the pool sanely.
Would this also improve the UI performance? Right now I'm dropping in 39 MB across 2169 files/folders and the UI becomes unresponsive (can't scroll, can't click any of the buttons, etc.). After a while (I guess when the workers are done) everything works again as expected.
@johnscancella Possibly – that's definitely why all of the real work needs to be done in Web Workers. Simply updating the DOM isn't a problem until you get hundreds of thousands of elements, but we'll need to be really careful about when file contents are read into memory. If the refactoring which @eikeon did to pass in the File directly avoids that, it might be a major win.
Interesting, with ~22K files there's one unexpected hotspot: getting the file sizes!
We should keep an eye on that and, if necessary, refactor that work into the worker, since that's the only spot where we must open the file handle.
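To sketch what moving that work into the worker could look like: the worker would be the only place the file is read, so it can report both the size and the hash in one message. This is an assumption about the design, not the branch's actual code; the message shape (`action`, `name`, `size`, `hashHex`) and the use of `file.arrayBuffer()` / `crypto.subtle.digest` are illustrative.

```javascript
// Pure helper, kept separate so it's easy to unit-test outside a worker:
function toHex(arrayBuffer) {
    return Array.from(new Uint8Array(arrayBuffer))
        .map((b) => b.toString(16).padStart(2, '0'))
        .join('');
}

// Worker side (browser-only sketch; the File arrives via structured clone):
// self.onmessage = async (evt) => {
//     const file = evt.data.file;
//     const buffer = await file.arrayBuffer();  // contents only read here
//     const digest = await crypto.subtle.digest('SHA-256', buffer);
//     self.postMessage({
//         action: 'hashed',
//         name: file.name,
//         size: file.size,     // size reported from the worker, not the UI thread
//         hashHex: toHex(digest)
//     });
// };
```

The key point is that the main thread never touches file contents; it only receives the finished `{ size, hashHex }` message.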
There's also one recent regression – we need to set table-layout: fixed on the main table to avoid recalculating column widths each time we update the rows – but I was waiting on that until after making the hash display truncate nicely with a tooltip so you can still see the full value for those long SHA-256 hashes.
@johnscancella and @acdha I'm assuming these tests are against the main branch. It'd be nice to compare them to the react branch. Getting the file size has already been refactored into the worker in the react branch.
Previously we had one worker. In @eikeon's React branch, we have one per file. We should refactor this to have a limit so we'll spawn up to a certain number but wait for a worker to finish before starting another one.
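The limiting logic could look something like the sketch below: run at most `limit` tasks at once, starting the next one as each finishes. `spawnTask` is a placeholder (an assumption) for "create a Worker for this file and resolve when its result message arrives"; here it's just any function returning a promise.

```javascript
// Run spawnTask over items with at most `limit` in flight at once.
// Results come back in the original item order.
function runWithLimit(items, limit, spawnTask) {
    return new Promise((resolve, reject) => {
        const results = new Array(items.length);
        let next = 0;    // index of the next item to start
        let active = 0;  // tasks currently in flight
        let done = 0;    // tasks finished

        function launch() {
            while (active < limit && next < items.length) {
                const i = next++;
                active++;
                Promise.resolve(spawnTask(items[i]))
                    .then((result) => { results[i] = result; })
                    .catch(reject)
                    .finally(() => {
                        active--;
                        done++;
                        if (done === items.length) {
                            resolve(results);
                        } else {
                            launch();  // backfill the freed slot
                        }
                    });
            }
        }

        if (items.length === 0) {
            resolve(results);
        } else {
            launch();
        }
    });
}
```

With real workers, `spawnTask` would wrap `new Worker(...)` plus its first result message, so the pool never has more than `limit` workers alive.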
It would probably be good to measure startup performance to see whether we should attempt to reuse workers across files. For small files, the per-worker startup cost (script load plus JIT warm-up) might be enough to make reuse worth the modest additional effort.
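A rough way to get that number: time several spawn-and-first-reply round trips and average them. The helper below takes the spawn step as a callback (an assumption, so the timing logic itself can be tested anywhere); the commented browser usage shows how a real worker ping might plug in, with the worker script name being hypothetical.

```javascript
// Average the time (ms) of several "spawn and wait until ready" round trips.
async function timeStartup(spawnAndAwaitReady, runs = 5) {
    const times = [];
    for (let i = 0; i < runs; i++) {
        const start = Date.now();
        await spawnAndAwaitReady();
        times.push(Date.now() - start);
    }
    // Average over several runs to smooth out JIT warm-up noise.
    return times.reduce((a, b) => a + b, 0) / times.length;
}

// Browser usage (sketch):
// const avgMs = await timeStartup(() => new Promise((resolve) => {
//     const w = new Worker('hasher.js');  // worker script name is an assumption
//     w.onmessage = () => { w.terminate(); resolve(); };
//     w.postMessage({ action: 'ping' });
// }));
```

If the average is a large fraction of the per-file hashing time for small files, reusing workers across files is probably worth it.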