Background for this change includes high-performance scenarios where MAX_TABLE_DOWNLOAD_ROWS is one or more orders of magnitude larger than the default (10000). Forcing the batch size to 1000 incurs incredibly high overhead in such a scenario.
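For context, a rough sketch of what the batched read pattern looks like with the tables API from omero-py (connection setup elided; the function name and defaults are just for illustration, not the actual omero-web code). With the batch size forced to 1000, a table that is 10x or 100x the default MAX_TABLE_DOWNLOAD_ROWS means hundreds or thousands of read() round trips for a single download:

```python
# Rough sketch, not the actual omero-web code: reading an OMERO table in
# fixed-size batches. Assumes a connected BlitzGateway `conn` and an
# OriginalFile id `file_id`; batch_size mirrors the hard-coded 1000.
from omero.model import OriginalFileI

def read_in_batches(conn, file_id, batch_size=1000):
    resources = conn.getSharedResources()
    table = resources.openTable(OriginalFileI(file_id, False))
    try:
        n_rows = table.getNumberOfRows()
        col_indices = list(range(len(table.getHeaders())))
        for start in range(0, n_rows, batch_size):
            stop = min(start + batch_size, n_rows)
            # Each read() is a separate server round trip, so a 1,000,000 row
            # table read in 1000-row batches costs 1000 round trips.
            yield table.read(col_indices, start, stop)
    finally:
        table.close()
```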
I don't think the "lazy" loading is unused. The idea with the lazy loading is to remove (or allow much larger values for) the upper limit on the number of rows when you're downloading a table as CSV, since you're "streaming" the download.
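To illustrate what I mean by "streaming", here is a hedged sketch using Django's StreamingHttpResponse (not the actual omero-web view; `fetch_rows` is a placeholder for whatever yields row tuples in batches from the table service):

```python
# Illustrative sketch only: stream a table as CSV so the full row set never
# has to sit in memory at once.
import csv
import io

from django.http import StreamingHttpResponse

def stream_table_csv(column_names, fetch_rows):
    def generate():
        buf = io.StringIO()
        csv.writer(buf).writerow(column_names)
        yield buf.getvalue()
        for row in fetch_rows():
            buf = io.StringIO()
            csv.writer(buf).writerow(row)
            yield buf.getvalue()

    response = StreamingHttpResponse(generate(), content_type="text/csv")
    response["Content-Disposition"] = 'attachment; filename="table.csv"'
    return response
```

With something like this, the row limit only bounds how long the client is willing to wait, not how much has to fit in memory at once.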
Looking into this now, it looks like you can't actually download more than MAX_TABLE_DOWNLOAD_ROWS at a time, since this is blocked by the check_max_rows flag. But if we were to remove that check, it should allow larger amounts to be downloaded.
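Roughly, the gate looks something like this simplified sketch (everything apart from check_max_rows and MAX_TABLE_DOWNLOAD_ROWS is illustrative, not the actual omero-web code):

```python
# Simplified sketch of the row-limit gate being discussed.
def query_row_ids(table, query, max_rows, check_max_rows=True):
    # getWhereList(condition, variables, start, stop, step) is the tables API.
    row_ids = table.getWhereList(query, {}, 0, table.getNumberOfRows(), 1)
    if check_max_rows and len(row_ids) > max_rows:
        # With check_max_rows=True a download can never exceed
        # MAX_TABLE_DOWNLOAD_ROWS; dropping the check (or passing False for
        # CSV downloads) would lift that ceiling.
        raise ValueError(
            "Requested %d rows but the limit is %d" % (len(row_ids), max_rows)
        )
    return row_ids
```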
I guess the optimal batch size when downloading would depend on how much data you can load at a time (not sure what the limiting factor is here - Ice Memory Limit?) and how many columns you have in your table.
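As a back-of-the-envelope sketch (the bytes-per-cell figure and the memory budget are assumptions, e.g. staying comfortably under Ice.MessageSizeMax):

```python
# Illustrative only: pick a batch size so that one batch fits a memory budget.
def estimate_batch_size(n_columns, bytes_per_cell=8, budget_bytes=32 * 1024 * 1024):
    return max(1, budget_bytes // (n_columns * bytes_per_cell))

print(estimate_batch_size(10))   # ~400000 rows per batch for a narrow table
print(estimate_batch_size(500))  # ~8000 rows per batch for a wide one
```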
It would allow more rows to be downloaded but would still lock the table, keep the service and session alive, block the OMERO.web worker, etc. It isn't good enough "streaming" to be worth enabling, which is why we don't use it and why you implemented #300 the way you did.
Since we will be loading MAX_TABLE_DOWNLOAD_ROWS rows into memory and presenting them for download, it makes sense that batching happens at the same scale. With ome/omero-web#554 this is less of a concern, but it makes sense to unify. In the near future it may be pertinent to remove the client-side batching and "lazy" semantics entirely, as they no longer really make sense or are completely unused.

/cc @erindiel, @sbesson