Tradeshift / tradeshift-ui

Tradeshift UI is a framework-agnostic JavaScript library that helps Tradeshift App developers create cohesive user experiences and provides reusable UI components.
https://ui.tradeshift.com

[Table] UI Table fails on large amounts of rows #326

Open · tveon opened this issue 7 years ago

tveon commented 7 years ago

Bug report


Tradeshift UI version affected

v8.1.6

Expected Behavior

Loading 200,000 rows into a paginated table should be possible - as long as the client has enough memory.

Actual Behavior

Somewhere between 100,000 and 200,000 rows, you run into the following error:

ts.js:14951 Uncaught RangeError: Maximum call stack size exceeded
    at Object.populate (ts.js:14951)
    at $Class.$onconstruct (ts.js:14347)
    at $Class.$constructor (ts.js:1911)
    at new $Class (ts.js:2004)
    at Function.<anonymous> (ts.js:13963)
    at Function.from (ts.js:1640)
    at $Class.<anonymous> (ts.js:15118)
    at $Class.rows (ts.js:15015)
    at $Class.<anonymous> (ts.js:49868)
    at $Class.<anonymous> (ts.js:2647)

Steps to reproduce

// Generate a large batch of rows; duplicates are fine for reproducing the bug.
let data = Array.from({ length: 200000 }, () => ['cell 1', 'cell 2', 'cell 3']);
table.rows(data); // throws: RangeError: Maximum call stack size exceeded

Feature request


Description of feature

Since it might not be wise to actually load this amount of data (we have a couple of million records), it would be nice to have a lazy way of loading the data which also supports searching.

Using a custom pager to handle this amount of data means that you also need to do the searching in the backend, which is a huge pain and/or cost depending on the approach you take.

Example use cases and/or Prototype links

We load e.g. 2 million document records, and the user only wants to see the ones where the external document ID contains "RH 73" and the sender's name contains "auto".
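For illustration, that filter boils down to a simple predicate on the clientside. A minimal sketch, assuming hypothetical field names on the document records:

// Hypothetical row shape; both conditions must hold for a row to be shown.
const matches = doc =>
    doc.externalDocumentId.includes('RH 73') &&
    doc.senderName.toLowerCase().includes('auto');

const visibleRows = allDocuments.filter(matches);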

Designs and/or Prototype screenshots

wiredearp commented 7 years ago

It's an interesting request, but I'm not sure that it should be possible to load millions of records into the Table and still expect all searching and sorting to perform great on the clientside in any client of unknown CPU power and memory configuration (and I'm also not sure how the data could be searchable on the client only without loading all the data into memory?).

That is exactly why the Table can be configured to run in a "serverside" mode where the Pager and the "pages" are configured manually by the developer: so that the Table can support a theoretically unlimited amount of data and/or perform advanced searching, filtering and sorting. It may be a "huge pain" to search millions of products in the database, but since databases are usually optimized for this scenario, I would argue that the pain is almost always bigger in the browser.

The question is of course how much data the Table can be expected to handle as a general guideline, but I would set that limit much lower than the quarter of a million rows (with an unspecified amount of cells) that you guys have managed to mount, because you're gonna crash GMail and Soundcloud and YouTube in the other browser tabs long before you get to that point, especially if you are on a mobile phone or some kind of cheap notebook like they use in accounting.

Of course, if and when we can manage to end support for all versions of IE, we can start experimenting with WebWorkers (for sorting and searching) and ServiceWorkers (for lazy searching!) to enhance the performance of the Table, but it doesn't make much sense to go down that route just yet, because we would just end up with a Table that crashes IE or at the very least presents the user with the "script on this page is causing the computer to run slow" dialog.

So I think that (at the moment) this should be a documentation effort instead of a technical task, so that we simply 1) explain to users that the Table will hit a performance wall in the client and 2) show how the fullscreen Table can be set up to trigger a Pager refresh when the browser window changes size (because that is really the challenge with "fullscreen" versus "serverside" Tables).
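For reference, here is a rough sketch of that "serverside" setup, where the developer owns the Pager and swaps one page of rows in at a time. The option names are reproduced from memory of the Docs website and may differ slightly, and fetchPage stands in for whatever call fetches a single page from the backend (a sketch of such a helper appears further down the thread):

ts.ui.get('#mytable', table => {
    table.cols(['Document ID', 'Sender']);
    table.pager({
        pages: 25, // total page count, as reported by the backend
        page: 0, // currently selected page index
        onselect: index => {
            // fetch just that one page from the server and show it
            fetchPage(index).then(rows => table.rows(rows));
        }
    });
});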

tveon commented 7 years ago

I don't expect it to work great - just that the table doesn't crash. Also - nice wall of text. Perhaps throw in a line-break once in a while - just saying...

wiredearp commented 7 years ago

I actually don't think it really crashed; rather, some code recursed so deeply that the browser hit its maximum call stack size and bailed out instead of hanging. Perhaps the Table would support more data on a bigger development machine, but then it would certainly support less data on a mobile phone, so while we could certainly try to make the Table support more data [1], it cannot be sustainable in the long run to simply offload the entire database onto the client to perform all kinds of searching and sorting here. There is after all also the question of how many megabytes you really want to send to a mobile phone when the user can never be guaranteed to press that Next button in the Pager :question:

[1] Now that we know that people are actually using the Table in a "clientside" mode (which we never anticipated when we created the Table; it just felt like something that a "ui component" was expected to do), we will surely attempt to optimize all this at some point, but I still don't think that it makes sense as long as we support browsers without WebWorkers, because even if the browser didn't crash, it would make IE so unresponsive that the user would assume that it had.

We should in any case leave this enhancement issue open so that tension can build around the subject if and when others run into this limitation.

wiredearp commented 7 years ago

:see_no_evil: The exact same issue was reported the following day in the (duplicate) issue https://github.com/Tradeshift/tradeshift-ui/issues/330.

I am leaning towards putting an upper limit of 50,000 rows or something in the Table. Then this figure can be documented explicitly in the Docs website, and we can elaborate on it further in a console.warn message in case the developer hasn't read that documentation. I figure that if we optimize the Table to handle 200,000 entries, then we will just get the same enhancement request for 2 million entries, and we will then basically just pave the way for some much bigger problem: for ourselves, because we now have to also optimize the searching and sorting algorithms, and these are very, very hard to optimize (without multicore and Worker support); and for the user, because the Table will begin to consume way too much memory, not least on a mobile phone, where the data traffic that carries all these millions of product entries must also count for something.
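A guard like that could be as simple as the following sketch. Everything here is hypothetical; neither the constant nor the warning exists in the library today:

const MAX_ROWS = 50000; // the suggested ceiling, up for discussion

function guardedRows(table, rows) {
    if (rows.length > MAX_ROWS) {
        console.warn(
            'Table received ' + rows.length + ' rows; performance degrades ' +
            'beyond ' + MAX_ROWS + '. Consider the "serverside" mode instead.'
        );
    }
    table.rows(rows);
}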

cc @sampi and @zdlm

wiredearp commented 7 years ago

I found an equivalent recommendation in some random DataTable library on the internet, over on https://datatables.net/manual/server-side, where it says that:

DataTables' server-side processing feature provides a method to let all the "heavy lifting" be done by a database engine on the server-side (they are after all highly optimised for exactly this use case!), and then have that information drawn in the user's web-browser. Consequently, you can display tables consisting of millions of rows with ease.

If you search for the string "maximum" in the search field on that website, you can find some stories about that particular implementation breaking down somewhere between 10,000 and 20,000 rows (adding from 10 to 20 seconds of initial parse time and throwing the exact same "Maximum call stack exceeded" exception). So I think that we are justified in recommending the serverside solution for big-ish Table structures, provided of course that we properly document this on the website.
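The serverside pattern then boils down to asking the backend for one page at a time, along these lines. The endpoint and its parameters are made up for illustration (and the modern fetch/async syntax would need transpiling for IE); this is what the fetchPage helper in the earlier pager sketch was assumed to do:

async function fetchPage(index, size = 100, query = '') {
    const params = 'page=' + index + '&size=' + size + '&q=' + encodeURIComponent(query);
    const response = await fetch('/api/documents?' + params);
    return response.json(); // assumed to resolve with the rows for that one page
}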

tveon commented 7 years ago

We do try to do as much of the filtering on the server side as possible. But databases are not optimised for full-text search - and particularly not on "any column". I think what we would need/like is some sort of mechanism to load additional data into the table which is not tied to the pager. Doing this in the frontend is not really my field of expertise, so I can't make a more concrete suggestion, sorry.

wiredearp commented 7 years ago

I would argue that this depends on the database that you are using, since Google, for example, seems to have come up with a solution for searching every single word on the entire internet. But it is in any case not something that JavaScript, being a notoriously single-threaded programming language, is optimized any better for. I am sure that you would also not choose to perform a search of this scale in Java via Hibernate mappings, even though that would perform comparatively much better.

I am not clear on the idea of loading more data into the Table without updating the Pager. Since you are in control of the Pager (in a "serverside" Table), you can just choose to not show it at all, if that helps? If your goal is still to have the Table manage everything on the clientside, perhaps you can try, as a workaround, to load the data incrementally, say 50,000 rows at a time. You would then

table.rows().push(row1, row2, row3 /* ... up to row 50,000 */);
setTimeout(() => table.rows().push(row50001, row50002 /* ... */));
setTimeout(() => table.rows().push(row100001, row100002 /* ... */));

... instead of table.rows(everything) all at once. If that helps, then we can attempt to make the Table perform this stunt automatically. It will not fix the much bigger problem of searching and sorting in a data structure that is way too big for any browser to handle, but at least it might then render without crashing.
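The automatic version could look roughly like this hypothetical helper (not part of ts.ui). The chunk size is illustrative and deliberately kept small enough to stay clear of the engines' argument-count limits for spread calls:

function loadInChunks(table, allRows, chunkSize = 10000) {
    let offset = 0;
    (function nextChunk() {
        table.rows().push(...allRows.slice(offset, offset + chunkSize));
        offset += chunkSize;
        if (offset < allRows.length) {
            setTimeout(nextChunk); // let the browser breathe before the next chunk
        }
    })();
}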

wiredearp commented 7 years ago

Even if JavaScript were faster than Elasticsearch when it comes to searching for stuff, it would probably still not be the recommended route to follow, since the browser becomes so stressed out while searching and sorting that it cannot even keep an animated .gif running. The user experience would become terrible, and there is a good chance that the user will simply close the tab or the entire browser (which on all operating systems will not even happen before the browser is done crunching, unless you force-exit the process somehow). That's why I suggest that we should deep-freeze this enhancement project until we are sure that all our browsers support Web Workers (which basically means no Internet Explorer in any version, including 11).
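When Web Workers do become an option, offloading the search would follow the standard Worker pattern. A sketch only; the worker file name, the message shape and the naive contains-search are all invented:

// main thread: hand the haystack to a worker and render only the matches
const searcher = new Worker('table-search.worker.js');
searcher.postMessage({ query: 'RH 73', rows: allRows });
searcher.onmessage = event => table.rows(event.data);

// table-search.worker.js: scan every cell for the query string
self.onmessage = event => {
    const { query, rows } = event.data;
    self.postMessage(rows.filter(row =>
        row.some(cell => String(cell).includes(query))
    ));
};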