Update Rows with and keep sorting active

Danielku15 commented 10 years ago

I am using SlickGrid to display real-time data I get from a WebSocket. I use updateItem on the DataView to update individual rows with the latest data. This works fine so far. The problem starts when you allow sorting of the items, because calling updateItem does not ensure the row gets moved to the correct place.

An example usage could be a web task manager. You sort by CPU load, and of course if the CPU load changes, the high-load processes should move to the top.

Of course I could simply call sort on the whole DataView, but that would consume unnecessary processing power: the data is already sorted and only one item is in the wrong place. Is there a built-in way to tell SlickGrid that only this row is in the wrong place, so that it uses the current sorting to place it correctly?

I thought of creating my own DataView where I implement updateItem as a remove followed by a sorted insert, but I'm not yet sure whether SlickGrid would render this correctly.

brondavies commented 10 years ago

Unless you have thousands of rows, the processing required to sort client-side is minimal.

Danielku15 commented 10 years ago

I have up to 600 table entries, and each second I get at least one update for each of them. Sorting on every update means N sorts of N elements per second. With a merge sort (O(n log n) per sort), that's a total of O(N² log N) per second, which might become quite heavy. Doing a sorted insert (binary insert) plus updating the index lookup would be way better.
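To illustrate the sorted insert idea, here is a minimal sketch in plain JavaScript. It does not use any SlickGrid API; `findInsertIndex`, `repositionItem` and the `byCpuDesc` comparer are hypothetical names chosen for this example, and it assumes every other element of the array is already in sorted order.

```javascript
// Binary search: index at which `item` must be inserted into the sorted
// array `items` so that the order defined by `compare` is preserved.
function findInsertIndex(items, item, compare) {
  let lo = 0;
  let hi = items.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compare(items[mid], item) <= 0) {
      lo = mid + 1; // item belongs after items[mid]
    } else {
      hi = mid; // item belongs at or before mid
    }
  }
  return lo;
}

// Move the element at `oldIndex` to its sorted position after its value
// changed. Assumes all other elements are already sorted. Returns the new index.
function repositionItem(items, oldIndex, compare) {
  const [item] = items.splice(oldIndex, 1); // remove from the old place
  const newIndex = findInsertIndex(items, item, compare); // O(log n) comparisons
  items.splice(newIndex, 0, item); // insert at the sorted place
  return newIndex;
}

// Example: rows sorted by CPU load, highest first.
const byCpuDesc = (a, b) => b.cpu - a.cpu;
const rows = [
  { id: 1, cpu: 90 },
  { id: 2, cpu: 50 },
  { id: 3, cpu: 10 },
];
rows[2].cpu = 70; // row 3 gets an update
const movedTo = repositionItem(rows, 2, byCpuDesc); // movedTo === 1
```

The two splice calls still shift array elements, but the number of comparer calls drops to O(log n) per update. A DataView-like structure that keeps an id-to-index map would also need to refresh the map entries between the old and the new index.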

Imagine viewing this table on a smartphone or a low-budget PC. Not only do you drain the battery, you might also end up with an unresponsive site.

I just implemented my own DataView with this sorted updateItem behavior, but it seems SlickGrid has some internal cache which prevents correct updating. Only invalidating all affected rows places them correctly, which adds load for recreating their contents.

[EDIT] I made a small jsfiddle to illustrate the concept of my setup: http://jsfiddle.net/7MaqW/ If you sort by CPU the sorting gets broken after update.

In addition here's a sample which calls sort on each update: http://jsfiddle.net/8hFW7/1/ If you check the CPU load (the real one on your PC not the sample itself) you'll notice that it takes quite a lot.

Last but not least: My current draft of the DataView which sorts on update. http://jsfiddle.net/7MaqW/1/ Still the CPU load is very high. According to the profiler it's because of the recreation of the CellHTMLs after invalidateAllRows().

brondavies commented 10 years ago

Your jsfiddle with the automatic sorting takes so much CPU because it runs a setTimeout in the startTimeout() function 500 times per second, which you shouldn't have to do. Your workflow should be to retrieve the most recent data once per second and sort. If you're on a mobile-connected device, it should only retrieve data when requested manually, due to mobile network data usage concerns. However, if someone wanted to auto-refresh anyway, you could provide a checkbox to do that or something. See this fork: http://jsfiddle.net/HeaRJ/

Danielku15 commented 10 years ago

That setTimeout simulates the workflow of receiving data from a WebSocket. As said: each of the 500 entries will send one update during this second. I want to use the table for displaying real-time data. Of course I could build some sort of interval-based solution which only updates each second, but this would break the idea behind the real-time system. Each event triggered should be handled within a defined interval.

Besides the table, I also have further UI elements displaying the data. If I update the table on an interval basis, it is not up to date and will not match the other data displayed. An example could be a value that jumps between two values (100% and 70%). If I update on an interval basis, I might only see 70% and never see the 100% in the table. A gauge could display the max value of all entries, and there you see it jumping between 70% and 100%, but you'll never be able to find out which table entry it was.

This scenario should give you a good idea of what I'm trying to achieve.

Basically there should be some sort of API which marks a row as moved instead of marking its content as invalid. SlickGrid should detect whether the "moved" rows are now outside the viewport and delete the HTML row in that case. If they are still inside, the HTML should remain untouched. If an item within or before the viewport was deleted or added, the rows below it should shift by one row. If the change happened below the viewport, nothing needs to be done. This would eliminate the need to recreate all rows, because the existing ones would simply be moved.
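A rough sketch of the bookkeeping such a "moved row" API would need, in plain JavaScript. This is not an existing SlickGrid function; `planRowMove` and its return shape are hypothetical. Given the old and new index of the moved row and the visible row range, it reports which visible indices end up showing a different item.

```javascript
// Hypothetical helper, not part of SlickGrid.
// `from` / `to`: index of the moved row before and after the move.
// `top` / `bottom`: first and last visible row index (inclusive).
function planRowMove(from, to, top, bottom) {
  const lo = Math.min(from, to);
  const hi = Math.max(from, to);
  // Only indices between the old and the new position show a different item.
  const firstChanged = Math.max(lo, top);
  const lastChanged = Math.min(hi, bottom);
  if (firstChanged > lastChanged) {
    // The move happened entirely above or below the viewport.
    return { affectsViewport: false };
  }
  return {
    affectsViewport: true,
    firstChanged,
    lastChanged,
    // Rows between the two positions shift up (-1) when the item moves down
    // the list, and down (+1) when it moves up. 0 means it stayed in place.
    shift: Math.sign(from - to),
    // Whether the moved row itself ends up inside the viewport.
    movedRowVisible: to >= top && to <= bottom,
  };
}

// Example: viewport shows rows 10..20, a row moves from index 15 to index 12.
const plan = planRowMove(15, 12, 10, 20);
// plan.firstChanged === 12, plan.lastChanged === 15, plan.shift === 1
```

With this information a renderer could reposition the existing DOM rows in the changed range and only create or remove the rows that enter or leave the viewport, instead of recreating everything.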

brondavies commented 10 years ago

I see. Of course I can't comment on the real-time nature of the application you're working on, but in my experience, unless it's a game, real-time data rarely means "data that is current within a fraction of a second". SlickGrid might not be a good choice in this case, but I am also not convinced that its sorting routine is causing the performance issue. Still, I would say that it's reasonable to dictate that sorting more often than once per second is not useful. A human would not be able to see it. If we take the task manager analogy, on any OS, even though it's running native code, it's not sorting the process list that often. Also, if you're going to get an update that often on practically every row, why would you re-order every time one of them changed? It might actually make it more difficult to see that one row's values have changed because it's moving around all the time.

Danielku15 commented 10 years ago

The problem is that the term "real time" is often misused in combination with WebSockets. WebSockets are just reliable full-duplex channels for communication, while real-time means that you have to process the data within a defined interval. Of course this interval can be seconds, hours or days, as long as it fits the application requirements. "Real-time" doesn't mean as fast as possible, but reliable in time. According to DIN 44300:

"Real-Time Operation: The operation mode of a computer system in which the programs for the processing of data arriving from the outside are permanently ready, so that their results will be available within predetermined periods of time; the arrival times of the data can be randomly distributed or be already a priori determined depending on the different applications." - DIN 44300 9.2.11

I am currently writing my master's thesis about real-time communication on the web, and in my case study I am using SlickGrid for visualizing the data.

For now I am updating the sorting every second, and that's enough for my current use case, but overall it's another bottleneck within the real-time visualization system. In the worst case the visualization is delayed by an additional second only because of the sorting.

JohnRanger commented 5 years ago

@Danielku15, have you ever managed to optimize SlickGrid (your personal copy of it) in the way you described above? (I was looking into this myself but did not succeed.)

With kind regards,

John

6pac commented 5 years ago

strangely, we are just discussing this over at the 6pac repo: https://github.com/6pac/SlickGrid/issues/353

you should be using that repo - this one is dead - see slickgrid.net

Danielku15 commented 5 years ago

@JohnRanger No, I did not make any SlickGrid improvements as part of my thesis or our product. Since this issue was opened we have been using a separate timer for triggering a sort. We are in the lucky situation that even though we have a lot of table rows (>1300 nowadays) with updates per row every ~5 seconds, our values do not make such huge jumps that it would require fully dynamic reordering.
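The separate-timer approach can be sketched roughly like this. It is an illustration only, not the code from our product: `createSortScheduler` is a hypothetical helper, and `view` is assumed to be an object exposing `updateItem(id, item)` and `reSort()`, modelled on the DataView methods of the same names.

```javascript
// Sketch: apply every update immediately, but re-sort at most once per interval.
// `view` is assumed to provide updateItem(id, item) and reSort().
function createSortScheduler(view, intervalMs) {
  let dirty = false; // true if any row changed since the last sort
  let timer = null;

  // Re-sort only if something changed; returns whether a sort happened.
  function flush() {
    if (!dirty) return false;
    dirty = false;
    view.reSort();
    return true;
  }

  return {
    update(id, item) {
      view.updateItem(id, item); // row content is refreshed right away
      dirty = true; // ordering is corrected on the next tick
    },
    flush,
    start() {
      if (timer === null) timer = setInterval(flush, intervalMs);
    },
    stop() {
      if (timer !== null) {
        clearInterval(timer);
        timer = null;
      }
    },
  };
}
```

Typical use would be `const scheduler = createSortScheduler(dataView, 1000); scheduler.start();` and then calling `scheduler.update(id, item)` from the WebSocket message handler. Cell values stay current with every message, while the row order can lag behind by up to one interval.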

mleibman / SlickGrid

Update Rows with and keep sorting active #938