Closed — amir20 closed this issue 2 weeks ago.
@FrancYescO I created a new issue.
I first tried to reproduce this, and I was able to with my own custom script: it produced a log line every 100ms, and then every 2 seconds it emitted a burst of 10,000 logs. I see the UI freezing a lot.
I am not sure what can be improved. I see some people talking about moving the parsing to a web worker. I am going to see if that's even feasible. I am not aware of any way to see what is freezing the UI.
But being able to reproduce it is half the problem. :)
You could still take Portainer as inspiration. I haven't looked at exactly how they present the logs, but even with a lot of logs to render I have never seen the UI freeze there.
I'll check out Portainer. But I have a feeling Portainer uses a textbox, which causes fewer DOM updates.
I did try something interesting. I replaced `parseMessage` with a dummy implementation that always returns the same log, removing all JSON parsing. In my test, I still saw the browser freeze. That means the problem isn't the JSON parsing but actually adding those HTML elements to the DOM.
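The stub experiment can be sketched like this. The types and signature below are hypothetical (the real Dozzle `parseMessage` differs); the point is the profiling technique: if the freeze persists with the constant-return stub, parsing is ruled out as the bottleneck.

```typescript
// Hypothetical shape of a parsed log entry; the real Dozzle types differ.
interface LogEntry {
  ts: string;
  message: string;
}

// Real parser: JSON.parse on every incoming line (the suspected hot path).
function parseMessage(raw: string): LogEntry {
  const obj = JSON.parse(raw);
  return { ts: obj.ts, message: obj.message };
}

// Dummy parser used for the experiment: always returns the same
// pre-built entry, so parsing cost drops to effectively zero. If the
// UI still freezes with this version, parsing is not the bottleneck.
const FIXED: LogEntry = { ts: "2024-08-25T00:06:35Z", message: "dummy" };
function parseMessageStub(_raw: string): LogEntry {
  return FIXED;
}
```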
I am not sure if there is a way around it. I imagine adding 4,000+ items to the DOM is just intensive. Even using a virtual scroller might not work, because while the user is tailing, everything would need to be refreshed. I think it doesn't matter how much I optimize; the load is just so high that appending this many logs to the DOM is CPU intensive.
I checked out Portainer. It doesn't use a textbox. But it only shows the last 100 lines, which isn't very helpful.
Maybe having a virtual scroller would help. I have tried adding virtual scrolling before, but it didn't go very well.
I quickly prototyped a virtual scroller. It's definitely better. But there are so many other bugs with spacing and height that I think it's beyond the time I can spend on it. I'll keep this open.
To summarize, the issue isn't the parsing of JSON; rather, the DOM is getting so large in bursts that it causes a lot of blocking.
You can change the number of rows to load in Portainer at the top. It's slow to load, but it does not freeze the UI even with 100,000 rows.
Are you adding to the DOM line by line? Maybe some sort of batch process before adding could help (something like delaying the DOM update for 100ms while waiting for another message to arrive).
> You can change the number of rows to load in Portainer at the top. It's slow to load, but it does not freeze the UI even with 100,000 rows.
That does freeze the UI for me. Scrolling somehow works, but none of the buttons are interactable. Clicking the timestamp also freezes for a few seconds.
> Are you adding to the DOM line by line? Maybe some sort of batch process before adding could help (something like delaying the DOM update for 100ms while waiting for another message to arrive).
That's what it does. It buffers up to 1 second and then adds to the DOM as a batch. The adding part is done via Vue.js, though. I believe it does it all in one go, and it is fairly performant. The reason virtual scrolling probably helps is that if the user is only looking at the last 20 rows, there is no reason to append the other (x − 20) rows to the DOM.
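The windowing idea can be illustrated with a small pure function. This is a sketch, not Dozzle's actual code, and it assumes a fixed row height (which, as noted below, log rows don't actually have):

```typescript
// Compute which rows a virtual scroller actually needs in the DOM,
// assuming a fixed row height. Rows outside [start, end) are never rendered.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan = 5, // extra rows above/below the viewport to avoid flicker
): { start: number; end: number } {
  const first = Math.floor(scrollTop / rowHeight);
  const count = Math.ceil(viewportHeight / rowHeight);
  const start = Math.max(0, first - overscan);
  const end = Math.min(totalRows, first + count + overscan);
  return { start, end };
}
```

With 10,000 buffered rows, a 600px viewport, and 20px rows, only a few dozen rows ever exist in the DOM at once, no matter how fast new lines arrive.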
I just don't think making the scroll virtual is worth it, as so many other things break when I do it. And it's not like it's a list of equally sized components.
I wish someone were an expert in the DOM or Vue and could help. 🤷🏼‍♂️
@FrancYescO When you get a chance, can you try https://github.com/amir20/dozzle/pull/3227? I just did some speculative optimizations. Not sure if they will make a difference, but I tried trimming the buffer sooner. You can try it with amir20/dozzle:pr-3227. Based on my tests, the browser froze less.
I think there might be a sweet spot where it's good enough.
It's definitely looking better. To be clear, I'm still getting freezes of around 15s when loading previous messages and 5s in live mode, but this surely depends on the volume of incoming messages (or maybe on the amount that is visualized/loaded in the UI?). At least it seems I never end up with a totally unresponsive tab that has to be closed, like before...
ps. A little indication in the UI that we are in "live mode" would be useful: pretty often I scroll up a little, and after seeing no messages come in for a while, I go back down and get the "Skipped xxx entries" message. That makes me lose those logs, since the only way to retrieve them is a refresh (and it makes me realize I was ignoring new messages).
ps2. Have you tried removing the batch processing? Maybe lots of smaller updates are better than a single one with a lot of content.
The problem is that I can't really reproduce the 15s and 5s pauses. I have seen a few seconds, so anything I try is just a best guess. I am using Chrome on a MacBook Air with an M2.
For now, let's focus on the live mode since I think that's the most common use case for folks.
I have a test similar to your logs. I set up a test.log.
```
❯ cat test.log | cut -d. -f 1 | uniq -c
4620 2024-08-25T00:06:35
  10 2024-08-25T00:06:36
5010 2024-08-25T00:06:37
  10 2024-08-25T00:06:38
5010 2024-08-25T00:06:39
   9 2024-08-25T00:06:40
5010 2024-08-25T00:06:41
  10 2024-08-25T00:06:42
5010 2024-08-25T00:06:43
  10 2024-08-25T00:06:44
5010 2024-08-25T00:06:45
   9 2024-08-25T00:06:46
5010 2024-08-25T00:06:47
  10 2024-08-25T00:06:48
3505 2024-08-25T00:06:49
1515 2024-08-25T00:06:50
 393 2024-08-25T00:06:51
4626 2024-08-25T00:06:52
  10 2024-08-25T00:06:53
5010 2024-08-25T00:06:54
  10 2024-08-25T00:06:55
5010 2024-08-25T00:06:56
   9 2024-08-25T00:06:57
5010 2024-08-25T00:06:58
  10 2024-08-25T00:06:59
5010 2024-08-25T00:07:00
  10 2024-08-25T00:07:01
5009 2024-08-25T00:07:02
  10 2024-08-25T00:07:03
5010 2024-08-25T00:07:04
  10 2024-08-25T00:07:05
5010 2024-08-25T00:07:06
  10 2024-08-25T00:07:07
5010 2024-08-25T00:07:08
   9 2024-08-25T00:07:09
5010 2024-08-25T00:07:10
  10 2024-08-25T00:07:11
5010 2024-08-25T00:07:12
  10 2024-08-25T00:07:13
5006 2024-08-25T00:07:14
```
I do get some unresponsiveness.
> A little indication in the UI that we are in "live mode" would be useful
Agreed but let's come back to this later.
> ps2. Have you tried removing the batch processing? Maybe lots of smaller updates are better than a single one with a lot of content.
I think you are right. There may be some improvement there. The batching right now is done with a timer of up to 1 second, so theoretically a single batch can hold 10K items, which isn't great. I should probably also flush the batch based on the total number of buffered logs.
Here is what I am going to try:
I'll let you know when I have something to test.
Let me know if there is a better test for me to try to reproduce your scenario.
So it turns out that moving the batch size to a smaller number actually makes performance worse, because then it needs to flush 5x per second, which isn't great.
I made more improvements, but most notably, I now just replace the buffer with the latest messages, which seems to improve things a lot.
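One way to read "replace the buffer with the latest messages": instead of queueing every line during a burst, keep only the newest N and count what was dropped, which mirrors the "Skipped xxx entries" behavior mentioned earlier. A sketch under that assumption, not the actual implementation:

```typescript
// Keeps only the newest `capacity` messages; older ones are dropped and
// counted, so the UI can show "Skipped N entries" instead of rendering
// every single line from a burst.
class LatestBuffer<T> {
  private items: T[] = [];
  private skipped = 0;

  constructor(private readonly capacity: number) {}

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) {
      this.items.shift();
      this.skipped++;
    }
  }

  // Returns the retained messages plus how many were dropped,
  // and resets the buffer for the next flush cycle.
  drain(): { items: T[]; skipped: number } {
    const result = { items: this.items, skipped: this.skipped };
    this.items = [];
    this.skipped = 0;
    return result;
  }
}
```

The DOM cost per flush is now bounded by `capacity` regardless of how large the burst was.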
Try the latest. Also look at the `cat test.log | cut -d. -f 1 | uniq -c` output above. I think that's pretty close to your setup. I guess the only difference is that I don't have JSON, just simple logs.
Based on my testing, this looks really good so far. No freezes at all for me.
Nice, I'm going to do some tests in ~14h.
I can confirm that live mode is now a lot better. When a burst arrives I get a <1s freeze, but that's totally acceptable.
I'm pretty sure the culprit is in the `parseMessage` func, but I don't know how it can be optimized to avoid locking the main presentation thread and freezing the full browser tab while parsing.

Originally posted by @FrancYescO in https://github.com/amir20/dozzle/issues/3122#issuecomment-2305159542