Closed amir20 closed 2 years ago
@amir20 Just wanted to say thanks for this tool, was pleasantly surprised when I refreshed my docker stack and seeing JSON support.
@amir20 OK seeing elevated CPU Usage. My Chrome tab crashed twice.
@amir20 OK seeing elevated CPU Usage. My Chrome tab crashed twice.
I wasn't able to reproduce this. I am running 10 containers and my cpu was below 2%. Can you help me debug? When does the cpu spike happen? Are you running very active logs? Thanks.
Ok, I was able to reproduce a few times, simply reloading the page and letting it run in the background. Takes about 5min, and then the tab freezes and goes into high CPU load. It is a fairly voluminous log stream from my Postgres instance (~5-10 lines per second), but this didn't happen with the v3 and I didn't change anything on my side since v4. @amir20
Oh and I have about 12 containers running, but the PG one is the most active.
I see. That's a known issue. It was actually happening on v3 too but just took a little longer. It's just exacerbated in v4 because I happen to save a little bigger objects per log.
It has been discussed a few times. I don't know what would be a good fix. The problem is that all these logs are saved in memory and eventually become too big.
I can think of a couple solutions.
What do you think? I think pausing and resuming is the easiest for the user but that mostly fits my own use case. Not sure for others. 🤔
Ah ok I see.
Stopping the log might hurt the UX a bit. For me, I like to come back to the tab, and see the latest right there. If the logs stop coming, I would end up refreshing the page to get the logs within the last 2-3 minutes. Unless, when the user comes back, the app auto fetches the gap since tab was put in background? I think that might work.
blur
, app saves timestamp for container in view (selected).focus
, app uses saved timestamp and auto refreshed the view with the latest log lines.I'm not sure how refreshing works (as in selecting a different container to view most recent log lines), would that essentially purge old logs for the selected container? Also, if the user blurred the window viewing older logs, logic should be smart enough to not clear the state, and know if a refresh is really needed (since they're viewing old logs, not needed).
Thought about it some more, and I guess long term that wouldn't work, because the logs are still held in memory. When the user comes back and the app pulls the latest, it would get back to that freezing scenario. Plus you have to deal with the cursor (where the user is looking in the log), scrolling in the past or in the future...
So the purge would be the way to go for me. Can you explain the complexity?
If I were to approach this from a clean slate, for each line that comes in the EventStream
, I would save two pieces of information in a table/object (maybe with dexie?):
The combination of the two could form a sort of unique primary key of the log line. Saving those somewhere would allow us to know which lines to purge (by timestamp or max number of lines per container), allowing us to target exactly those. I'm sure I'm missing the cons in this :).
Another thought would be to push this to the backend? App could pull on demand as user scrolls? I think that's how papertrail does it. Might become a little slow though...
@flarco That's exactly right. The problem is actually more complicated than you think. Just heads up this might be a long comment. :) But sounds like you are the only person who is understanding the complexity so far.
When the user comes back and the app pulls the latest, it would get back to that freezing scenario.
Yep, this is a problem. I don't think I would resume from where it was suspended. Instead, just resume. Also, I wouldn't use window deactivate but rather visibilitychange as someone can have Dozzle in a different screen.visibiltychange
only triggers if it is hidden like a behind a tab.
So the purge would be the way to go for me. Can you explain the complexity?
Let's say the browser is holding 1000 logs. Like an array [0...999] and the window size is 100. So when the browser is all the way scrolled to the bottom it is showing [900..end]. When the user scrolls to the top, they are looking at [0...100]. In your example, if I was to remove the oldest logs (because I would only want to retain 1000 items for example) and the user is scroll to the top then it would start deleting logs as they are seeing them. In fact, they could be looking at any part of the [0...999] and I have no way of knowing if they are scroll up. Unless, I do some kind of js magic.
I hope that makes sense.
I have been looking at dexie, sql.js (WASM + SQLite and a few other options. I think I just wanted an excuse to use WASM.) But after a while, I think an embedded database is an overkill. For this use case, I can just remove the 100 items using something like array.split()
in js.
Another thought would be to push this to the backend? App could pull on demand as user scrolls? I think that's how papertrail does it. Might become a little slow though...
I do actually load more logs as you scroll up. This is why it makes it even more complicated. If there are already 1000 items in the buffer, then scrolling up would be even more breaking because they would be oldest.
Here is what I THINK should happen. There are essentially four states.
For each of these they need to be handled differently.
Again, more complicated than I had expected, but I really don't know how else to make it simple and not confusing to people using Dozzle. It should be seamless and no one should notice Dozzle is throwing logs away.
What do you think?
Nice, thanks for the context.
The states you listed reflects my understanding as well. But I think things could be improved for 2 and 4.
The issue with suspending is that when you come back and scroll down, there would be a gap. And also I don't think state2 is such an edge case, I cross it all the time. Because I could scroll backwards, even just slightly, so that the cursor isn't at the "live" edge point, as I am reading lines. Then I could switch away to another tab/window for some reason, entering it in state 4. Then come back and forth, scrolling just slightly, reading logs. So at this point it's basically many switches between 2 and 4. And thinking of it, as you said, it's quite important for debugging, so it's important that both 2 and 4 work flawlessly for good UX.
So let us consider the following state variables for a container (admittedly a bit simplified :):
startLineID
: the earliest log line loaded in memory endLineID
: the latest log line loaded in memorycursorLineID
: the line that the user is currently viewing (the start of the window)maxLogLines
: the maximum number of lines that can be loaded in memory for a container, which endLineID
- startLineID
should not exceed.thresholdLines
: a minimum delta number of lines between cursorLineID - startLineID
and endLineID - cursorLineID
.Combining this with your array example [0...1000]
, startLineID
would be 0, endLineID
is 1000, cursorLineID
can be anything between 0 and 1000.
So here's how this plays out in my head, as log lines are streaming in:
maxLogLines
is 1000, and that loadThresholdLines
is 50endLineID
is continuously changing. maxLogLines
is reached (endLineID - startLineID
), purge needs to occur on either end, depending on value of cursorLineID
, when a certain threshold is met (which could be startLineID + thresholdLines
or endLineID - thresholdLines
). With this, we could say:
cursorLineID
equals endLineID
, user enters state1/3, app continues to receive stream, cursor "sticks" to endLineID
to auto-scrolls down, always viewing the latest line. If maxLogLines
is reached, purge occurs at the startLineID
side.cursorLineID
is less than endLineID
, user enters state2/4, continue receiving stream.cursorLineID
is equal or less than endLineID - thresholdLines
, stop receiving stream.cursorLineID
changes (user is scrolling), no need to do anything, as long as cursorLineID <= (endLineID - thresholdLines)
and cursorLineID >= (startLineID + thresholdLines)
is satisfied.cursorLineID
becomes smaller than startLineID + thresholdLines
, purge occurs at the endLineID
side and load earlier lines at startLineID
side (as is already implemented). And inversely, if cursorLineID
higher than endLineID - thresholdLines
, purge occurs at the startLineID
side and load later lines at endLineID
side. All this occurs while satisfying maxLogLines
.Like this, the memory load is managed as user scrolls up and down, and UX is kept intact. Thoughts? I haven't studied the code base, so I don't know how feasible this would be.
Great. We are making progress. I do agree that my state 2 has a gap. But after reading yours, I think I found a few gaps too that I wonder if you thought about them. Also, the technical limitations are a little complicated.
Let me reply inline.
And thinking of it, as you said, it's quite important for debugging, so it's important that both 2 and 4 work flawlessly for good UX.
Ok I see your point.
re: cursorLineID
This is actually harder than you think. Mapping a particular pixel to an item in an array isn't that easy. I currently know if the container is scrolled all the way via clientHeight
. But to map each <li>
in logs, I would have to use something like IntersectionObserver
to observe when each element comes in to view. Actually pretty hard to do. I currently use that method to look for start and end for loading more and auto scrolling.
But let's assume that this is possible to continue on this solution.
If cursorLineID equals endLineID, user enters state1/3, app continues to receive stream, cursor "sticks" to endLineID to auto-scrolls down, always viewing the latest line. If maxLogLines is reached, purge occurs at the startLineID side.
So I think we agree that for both of these states, we suggested the same solution. In your solution you are using some threshold, but aside from that, if user is at bottom, trim top.
If cursorLineID is equal or less than endLineID - thresholdLines, stop receiving stream.
If the user is in the middle, and Dozzle stop streaming, wouldn't it be confusing after the user scroll down. Essentially there might have been a big gap in the logs?
If cursorLineID becomes smaller than startLineID + thresholdLines, purge occurs at the endLineID side and load earlier lines at startLineID side (as is already implemented). And inversely, if cursorLineID higher than endLineID - thresholdLines, purge occurs at the startLineID side and load later lines at endLineID side. All this occurs while satisfying maxLogLines.
If the user scrolls to the top, and maxLogLines
is exceeded then wouldn't it be confusing after the user scrolls to the bottom and there is some missing logs at the bottom.
I think in my solution, I would never trim the most recent. Because as the user, (and I think you said this too) I would expect it the stream to be one continuous set of logs without ever having a gap.
What do you think?
I think your solution has two differences with mine:
scrollHeight
too) This might be a good solution for your use case when just go above the fold but a little but don't expect to stop streaming. Also, perhaps we just solve for the most common use case. That would be most people autoscroll and maybe scroll back a bit like your threshold.
In that use case, we just trim at top.
If that works, then ship it and worry about the edge cases later. 💰
I don't think we are seeing differently, there should never be a gap.
When Dozzle stops streaming (when user is in middle), and say the user scrolls down again, Dozzle should pull the subsequent lines, as in, fill the gap (via REST call), and as user keeps scrolling down, keep pulling, at a certain threshold until the user reaches the "live" point (where now the stream is turned back on). Top is trimmed as needed if maxLogLines
is reached. Hope that makes sense.
I didn't know about the difficulty of mapping each <li>
to as particular log line, so totally understood that's a limiting factor.
But yes, lower hanging fruit / quickest value is just to trim the top. Sounds good. 👍
Oh I see. It would resume from where it paused. I didn't get that. Let me think about it.
@flarco I like your idea of threshold. Initially, I will implement a very simple logic. If not scrolled to top within 3 x window_height
then drop the oldest logs.
I did some profiling and looks like the sweet spot is actually around 1100 items
before the memory jumps above 400MB
.
So maybe I'll just cut the logs around there.
Well that was a fail. 😥 If the user is anywhere but the bottom, it will keep trimming the top until the current position becomes the top, at that point it will just exceed the size of the buffer. I don't know why I didn't think of that.
I think at the very minimum both sides need to be implemented.
When the buffer is FULL...
The hard logic is resuming. Because I can't resume until user has gone close to the bottom. At that point, it might have been 10000s of logs. I am not sure how that would perform. It would stream all logs passed the scroll and also trim the top. That might be funky.
🤔 Let me know @flarco if I am making it more complicated than it needs to be.
🤔 @amir20
I'm trying to understand why it keeps trimming the top if the user is not at the bottom. I mean, if the user is at the middle, the stream should stop. And it should only trim (either on top or bottom) if the buffer is nearing full. If the stream stops, you're not loading any new lines, so the buffer doesn't get exceeded, right? If fact, the buffer shouldn't change at all if user stays within bounds of loaded log lines in memory.
Ideally, if user is at middle and then later scrolls to top, it should load previous lines, and trim at bottom (if necessary, if the buffer is full). Same way, if user later scrolls to the bottom, it should load subsequent lines, and trim at top (if necessary, if buffer is full). Makes sense? Not sure what I'm missing. And if user keeps scrolling down, it should keep loading incrementally, until reaching the very bottom (the live point), at which point the stream should turn back on.
So I think it should perform just fine, even if it's been a while since the user scrolled, and then user scrolls down to bottom. It doesn't need to load all the new lines since, just a batch at a time, like 500 lines, and trim the top as needed. And maybe have a button (like that green one on the bottom right that is already there), if the user clicks that button, go to live mode (maybe just reset the view, trim all, and load from live stream again).
What am I missing? I may be making it too complicated, though, at least for now. Cause that would be a fully featured docker log scanning tool, very nifty. 👍
I think the part you missed was comment https://github.com/amir20/dozzle/issues/1852#issuecomment-1221602705 which was to implement just the top trimming. But that obviously broke :(
Here is a couple things I have noticed with my prototype. Which when it is as a good place I can let you test it.
The trimming at top needs to only happen when auto scroll at the bottom is on. Basically no threshold. Why? Because if the browser isn't scrolling automatically, then top of the buffer will be removed and slowly the user will scroll up as the top is trimmed. I hope that makes sense, but it took me a while to understand what is going on...This wouldn't work as the user will jump around.
Scrolling down is also a problem. Imagine this:
I feel like it should just resume and show a marker or something. That way the user (similar to live idea) can click if they want to load more between the gaps. 🤔
Ah Ok, yes I did miss that :).
Ok, so just trimming the top. All right, well the pausing/resuming needs to work right. Cause if you pause, auto-scoll should disable and user is looking at lines in buffer, there should not be a need to trim since it's paused. Cause yes, if it keeps trimming, and you're eating the beginning and the user will slowly scroll up until the beginning of the log.
Now later, when user scrolls down, and it shouldn't need to load all the lines at once until live. Just in small batches? I think that's where you'd need a new endpoint right? to load lines from tsStart
to tsStart + 60s
? Until the live point is reached, where the stream would come back.
The trimming at top needs to only happen when auto scroll at the bottom is on
Agreed
I scroll down and immediately the scroll will jump adding a bunch of new items. This feels unnatural as the user is jumping around in the scroll.
I'm not getting that part, why would it jump around? It would be as if new <li>
lines are being appended to the element. Does that cause a jump effect?
Let me see if I can connect the dots...
Coming back to the use case:
Options 1
This solution is probably the most complicated because just imagine all the work I would have to do load the data incrementally while keeping tracks of the stream. :/ And even worst what if the user scrolls back up while this is happening.
Option 2
This is simpler but still complicated for the user as they just saw a bunch of logs fly through the screen. They have lost their original position.
Option 3
Date.now()
and displays a message like "Skipped 145 logs. Click to show"
I think option 3 is the easiest and least confusing to the user. And note this is only true for really high volume logs.
Of course, none of this stuff might matter once I prototype anything. It might be just be confusing to do option 3. I am just trying to save myself work 🤔 by picking the least complex path
Also, I haven't played with Paper Trail for a while. Maybe worth checking what they do.
Ah ok I see now. Thank you for explaining. I didn't know that you could not maintain the same view , or position in the huge log list while trimming and/or loading, so it would jump. Yes, less than ideal UX. Hmm, interesting.
But at the moment, when the user goes to the very top, dozzle loads the previous log lines and the view jumps a little while sort of maintaining the same position, doesn't? I did it today and it didn't stay on top after load. How do you do that now?
Overall, agreed that option #3 is that easier way to go ATM. And if it says that it skipped ### lines, you could have a button to reload the whole thing (similar to choosing a new container) so that the user can just scroll up to see those skipped lines.
But at the moment, when the user goes to the very top, dozzle loads the previous log lines and the view jumps a little while sort of maintaining the same position, doesn't? I did it today and it didn't stay on top after load. How do you do that now?
That took a lot of math and calculation to get right. I had to save the position and restore. However, that's for scrolling up.
What we are talking about is different. Because now I have to trim the top and load data at the bottom while remembering the position. That's just complicated.
I'll keep playing with a prototype. I am only getting less than an hour on it each day.
👍 thanks man I'd help out but my Vue days are over. I'm happy in the React camp now 😄 . JSX ftw. I can help on the Go side though, let me know. Frontend with good UX can be so hard.
Well ain't that funny. Papertrail does exactly option 3.
@flarco did you see the other message on #1860 ? Let's move the conversation there.
Yes, sorry I'm out of town till Monday. Will get to it then.
Alright closing this since no one has complained about JSON. Looks like it is working for all.
I have officially released the next major version of Dozzle. 🎉 Dozzle v4 has full JSON support for logs. JSON keys can be hidden based on container for easier debugging.
This version had some major rewrites so I am sure there a bugs. Please use this issue to track any bugs.