Fujio-Turner / sg-log-reader-demo

Parsing and Aggregating Sync Gateway Logs
https://fujio-turner.github.io/sg-log-reader-demo/
Apache License 2.0

async Process ending #36

Open Fujio-Turner opened 5 months ago

Fujio-Turner commented 5 months ago

It looks like as sg_log_reader.py gets to the end of the file, it keeps trying to read items past the end of the file that don't exist.

I think the solution is, as it processes near the end of the file, to tell the script to only go so far so that it doesn't read past the end of the file.

Fujio-Turner commented 5 months ago

I noticed it both in async range errors and in it not processing valid WS IDs near the end of the file.

Fujio-Turner commented 1 month ago

It's less of a bug and more of a script crash, because it's not clear how to handle processing sync/blip logs near the bottom of the log.

For example, if you have a 100-line sg_info.log and the script reaches the last WebSocket transaction ID, let's say on log line 20, it's told to scan or process the next 50 lines after line 20, up to line 70, in case the _changes feed has more info for you.

But what happens when that last WebSocket transaction ID is at line 90 and you still tell it to process the next 50 lines?

So the script needs to get smarter about how close it is to the end of the file and, in the above example, only process 10 more lines.
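
In other words, the lookahead window needs to be clamped to however many lines are actually left in the file. A minimal sketch of that calculation (the function and argument names here are placeholders, not the script's actual variables):

def clamped_lookahead(current_line, line_depth, total_lines):
    # Never scan past the last line of the file.
    # e.g. current_line=90, line_depth=50, total_lines=100 -> only 10 lines left to scan
    return min(line_depth, total_lines - current_line)

With the example above, clamped_lookahead(20, 50, 100) is 50, but clamped_lookahead(90, 50, 100) is only 10.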

Fujio-Turner commented 2 weeks ago

To handle the scenario where you're near the end of the file but still need to process a certain number of lines or a percentage of lines, you can modify your logic to:

  1. Track Total Lines: Keep track of the total number of lines in the log file.

  2. Dynamic Line Depth: Adjust the number of lines to process based on your current position in the file.

Here's how you might refactor your code:

Step 1: Count Total Lines

Before processing, count the total number of lines in the file:

def count_lines(file_path):
    with open(file_path, 'r') as file:
        return sum(1 for _ in file)

class SGLogReader:
    def __init__(self, configFile):
        # ... other initialization ...
        self.total_lines = count_lines(self.log_file_path)  # Assuming you have a log_file_path attribute
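
A quick sanity check of the helper on its own (the file name below is just an illustration, not a path the script actually uses):

# Hypothetical usage of count_lines()
total = count_lines('sg_info.log')
print(total)  # e.g. 100 for the example file discussed above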

Step 2: Adjust Line Depth Logic

Modify the part where you process lines:

def process_log(self):
    # ... other code ...

    passIt = 0
    lines_read = 0  # how many lines have been consumed so far
    max_lines_to_process = min(self.logLineDepthLevel, int(self.total_lines * self.logLineDepthPercent))

    while True:
        try:
            logLine = next(self.log_file)
        except StopIteration:
            # Reached the end of the file before hitting the line-depth limit
            break
        lines_read += 1
        # Process log line logic here

        if some_condition:
            # Your processing logic
            passIt += 1
        else:
            passIt += 1

        if passIt >= max_lines_to_process or lines_read >= self.total_lines:
            # Stop once the line-depth limit is hit or every line in the file has been read
            break

    # If you've reached here, you've processed all lines or hit the end of the file
    filterByChannels.sort()
    changesChannels = sorted(changesChannels.keys())
    return [logLine, since, channelRow, queryRow, filterBy, filterByChannels, blipClosed, blipOpened, continuous, conflictCount, errorCount, warningCount, sent, pushAttachCount, pushCount, attSuc, changesChannels, pullAttCount, pushProposeCount]
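
A note on the design: next() raises StopIteration at the end of the file, so catching it is enough to stop the loop cleanly, and counting lines as they are read avoids relying on file.tell(), which returns a position within the file rather than a line number and can't be meaningfully compared against a line count.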

Key Changes:

- count_lines() determines the total number of lines in the file up front.
- max_lines_to_process is capped by both logLineDepthLevel and a percentage of the total line count.
- The loop stops cleanly when it reaches either that cap or the end of the file, instead of reading past the last line.

Additional Notes: