ppalucha / ksar2

Fork of ksar - a sar grapher

Make Graph.parse_line handle ArrayIndexOutOfBoundsException for broken lines better. #16

Closed ams-tschoening closed 6 years ago

ams-tschoening commented 6 years ago

I have a sar file containing the following lines of data, where the value for the %gnice column is missing for some unknown reason. The file has hundreds of lines of CPU usage and only these two lines are broken, yet they stop the parsing of all other lines. This is hard to debug because the exception reports only the problem itself, not the line in which it occurs.

12:00:01 AM     CPU      %usr     %nice      %sys   %iowait    %steal      %irq     %soft    %guest    %gnice     %idle
12:00:01 AM     all      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
12:00:01 AM       0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00    100.00
12:00:01 AM       1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00    100.00
12:00:01 AM       2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Exception in thread "Thread-2" java.lang.ArrayIndexOutOfBoundsException: 12
    at net.atomique.ksar.Graph.Graph.parse_line(Graph.java:110)
    at net.atomique.ksar.Graph.List.parse_line(List.java:53)
    at net.atomique.ksar.Parser.Linux.parse(Linux.java:261)
    at net.atomique.ksar.kSar.parse(kSar.java:148)
    at net.atomique.ksar.FileRead.run(FileRead.java:78)

https://github.com/ppalucha/ksar2/blob/master/src/main/java/net/atomique/ksar/Graph/Graph.java#L98

How about continuing the loop and ignoring the broken data instead of returning from the method? At least in my case it wouldn't harm anything and would make life a lot easier, because those two broken lines of data aren't even worth tracking down.

ams-tschoening commented 6 years ago

I got confused while writing the issue: the problem is not the return, but the logging in the exception handler:

The header of my data has 13 columns, while some lines only have 12 for some reason. So accessing the column with index 12 fails twice: first when trying to actually parse the data, and again in the catch handler when trying to log the problem. cols[12] is used in both cases, while the largest valid index is 11 for the broken lines. As a result, parsing stops entirely, because the ArrayIndexOutOfBoundsException propagates up the stack.
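To illustrate the double failure described above, here is a minimal sketch. It is not the actual Graph.parse_line code: the class name `BrokenLineDemo`, the start index 3, and the broad `catch (RuntimeException e)` are assumptions made only to reproduce the pattern of a handler that logs `cols[i]`.

```java
class BrokenLineDemo {
    /** Splits a sar line on runs of whitespace. */
    static String[] split(String line) {
        return line.trim().split("\\s+");
    }

    /**
     * Hypothetical parse loop: the header decides how many columns are read.
     * Returns the number of values parsed from the line.
     */
    static int parse(String header, String line) {
        String[] headerCols = split(header);
        String[] cols = split(line);
        int parsed = 0;
        // Assumption: columns 0-2 hold the time, AM/PM and the CPU id.
        for (int i = 3; i < headerCols.length; i++) {
            try {
                // First failure: cols[i] is out of bounds on a short line.
                Double.parseDouble(cols[i]);
                parsed++;
            } catch (RuntimeException e) {
                // Second failure: the handler reads cols[i] again, and this
                // ArrayIndexOutOfBoundsException is not caught by anything.
                System.err.println("unable to parse value " + cols[i]);
            }
        }
        return parsed;
    }
}
```

With the header from the report (13 whitespace-separated tokens) and one of the broken lines (12 tokens), the loop reaches `i = 12` and the exception thrown inside the handler leaves `parse` entirely.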

So either don't log cols[i] in the fallback handler, or add another catch for ArrayIndexOutOfBoundsException that doesn't log cols[i] but only the line itself. I guess that would keep the parser going through the other lines and would be enough to debug the broken line as well.
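The second option could look roughly like the sketch below. It is only an illustration under assumptions, not the change that was made in ksar2: `TolerantLineParser`, the `warnings` list (standing in for whatever logging the project uses), and the start index 3 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TolerantLineParser {
    // Hypothetical stand-in for the project's logger.
    static final List<String> warnings = new ArrayList<>();

    /** Returns the number of values parsed; short lines only produce a warning. */
    static int parse(String header, String line) {
        String[] headerCols = header.trim().split("\\s+");
        String[] cols = line.trim().split("\\s+");
        int parsed = 0;
        // Assumption: columns 0-2 hold the time, AM/PM and the CPU id.
        for (int i = 3; i < headerCols.length; i++) {
            try {
                Double.parseDouble(cols[i]);
                parsed++;
            } catch (NumberFormatException e) {
                // cols[i] exists here, so it is safe to include it in the message.
                warnings.add("unable to parse value '" + cols[i] + "' in line: " + line);
            } catch (ArrayIndexOutOfBoundsException e) {
                // The line is shorter than the header: log the line itself, never cols[i].
                warnings.add("expected " + headerCols.length + " columns but found "
                        + cols.length + " in line: " + line);
                break; // no further columns to read on this line
            }
        }
        return parsed;
    }
}
```

Because the exception never leaves `parse`, the caller can carry on with the next line, and the warning text contains the full broken line for debugging.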

ppalucha commented 6 years ago

Hi, it took me a while ;-) Just thinking: having a corrupted file is a really rare case and usually means something went terribly wrong. So I'm not sure it makes sense to try to recover from such situations. I'll go for just fixing the error message. Thanks!

ppalucha commented 6 years ago

Actually, by default such errors are handled gracefully, so this one will also just print a warning.