nielsbasjes / logparser

Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Flink, Beam, Storm, Drill, ...
Apache License 2.0

Throws exceptions when missing field keys #257

Closed hiql closed 1 year ago

hiql commented 1 year ago

Hi, Bro:

I use your library to parse Tomcat access logs, and I get an exception because the field key "BYTES:request.body.bytes" is missing. I understand that this is intentional behavior, but some fields are not important to us; we would like processing to continue, with a null value assigned to the missing fields.

What should I do?

In TomcatAccessLog.java


    @Field("BYTES:request.body.bytes")
    public void setBytesReceived(final String value) {
        results.put("BYTES:request.body.bytes", value);
    }

format: %h %l %u %t "%r" %s %b

log: 223.104.131.76 - - [09/Feb/2023:00:00:17 +0800] "POST /cashier/submit HTTP/1.1" 200 1554

exception: nl.basjes.parse.core.exceptions.MissingDissectorsException: BYTES:request.body.bytes

nielsbasjes commented 1 year ago

You are asking for a field that does not exist in the specified logformat.

You have the response bytes (%b) yet you ask for the request bytes ... which is simply not part of the provided logformat.

As a consequence you get this intentional error because this cannot yield a value in any situation.
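To make the reasoning concrete, here is a minimal self-contained sketch of this kind of check. It is a toy model, not the library's actual implementation: the class name `FieldAvailabilitySketch`, its methods, and the lookup table are hypothetical, and only `%b` is modelled, using the field name discussed in this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model (assumption): NOT how logparser works internally.
class FieldAvailabilitySketch {

    // Hypothetical lookup table. Only %b is modelled; the other tokens
    // in the format are deliberately left out of this sketch.
    static final Map<String, String> TOKEN_TO_FIELD =
            Map.of("%b", "BYTES:response.body.bytes");

    // Collect every field that the given format could ever produce.
    public static Set<String> possibleFields(String logFormat) {
        Set<String> fields = new LinkedHashSet<>();
        for (String token : logFormat.split("\\s+")) {
            String field = TOKEN_TO_FIELD.get(token);
            if (field != null) {
                fields.add(field);
            }
        }
        return fields;
    }

    // Return the requested fields that the format can never yield.
    public static List<String> missingFields(String logFormat, List<String> requested) {
        Set<String> possible = possibleFields(logFormat);
        List<String> missing = new ArrayList<>();
        for (String field : requested) {
            if (!possible.contains(field)) {
                missing.add(field);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String logFormat = "%h %l %u %t \"%r\" %s %b";
        // prints [BYTES:request.body.bytes]
        System.out.println(missingFields(logFormat, List.of("BYTES:request.body.bytes")));
        // prints []
        System.out.println(missingFields(logFormat, List.of("BYTES:response.body.bytes")));
    }
}
```

In this model the request-bytes field is reported as missing for every possible log line, because the check depends only on the format string and never on the data.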

I see the Tomcat documentation shows:

%b - Bytes sent, excluding HTTP headers, or '-' if zero

I consider this confusing: sent to the server or sent to the client?

The Apache HTTPD documentation says it more clearly:

%b - Size of response in bytes, excluding HTTP headers. In CLF format, i.e. a '-' rather than a 0 when no bytes are sent.

When using my parser, the easiest way to find out what you CAN get out of a logformat is this:

    String logFormat = "%h %l %u %t \"%r\" %s %b";
    Parser<Object> parser = new HttpdLoglineParser<>(Object.class, logFormat);
    parser.getPossiblePaths().forEach(System.out::println);

which will give you (among many other things):

    BYTES:response.body.bytes

hiql commented 1 year ago

I see.

I found this method recordHttpdLoglineParser.ignoreMissingDissectors();

Thanks for your help:)

nielsbasjes commented 1 year ago

Just to be clear: the system is telling you that you will never receive a value for the requested field, because there is no possible situation in which it can calculate this value. And your solution is to ignore this error.
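For anyone reading along, the trade-off can be sketched roughly as follows. This is a self-contained toy, not the library's code: `LenientFieldsSketch` and `collect` are hypothetical names, and the `ignoreMissing` flag only imitates the general idea of skipping fields that cannot be produced. The value 1554 is the `%b` value from the log line posted above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (assumption): imitates strict versus lenient handling of
// requested fields that the parsed line cannot provide.
class LenientFieldsSketch {

    public static Map<String, String> collect(Map<String, String> parsed,
                                              List<String> requested,
                                              boolean ignoreMissing) {
        Map<String, String> results = new HashMap<>();
        for (String field : requested) {
            if (parsed.containsKey(field)) {
                results.put(field, parsed.get(field));
            } else if (!ignoreMissing) {
                // Strict mode: fail loudly, the value can never exist.
                throw new IllegalStateException("Missing field: " + field);
            }
            // Lenient mode: the field is skipped, so results.get(field) is null.
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = Map.of("BYTES:response.body.bytes", "1554");
        List<String> requested =
                List.of("BYTES:request.body.bytes", "BYTES:response.body.bytes");

        Map<String, String> lenient = collect(parsed, requested, true);
        System.out.println(lenient.get("BYTES:request.body.bytes"));  // prints null
        System.out.println(lenient.get("BYTES:response.body.bytes")); // prints 1554

        try {
            collect(parsed, requested, false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints Missing field: BYTES:request.body.bytes
        }
    }
}
```

The lenient run keeps going, but the request-bytes entry is null for every line, which is why the strict error is the more informative default.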

Curious: What is the benefit of doing it this way in your project?