A tool that allows you to analyze, sort, filter, and search your logs. The added value is that all data is stored in Java objects that can be used for other purposes such as assertions, reporting, and automated decision making.

log-parser


This project was created to allow us to parse and analyze log files in order to gather relevant data. It can be used as is, or as an SDK where you define your own parsing.

The basic method for using this library is that you create a definition for your parsing. This definition allows you to parse a set of log files and extract all entries that match this pattern.

The Processes


Installation

For now we are using this library with Maven; in later iterations we will publish examples for other build systems:

Maven

The following dependency needs to be added to your pom file:

<dependency>
    <groupId>com.adobe.campaign.tests</groupId>
    <artifactId>log-parser</artifactId>
    <version>1.0.10</version>
</dependency>

Parse Definitions

In order to parse logs you need to define a ParseDefinition. A ParseDefinition contains a set of ordered ParseDefinition Entries. While parsing a line of logs, the LogParser checks whether all entries can be found in that line. If so, the line is stored according to the definitions.

Defining a Parsing

Each Parse Definition consists of a title, an ordered set of Parse Definition Entries, and options such as the key order and key padding (see the JSON example below).

Defining an entry

Each entry for a Parse Definition allows us to define a title, the start and end strings that delimit the data to extract, and options such as case sensitivity and whether the extracted value is preserved in the result.
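
For illustration, here is a hedged sketch of assembling a definition. The setTitle, setStart, and setEnd setters appear in the code example further down; the ParseDefinition constructor and addEntry are assumptions based on the JSON structure shown later.

```java
// Sketch: extract the text between '"' and ' /' and call it "verb"
ParseDefinitionEntry l_verbEntry = new ParseDefinitionEntry();
l_verbEntry.setTitle("verb");
l_verbEntry.setStart("\"");
l_verbEntry.setEnd(" /");

// Group the ordered entries into a Parse Definition
// (constructor argument and addEntry are assumptions)
ParseDefinition l_definition = new ParseDefinition("REST calls");
l_definition.addEntry(l_verbEntry);
```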

How parsing works

When you have defined your parsing you use the LogDataFactory by passing it:

  1. The log files it should parse
  2. The ParseDefinition

By using the LogDataFactory we get a LogData object, which allows us to manage the log data you have found.
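
As a minimal sketch of this flow (the factory method name generateLogData and its exact parameters are assumptions; only LogDataFactory itself is named in this document):

```java
// Hypothetical log file paths and a previously built definition
List<String> l_logFiles = Arrays.asList("logs/access-1.log", "logs/access-2.log");
ParseDefinition l_definition = fetchParseDefinition(); // hypothetical helper returning your definition

// Parse the files and collect the matching entries (method name is an assumption)
LogData<GenericEntry> l_logData = LogDataFactory.generateLogData(l_logFiles, l_definition);
```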

Parsing a log line

Anonymizing Data

We have discovered that it is useful to anonymize data. This allows you to group log data that contains variables. Anonymization has two features: full anonymization using the {} notation, and partial anonymization using the [] notation.

For example if you store an anonymizer with the value:

Storing key '{}' in the system

the log-parser will merge all lines that contain the same text but with different values for the key. For example, lines such as:

Storing key 'G' in the system
Storing key 'H' in the system

will all be stored as Storing key '{}' in the system.

Sometimes we just want to anonymize part of a line. This is useful if you want to do post-treatment. In our previous example, Storing key 'G' in the system would be merged; however, NEO-1234 : Storing key 'G' in the system would not be. In this case we can do a partial anonymization using the [] notation, for example by enriching our original template:

[]Storing key '{}' in the system

In this case, lines such as NEO-1234 : Storing key 'G' in the system will also be matched and grouped.

Code Example

Here is an example of how we can parse a string. The same method is leveraged to perform the parsing on one or many files.

@Test
public void parseAStringDemo() throws StringParseException {
    String logString = "afthostXX.qa.campaign.adobe.com:443 - - [02/Apr/2022:08:08:28 +0200] \"GET /rest/head/workflow/WKF193 HTTP/1.1\" 200 ";

    //Create a parse definition
    ParseDefinitionEntry verbDefinition = new ParseDefinitionEntry();
    verbDefinition.setTitle("verb");
    verbDefinition.setStart("\"");
    verbDefinition.setEnd(" /");

    ParseDefinitionEntry apiDefinition = new ParseDefinitionEntry();
    apiDefinition.setTitle("path");
    apiDefinition.setStart(" /");
    apiDefinition.setEnd(" ");

    List<ParseDefinitionEntry> definitionList = Arrays.asList(verbDefinition,apiDefinition);

    //Perform Parsing
    Map<String, String> parseResult = StringParseFactory.parseString(logString, definitionList);

    //Check Results
    assertThat("We should have an entry for verb", parseResult.containsKey("verb"));
    assertThat("We should have the correct value for logDate", parseResult.get("verb"), is(equalTo("GET")));

    assertThat("We should have an entry for the API", parseResult.containsKey("path"));
    assertThat("We should have the correct value for logDate", parseResult.get("path"),
            is(equalTo("rest/head/workflow/WKF193")));
}

In the code above we parse the log line below in order to find the REST call "GET /rest/head/workflow/WKF193", extracting the verb "GET" and the path "rest/head/workflow/WKF193":

afthostXX.qa.campaign.adobe.com:443 - - [02/Apr/2022:08:08:28 +0200] "GET /rest/head/workflow/WKF193 HTTP/1.1" 200

The code starts with the creation of a parse definition containing at least two parse definition entries, which tell us between which markers each piece of data should be extracted. The parse definition is then handed to the StringParseFactory so that the data can be extracted. At the end we can see that each piece of data is stored in a map, with the parse definition entry title as its key.

Import and Export

You can import or store a Parse Definition to or from a JSON file.

Importing a JSON File

You can define a Parse Definition in a JSON file.

This can then be imported and used for parsing with the method ParseDefinitionFactory.importParseDefinition. Here is a small example of what the JSON looks like:

{
  "title": "Anonymization",
  "storeFileName": false,
  "storeFilePath": false,
  "storePathFrom": "",
  "keyPadding": "#",
  "keyOrder": [],
  "definitionEntries": [
    {
      "title": "path",
      "start": "HTTP/1.1|",
      "end": "|Content-Length",
      "caseSensitive": false,
      "trimQuotes": false,
      "toPreserve": true,
      "anonymizers": [
        "X-Security-Token:{}|SOAPAction:[]"
      ]
    }
  ]
}
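
Assuming the JSON above is stored in a file, importing it could look like the following sketch. The file path is hypothetical, and the parameter type of importParseDefinition is an assumption.

```java
ParseDefinition l_definition = ParseDefinitionFactory
        .importParseDefinition("src/test/resources/parseDefinition.json"); // hypothetical path
```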

Extracting Data from Logs

Using the Standard Method

By default, each entry of your log parsing will be stored as a GenericEntry, which means that all values are stored as Strings. Each entry will have a key, a set of values corresponding to the parse definition entries, and the frequency with which the key was found in the logs.

Using the SDK

Using the log parser as an SDK allows you to define your own transformations and also to override many of the behaviors.

Writing your own SDK

In order to use this feature you need to define a class that extends the class StdLogEntry.

You will often want to transform the parsed information into a more manageable object by defining your own fields in the SDK class.

Declaring a Default and Copy Constructor

You will need to declare a default constructor and a copy constructor. The copy constructor will allow you to copy the values from one object to another.

Declaring the transformation Rules in setValuesFromMap

You will need to declare how the parsed variables are transformed into your SDK class. This is done in the method setValuesFromMap().

There you can define a fine-grained extraction of the variables. This could be extracting data hidden inside the parsed strings, or simple transformations such as converting values to integers or dates.

Declaring the Key

You will need to define what a unique line looks like. Although this is already done in the Definition Rules, you may want to provide more precision. This is done in the method makeKey().

Declaring the HeaderMap and ValueMap

Depending on the fields you have defined, you will want to define how the results are represented when they are stored in your system.

You will need to give names to the headers, and provide a map that extracts the values.
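
As a minimal sketch, an SDK class could look as follows. Only setValuesFromMap() and makeKey() are named in this document; the field names, the exact signatures, and the copy-constructor body are assumptions for illustration.

```java
import java.util.Map;

// Hypothetical entry that turns the parsed strings into typed fields
public class ApiCallEntry extends StdLogEntry {

    private String verb;
    private int statusCode;

    // Default constructor
    public ApiCallEntry() {
        super();
    }

    // Copy constructor: copies the values from another entry
    public ApiCallEntry(ApiCallEntry in_entry) {
        this.verb = in_entry.verb;
        this.statusCode = in_entry.statusCode;
    }

    // Transformation rules: how the parsed variables become our fields
    // (the signature of setValuesFromMap is an assumption)
    @Override
    public void setValuesFromMap(Map<String, String> in_valueMap) {
        this.verb = in_valueMap.get("verb");
        this.statusCode = Integer.parseInt(in_valueMap.get("status"));
    }

    // Defines what makes a line unique
    @Override
    public String makeKey() {
        return verb + "#" + statusCode;
    }
}
```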

Code Structure

Below is a diagram representing the class structure:

The Class relationship

Searching and organizing log data

As of versions 1.0.4 & 1.0.5 we have a series of methods for searching and organizing the log data.

Search and Filter Mechanisms

We have introduced filter and search mechanisms. These allow you to search the LogData for values of a given ParseDefinitionEntry. For this we have introduced the methods isEntryPresent, searchEntries, and filterBy.

We currently have the following signatures:

public boolean isEntryPresent(String in_parseDefinitionName, String in_searchValue)
public boolean isEntryPresent(Map<String, Object> in_searchKeyValues)
public LogData<T> searchEntries(String in_parseDefinitionName, String in_searchValue)
public LogData<T> searchEntries(Map<String, Object> in_searchKeyValues)
public LogData<T> filterBy(Map<String, Object> in_filterKeyValues)

In the cases where the method accepts a map we allow the user to search by a series of search terms. Example:

Map<String, Object> l_filterProperties = new HashMap<>();
l_filterProperties.put("Definition 1", "14");
l_filterProperties.put("Definition 2", "13");

LogData<GenericEntry> l_foundEntries = l_logData.searchEntries(l_filterProperties);

GroupBy Mechanisms

We have introduced the groupBy mechanism. This functionality allows you to organize your results in more detail. Given a log data object and an array of ParseDefinitionEntry names, we generate a new LogData object containing groups made from the given ParseDefinitionEntries and the number of entries for each group.

Let's take the following case:

| Definition 1 | Definition 2 | Definition 3 | Definition 4 |
| ------------ | ------------ | ------------ | ------------ |
| 12           | 14           | 13           | AA           |
| 112          | 114          | 113          | AAA          |
| 120          | 14           | 13           | AA           |

If we perform a groupBy with the parse definition entry Definition 2, we get a new LogData object with two entries:

| Definition 2 | Frequence |
| ------------ | --------- |
| 14           | 2         |
| 114          | 1         |

We can also pass a list of group-by items, or even chain the group-by predicates.

Passing a list

We can create a sub-group of the LogData by passing a list of entry names to the groupBy function:

LogData<GenericEntry> l_myGroupedData = logData.groupBy(Arrays.asList("Definition 1", "Definition 4"));

//or 

LogData<MyImplementationOfStdLogEntry> l_myGroupedData = logData.groupBy(Arrays.asList("Definition 1", "Definition 4"), MyImplementationOfStdLogEntry.class);

In this case we get :

| Definition 1 | Definition 4 | Frequence |
| ------------ | ------------ | --------- |
| 12           | AA           | 1         |
| 112          | AAA          | 1         |
| 120          | AA           | 1         |

Chaining GroupBy

The GroupBy can also be chained. Example:

LogData<GenericEntry> l_myGroupedData = logData.groupBy(Arrays.asList("Definition 1", "Definition 4")).groupBy("Definition 4");

In this case we get :

| Definition 4 | Frequence |
| ------------ | --------- |
| AA           | 2         |
| AAA          | 1         |

Comparing Log Data

As of version 1.11.0 we have introduced the possibility to compare two LogData objects. This is a light compare that checks, for a given key, whether it is absent, added, or changed in frequency. The method compare returns a LogDataComparison object that contains the results of the comparison. A comparison can be of three types:

Apart from this, we also return the following change values:

These values are negative if the values have decreased.

A comparison is performed with the method LogData.compare(LogData&lt;T&gt; in_logData), which returns the LogDataComparison object described above.

Creating a Differentiation Report

We can generate an HTML report in which the differences are highlighted. This is done with the method LogDataFactory.generateComparisonReport(LogData reference, LogData target, String filename).
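
Putting the two methods above together, a hedged sketch (the LogData variables and the report file name are hypothetical):

```java
// l_reference and l_target are two previously generated LogData objects
LogDataComparison l_comparison = l_reference.compare(l_target);

// Generate an HTML report highlighting the differences
LogDataFactory.generateComparisonReport(l_reference, l_target, "diffReport.html");
```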

Assertions and LogDataAssertions

As of version 1.0.5 we have introduced the notion of assertions. Assertions can either take a LogData object or a set of files as input.

We currently have the following assertions:

AssertLogData.assertLogContains(LogData<T> in_logData, String in_entryTitle, String in_expectedValue)

AssertLogData.assertLogContains(List<String> in_filePathList, ParseDefinition in_parseDefinition, String in_entryTitle, String in_expectedValue)

AssertLogData.assertLogContains(LogData&lt;T&gt;, String, String) allows you to perform an assertion on an existing LogData object.

AssertLogData.assertLogContains(List<String>, ParseDefinition, String, String) allows you to perform an assertion directly on a file.
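
For example, using the first signature on the LogData object from the parsing sketch earlier:

```java
// Assert that the parsed logs contain an entry titled "verb" with the value "GET"
AssertLogData.assertLogContains(l_logData, "verb", "GET");
```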

Exporting Results to a CSV File

You can export the log data results into a CSV file. The file name is a concatenation of the Parse Definition name and the suffix "-export.csv".

Release Notes

1.11.0 (next version)

1.0.10