rfoltyns / log4j2-elasticsearch

Log4j2 Elasticsearch Appender plugins
Apache License 2.0

Support for ECSLayout for elasticsearch-ahc and/or elasticsearch-jest in combination with data streams #84

Open thaarbach opened 1 year ago

thaarbach commented 1 year ago

Description Support ECSLayout with log4j2-elasticsearch-ahc and/or log4j2-elasticsearch-jest

Why: When using the appender in a centralized logging setup together with elastic-apm in a clustered environment, it is much easier to set up the appender this way. Adding fields provided by elastic-apm (e.g. client.ip, trace.id, transaction.id, http.*, error.*) with VirtualProperties such as <VirtualProperty name="client.ip" value="$${ctx:client.ip}"/> doesn't work.

We have also tried to set up the appender with JestHttp and data streams, which doesn't work either. Maybe it's a configuration issue.

Configuration ahc

<Appenders>
    <Elasticsearch name="elasticsearch">
    <JacksonJsonLayout>
        <JacksonMixIn targetClass="org.apache.logging.log4j.core.LogEvent" mixInClass="org.appenders.log4j2.elasticsearch.json.jackson.LogEventJacksonEcsJsonMixIn"/>
        <NonEmptyFilter/>
        <VirtualProperty name="host.name" value="$${sys:hostName}.example.de"/>
        <VirtualProperty name="service.version" value="$${sys:elastic.apm.service_version}"/>
        <VirtualProperty name="service.name" value="$${sys:elastic.apm.service_name}"/>
        <VirtualProperty name="data_stream.type" value="logs"/>
        <VirtualProperty name="data_stream.dataset" value="$${sys:elastic.apm.service_name}.example.de"/>
        <VirtualProperty name="data_stream.namespace" value="$${sys:elastic.apm.environment}.example.de"/>
        <VirtualProperty name="client.ip" value="$${ctx:client.ip}"/>
        <PooledItemSourceFactory poolName="itemPool"
                                    itemSizeInBytes="1024"
                                    maxItemSizeInBytes="8192"
                                    initialPoolSize="500"
                                    monitored="true"
                                    monitorTaskInterval="10000"
                                    resizeTimeout="500">
            <UnlimitedResizePolicy resizeFactor="0.6"/>
        </PooledItemSourceFactory>
    </JacksonJsonLayout>

    <AsyncBatchDelivery batchSize="500" deliveryInterval="5000">
        <IndexTemplate apiVersion="8" name="log4j2-${sys:elastic.apm.service_name}" path="classpath:composableIndexTemplate.json"/>
        <ILMPolicy name="logs" createBootstrapIndex="false">
            {}
        </ILMPolicy>
        <AHCHttp name="http-main"
                    connTimeout="500"
                    readTimeout="30000"
                    gzipCompression="true"
                    maxTotalConnections="8"
                    serverUris="http://localhost:9200">
            <PooledItemSourceFactory poolName="batchPool"
                                        itemSizeInBytes="5120000"
                                        initialPoolSize="10"
                                        resizeTimeout="500">
                <UnlimitedResizePolicy resizeFactor="0.70"/>
            </PooledItemSourceFactory>
            <ElasticsearchDataStream />
            <BatchLimitBackoffPolicy maxBatchesInFlight="4"/>
            <ServiceDiscovery
                                    refreshInterval="5000"
                                    configPolicies="serverList">
            </ServiceDiscovery>
        </AHCHttp>
    </AsyncBatchDelivery>
    </Elasticsearch>
</Appenders>

Configuration JestHttp

<Appenders>
    <Elasticsearch name="elasticsearch">
    <ECSLayout serviceName="${sys:elastic.apm.service_name}" eventDataset="${sys:elastic.apm.service_name}.log">
        <KeyValuePair key="host.name" value="${sys:hostName}.example.de"/>
        <KeyValuePair key="service.version" value="${sys:elastic.apm.service_version}"/>
        <KeyValuePair key="data_stream.type" value="logs"/>
        <KeyValuePair key="data_stream.dataset" value="${sys:elastic.apm.service_name}"/>
        <KeyValuePair key="data_stream.namespace" value="${sys:elastic.apm.environment}"/>
    </ECSLayout>
    <IndexName indexName="log4j2-${sys:elastic.apm.service_name}"/>
    <ThresholdFilter level="INFO" onMatch="ACCEPT"/>

    <AsyncBatchDelivery deliveryInterval="5000" batchSize="500" shutdownDelayMillis="10000">
        <IndexTemplate apiVersion="8" name="log4j2-${sys:elastic.apm.service_name}" path="classpath:composableIndexTemplate.json"/>
        <ILMPolicy name="logs" createBootstrapIndex="false">
            {}
        </ILMPolicy>
        <JestHttp serverUris="http://localhost:9200" dataStreamsEnabled="true"/>
        <AppenderRefFailoverPolicy>
            <AppenderRef ref="stderr"/>
        </AppenderRefFailoverPolicy>
    </AsyncBatchDelivery>
    </Elasticsearch>
</Appenders>

Additionally, ILMPolicy with createBootstrapIndex only works if an empty template is provided:

<ILMPolicy name="logs" createBootstrapIndex="false">
    {}
</ILMPolicy>

thaarbach commented 1 year ago

Got it working with JestHttp. Forgot to change the dependency 🙈

rfoltyns commented 1 year ago

I'm glad you got it working :clap:

Watch out for the double $ in dynamic VirtualProperties. They need the dynamic="true" flag to resolve correctly:

<VirtualProperty name="client.ip" value="$${ctx:client.ip}" dynamic="true"/>

.. and ctx doesn't work with AsyncLogger - that's due to Log4j2 not copying over thread context vars in async mode.

Also, I highly recommend AHCHttp. It's much more performant and has a much lower footprint than JestHttp. It should work with the netty-all jar. The AHC module has become the new "tip of the spear". It will be the main focus of further development, and part of its code will become the backbone of the http layer in 2.0.

I hacked the HC example to use your configuration. After a few tweaks it worked like a charm (mappings are still missing, but I'm sure you'll figure it out)

Try

mvn clean install && java -jar -Delastic.apm.environment=apm-test -Delastic.apm.service_name=elasticsearch-ahc log4j2-elasticsearch-hc-springboot/target/log4j2-elasticsearch-hc-springboot-0.0.1-SNAPSHOT.jar

and then

curl -XPOST -H 'Content-Type: application/json' -d '{}' http://localhost:9200/log4j2-elasticsearch-ahc/_search | jq

on this branch

thaarbach commented 1 year ago

Hey Rafal,

Ah, I thought the double $ was needed for escaping reasons. I got it from (https://github.com/rfoltyns/log4j2-elasticsearch/tree/master/log4j2-elasticsearch-core#virtual-properties)

.. and ctx doesn't work with AsyncLogger - that's due to Log4j2 not copying over thread context vars in async mode.

If I understand the Log4j 2.x system property log4j2.isThreadContextMapInheritable correctly, the child thread then inherits the Thread Context Map. This is explained here: https://logging.apache.org/log4j/2.x/manual/thread-context.html#configuration. But maybe I am wrong.
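
For context, the Log4j2 manual describes this property as switching the Thread Context Map to an InheritableThreadLocal. The sketch below only illustrates that plain JDK mechanism and does not use Log4j2 at all; the class name `InheritableContextSketch`, the method name, and the value `user-42` are made up for illustration. Note that the copy happens when the child thread is created, so threads that already exist (pooled or long-running ones) would not see later changes.

```java
// Plain-JDK illustration of thread-local inheritance; NOT Log4j2 code.
// All names here are hypothetical.
final class InheritableContextSketch {

    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    static final ThreadLocal<String> INHERITABLE = new InheritableThreadLocal<>();

    // Sets both values on the calling thread, then reads them from a new child thread.
    // Returns { value seen via PLAIN, value seen via INHERITABLE }.
    static String[] readFromChildThread(String value) {
        PLAIN.set(value);
        INHERITABLE.set(value);
        String[] seen = new String[2];
        Thread child = new Thread(() -> {
            seen[0] = PLAIN.get();        // plain ThreadLocal: nothing set on this thread
            seen[1] = INHERITABLE.get();  // copied from the parent at thread creation
        });
        child.start();
        try {
            child.join();                 // wait so the reads above are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = readFromChildThread("user-42");
        System.out.println("plain=" + seen[0] + ", inheritable=" + seen[1]);
        // prints: plain=null, inheritable=user-42
    }
}
```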

Also, I highly recommend AHCHttp. It's much more performant and has much lower footprint that JestHttp.

That's why I tried AHCHttp first ;-).

rfoltyns commented 1 year ago

It works as an escape char, yes. The first $ is replaced somewhere around loading the xml file - Log4j2 goodness - so only one remains, and if VirtualProperty.dynamic=false, it will be replaced while building the serialiser at this line in JacksonSerializer.
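
The two substitution passes described above can be sketched in plain Java. This is a simplified stand-in, not Log4j2's StrSubstitutor: the class `DollarEscapeSketch`, its `substitute` method, and the sample value `10.0.0.7` are all made up for illustration. The first call mimics configuration loading, where `$${...}` only loses one `$`; the second mimics the later lookup, where the remaining `${...}` is resolved.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for one substitution pass; NOT Log4j2's StrSubstitutor.
final class DollarEscapeSketch {

    // Optional extra '$' (the escape), then "${key}".
    private static final Pattern LOOKUP = Pattern.compile("(\\$?)\\$\\{([^}]+)\\}");

    // One pass: "$${key}" loses one '$' and stays unresolved;
    // "${key}" is resolved against 'values' (unknown keys are left as-is).
    static String substitute(String input, Map<String, String> values) {
        Matcher m = LOOKUP.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String key = m.group(2);
            String replacement = m.group(1).isEmpty()
                    ? values.getOrDefault(key, m.group())   // plain lookup
                    : "${" + key + "}";                      // escaped: strip one '$'
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Pass 1 (config load): the escape is stripped, nothing is resolved yet.
        String loaded = substitute("$${ctx:client.ip}", Map.of());
        System.out.println(loaded);      // prints: ${ctx:client.ip}

        // Pass 2 (later lookup): the remaining placeholder is resolved.
        String resolved = substitute(loaded, Map.of("ctx:client.ip", "10.0.0.7"));
        System.out.println(resolved);    // prints: 10.0.0.7
    }
}
```

With a single `$` in the XML, the placeholder would already be looked up in the first pass, before any per-request context value exists.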

the child thread inherit the Thread Context Map

I never got ctx working with StrSubstitutor. If values are copied, they're not copied where VirtualProperty would expect them.

rfoltyns commented 1 year ago

@thaarbach Is there anything else we can address regarding the original issue? Seems like the code - at least in this repository :) - works as advertised

thaarbach commented 1 year ago

@rfoltyns I'm using AsyncLoggers and set some information, e.g. the current user.id, in the ThreadContext (in our case MDC, because we use SLF4J), and it is indexed when log4j2.isThreadContextMapInheritable=true. Maybe there is some magic done by SLF4J.

Is there anything else we can address regarding the original issue? Seems like the code - at least in this repository :) - works as advertised

It would be nice if ECSLayout also worked with elasticsearch-ahc.

rfoltyns commented 1 year ago

Is the log4j2-elasticsearch-examples branch I mentioned above not working as you expect?

thaarbach commented 1 year ago

Is the log4j2-elasticsearch-examples branch I mentioned above not working as you expect?

Didn't try it yet. At the moment I'm fighting with ingest pipelines, because they aren't executed on index requests.

Are ingest pipelines supported per request?

rfoltyns commented 1 year ago

I never played around with these tbh. The Elasticsearch docs mention the index.default_pipeline setting. See what happens once you provide it in index-template/component-template settings. With the data stream and index template setup provided by this plugin, it should work nicely..?

I'll look into Pipeline API and see if there's a possibility to properly support it in 1.7

thaarbach commented 1 year ago

Got it working. The trick is to set the pipeline as index.final_pipeline and force a reindex or rollover.

Old data can be updated with POST my_data_stream/_update_by_query?pipeline=my_pipeline
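
The rollover and the update of old data can be sketched with the JDK's HTTP client. This is only a sketch under assumptions: it presumes index.final_pipeline is already set in the index template, and the class `PipelineRolloutSketch`, the method `buildRequests`, the base URI, and the names my_data_stream and my_pipeline are placeholders. The requests are only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

// Hypothetical helper; builds (but does not send) the two requests.
final class PipelineRolloutSketch {

    static List<HttpRequest> buildRequests(String baseUri, String dataStream, String pipeline) {
        // 1. Roll the data stream over so a new backing index picks up the template settings.
        HttpRequest rollover = HttpRequest.newBuilder()
                .uri(URI.create(baseUri + "/" + dataStream + "/_rollover"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();

        // 2. Re-process documents that were indexed before the pipeline was in place.
        HttpRequest updateOldData = HttpRequest.newBuilder()
                .uri(URI.create(baseUri + "/" + dataStream + "/_update_by_query?pipeline=" + pipeline))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();

        return List.of(rollover, updateOldData);
    }

    public static void main(String[] args) {
        for (HttpRequest request : buildRequests("http://localhost:9200", "my_data_stream", "my_pipeline")) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Each request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running cluster.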

Now I'm able to enrich the log entries with geo data and decode URLs :-)