neo4j / neo4j-browser

Neo4j Browser is the general purpose user interface for working with Neo4j. Query, visualize, administrate and monitor the database.
https://neo4j.com
GNU General Public License v3.0
678 stars 346 forks source link

[Logging] annotationData as json structure instead of java map toString() output #1926

Closed ikwattro closed 1 year ago

ikwattro commented 1 year ago

Setup

Version: 5.6.0-enterprise OS: any

Transaction metadata can be set for every transaction executed against neo4j from a client, for eg

chris@neo4j> :begin
chris@neo4j# CALL tx.setMetaData({app: 'my-app'});
0 rows
ready to start consuming query after 22 ms, results consumed after another 0 ms
chris@neo4j# MATCH (n) RETURN count(n);
+----------+
| count(n) |
+----------+
| 2403     |
+----------+

1 row
ready to start consuming query after 11 ms, results consumed after another 1 ms
chris@neo4j# :commit
chris@neo4j>

When query logging is enabled and configured to be logged as json format, this is the following output :

{"time":"2023-03-25 18:02:02.801+0000","level":"INFO","event":"success","type":"query","elapsedTimeMs":11,"allocatedBytes":176,"source":"bolt-session\tbolt\tneo4j-cypher-shell/v4.4.14\t\tclient/172.30.0.1:38856\tserver/172.30.0.2:7687>","database":"neo4j","username":"chris","executingUser":"chris","authenticatedUser":"chris","query":"MATCH (n) RETURN count(n);","queryParameters":"{}","runtime":"pipelined","annotationData":"{app: 'my-app'}"}

As you can see, the annotationData field of the json body contains a standard java output of the Map::toString() method, preventing common log query processors to access easily values inside that field as they're simple strings and we need to resort to regexes or even in log4j2 filters ( like here https://github.com/ikwattro/neo4j-log4j2-examples/blob/main/tx-metadata-filter/server-logs.xml#L50 ) where normally we could just filter on a nested field.

Expected :

When json query logging is enabled, the value of the annotationData field should be a proper json object string, in this case :

"annotationData": {"app:" "my-app"}
klaren commented 1 year ago

Hello and thanks for the suggestion!

The output is formatted as a cypher map, which is similar to Map::toString(). We cannot change this in a backward-compatible way, but we could add another configuration setting, e.g. db.logs.query.annotationDataAsJson=true, to change this behavior. I think this is more reasonable, so I hope we can make that the default setting for the next major version.

I hope we can get this done for the next release.

ikwattro commented 1 year ago

Hi @klaren Thanks for replying. Having a setting to be able to use the new behaviour would be really amazing, looking forward to it.

klaren commented 1 year ago

@ikwattro I've added the setting db.logs.query.annotation_data_as_json_enabled that will toggle this behavior, the default is off. It should be part of the next release, Neo4j 5.8, scheduled for later this month.

ikwattro commented 1 year ago

Hi @klaren

I've been able to test it, its awesome. Thanks a lot ! Closing this ticket now.

ikwattro commented 1 year ago

Hi @klaren ,

I know this ticket is fixed but I wanted to share some feedback. I believe the implementation is that it will affect only the display of the annotation data but internally its different.

The reason I'm asking is that if for eg I want to prevent any logs of type system to be produced, I can't do this right now in log4j2 ( it just doesn't filter things out )

<Console name="ConsoleAppender" target="SYSTEM_OUT">
            <Filters>
                <MapFilter onMatch="DENY" onMismatch="ACCEPT">
                    <KeyValuePair key="annotationData.type" value="system"/>
                </MapFilter>
            </Filters>
            <JsonTemplateLayout eventTemplateUri="classpath:org/neo4j/logging/QueryLogJsonLayout.json"/>
        </Console>

I think it would be a great 6.x improvement as we have some customers dealing with very large query logs which would benefit from this, for eg being able to not log queries generated by internal ETL tooling.

klaren commented 1 year ago

Oh, so the map filter does not support nested MapMessages, that's a bit unfortunate. The convention appears to be collapsing the map with dots, e.g.

{
    "annotationData": {
        "type": "system",
        "source": "foo"
    }
}

to

{
    "annotationData.type": "system",
    "annotationData.source": "foo"
}

Since the newly introduced behavior breaks the convention I think we should update this for the next release, treating it as a bug. I'll see what can be done.

klaren commented 1 year ago

@ikwattro Neo4j 5.9 will come with the flattened objects as described above. Thanks for finding this shortcoming!

ikwattro commented 1 year ago

@klaren I tested Neo4j 5.9 and I confirm the following log4j2 filter works as expected :

<Console name="ConsoleAppender" target="SYSTEM_OUT">
            <Filters>
                <MapFilter onMatch="DENY" onMismatch="ACCEPT">
                    <KeyValuePair key="annotationData.txMetadata.type" value="system"/>
                </MapFilter>
            </Filters>
            <JsonTemplateLayout eventTemplateUri="classpath:org/neo4j/logging/QueryLogJsonLayout.json"/>
        </Console>

Thanks a lot for the support !

ikwattro commented 1 year ago

@klaren Taking back what I said :)

It seems that :

When its a system query, the flattened version is annotationData.txMetadata.type , but when its a user query, it's annotationData.type , is it expected ?

neo4j-5-docker-neo4j-1  | {"time":"2023-06-16 08:12:19.127+0000","level":"INFO","allocatedBytes":-1,"annotationData.txMetadata.app":"neo4j-browser_v5.9.0","annotationData.txMetadata.type":"system","authenticatedUser":"neo4j","boltAgent":{},"database":"system","elapsedTimeMs":7,"event":"success","executingUser":"neo4j","id":"331","pageFaults":0,"pageHits":0,"planning":4,"query":"CALL dbms.showCurrentUser()","queryParameters":"{}","runtime":"system","source":"bolt-session\tbolt\tneo4j-browser/v5.9.0\t\tclient/192.168.96.1:48258\tserver/192.168.96.2:7687>","transactionId":701,"type":"query","waiting":0}

vs

neo4j-5-docker-neo4j-1  | {"time":"2023-06-16 08:12:40.504+0000","level":"INFO","allocatedBytes":0,"annotationData.app":"neo4j-browser_v5.9.0","annotationData.type":"user-direct","authenticatedUser":"neo4j","boltAgent":{},"database":"neo4j","elapsedTimeMs":0,"event":"start","executingUser":"neo4j","id":"347","pageFaults":0,"pageHits":0,"planning":0,"query":"MATCH (n) RETURN count(n)","queryParameters":"{}","runtime":"null","source":"bolt-session\tbolt\tneo4j-browser/v5.9.0\t\tclient/192.168.96.1:49754\tserver/192.168.96.2:7687>","transactionId":53,"type":"query","waiting":0}
klaren commented 1 year ago

@ikwattro no, that is not expected. How are you setting the metadata?

ikwattro commented 1 year ago

I'm not setting anything, just using the neo4j browser.

On Fri, 16 Jun 2023 at 16:32, Anton Klarén @.***> wrote:

@ikwattro https://github.com/ikwattro no, that is not expected. How are you setting the metadata?

— Reply to this email directly, view it on GitHub https://github.com/neo4j/neo4j-browser/issues/1926, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJKK6JRY4H33DPTBFNGYDTXLRU6FANCNFSM6AAAAAAWHVIY24 . You are receiving this because you were mentioned.Message ID: @.***>

--

Christophe Willemsen | CTO | GraphAware

+39 345 415 5709

@.*** | @graph_aware https://twitter.com/graph_aware | www.graphaware.com

Start your graph journey with Hume https://graphaware.com/products/hume/: Mission-critical graph analytics!

This communication and information in it is to be treated as confidential, may be the subject of legal privilege and is protected by copyright. If you are not the intended recipient of this communication, please contact the sender immediately.

klaren commented 1 year ago

Then it sounds like the javascript driver that is bundled with the browser is doing something inconsistent. I'll do some digging and get back to you.

OskarDamkjaer commented 1 year ago

@ikwattro Hello! This was indeed caused by Browser, we made the background polling more efficient in 5.9.0 and this bug slipped in.

klaren commented 1 year ago

@ikwattro Just a heads up, the next release(5.12.0) will include a new setting based on additional feedback, db.logs.query.annotation_data_format, which will replace db.logs.query.annotation_data_as_json_enabled.

Possible values for db.logs.query.annotation_data_format

If db.logs.query.annotation_data_as_json_enabled is set to true, it will be treated as db.logs.query.annotation_data_format=FLAT_JSON to preserve backward compatibility. This new setting will allow for greater flexibility and support more use cases.

ikwattro commented 1 year ago

Hi @klaren , great, I think its even better, I had to do some funny formatting in loki to handle baz.k ( nothing major, I was living happy with it )

Feel free to send me a snapshot of 5.12 over Slack so I can test it before ;-)

Cheers

Christophe