elastic / beats

Update MongoDB protocol with new opcodes #6191

Open pooqadmin opened 6 years ago

pooqadmin commented 6 years ago

[root@contents-mongo-dev-01 packetbeat-6.1.2-linux-x86_64]# /beat/packetbeat-6.1.2-linux-x86_64/packetbeat -c /beat/packetbeat-6.1.2-linux-x86_64/packetbeat.yml -e

Q) When packetbeat is executed, the following error occurs.

2018/01/26 06:30:01.311175 mongodb_parser.go:42: ERR Unknown operation code:
2018/01/26 06:30:01.311872 mongodb_parser.go:42: ERR Unknown operation code:
2018/01/26 06:30:01.311902 mongodb_parser.go:42: ERR Unknown operation code:
2018/01/26 06:30:01.311928 mongodb_parser.go:42: ERR Unknown operation code:
2018/01/26 06:30:01.403019 mongodb_parser.go:42: ERR Unknown operation code:
2018/01/26 06:30:01.403631 mongodb_parser.go:42: ERR Unknown operation code:

Q) My setup is shown below; I simply linked it to Logstash. Please check the cause of the error above.

[root@contents-mongo-dev-01 packetbeat-6.1.2-linux-x86_64]# cat packetbeat.yml | grep -v '#'

packetbeat.interfaces.device: any

packetbeat.flows:
  timeout: 30s
  period: 10s

packetbeat.protocols:

setup.template.settings:
  index.number_of_shards: 3

setup.kibana:

output.elasticsearch:
  hosts: ["internal-bh-elasticsearch-lb-1602908268.ap-northeast-2.elb.amazonaws.com:9200"]

logging.level: error

adriansr commented 6 years ago

It seems the mongodb parser needs to be updated with new operations added to the protocol.

Btw I will push a quick fix for that error message to print the actual operation code
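
A minimal sketch of what printing the opcode could look like, assuming the standard 16-byte MongoDB wire header (four little-endian int32 fields: messageLength, requestID, responseTo, opCode). The names `parseHeader` and `unknownOpCodeMessage` are hypothetical and are not Packetbeat's actual functions:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerLen is the size of the standard MongoDB wire protocol header.
const headerLen = 16

// msgHeader mirrors the four int32 fields of the wire header.
type msgHeader struct {
	MessageLength int32
	RequestID     int32
	ResponseTo    int32
	OpCode        int32
}

// parseHeader decodes the little-endian header fields from buf.
func parseHeader(buf []byte) (msgHeader, error) {
	if len(buf) < headerLen {
		return msgHeader{}, errors.New("short MongoDB header")
	}
	return msgHeader{
		MessageLength: int32(binary.LittleEndian.Uint32(buf[0:4])),
		RequestID:     int32(binary.LittleEndian.Uint32(buf[4:8])),
		ResponseTo:    int32(binary.LittleEndian.Uint32(buf[8:12])),
		OpCode:        int32(binary.LittleEndian.Uint32(buf[12:16])),
	}, nil
}

// unknownOpCodeMessage (hypothetical helper) includes the numeric opcode,
// so the log line says which operation was not recognized.
func unknownOpCodeMessage(h msgHeader) string {
	return fmt.Sprintf("Unknown operation code: %d", h.OpCode)
}

func main() {
	// A bare header with requestID 1 and opCode 2010 (OP_COMMAND).
	buf := make([]byte, headerLen)
	binary.LittleEndian.PutUint32(buf[0:4], headerLen)
	binary.LittleEndian.PutUint32(buf[4:8], 1)
	binary.LittleEndian.PutUint32(buf[12:16], 2010)

	h, err := parseHeader(buf)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(unknownOpCodeMessage(h))
}
```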

dolftax commented 6 years ago

The missing ones are op_command and op_commandreply. As per the docs, both of these opcodes are deprecated.

Should we really implement this? @adriansr

adriansr commented 6 years ago

@jaipradeesh I see. If they are deprecated, then implementing them might be a waste of time. However, we should at least expect to see those messages and not log an error every time.

dolftax commented 6 years ago

@adriansr Does https://github.com/elastic/beats/pull/6440 work?

dolftax commented 6 years ago

freenode/#mongodb logs - 2018-02-27

15:41 < dolftax> In need of pcap files to test OP_COMMAND and OP_COMMANDREPLY opcodes // How to trigger them?
15:46 <@Derick> dolftax: let me check for you
15:46 <@Derick> dolftax: what do you need the pcap files for?
15:47 <@Derick> dolftax: I believe these are used by the mongos balancer in a sharded environment
15:48 <@Derick> and only with MongoDB 3.4 (although 3.2 and 3.6 support it too for up/downgrade)
15:49 < dolftax> Just need to know it some calls to DB from the client returns such opcodes. Since what queries is made by the client is not in my control, just need to run a quick test on the response opcodes and see if it sends any invalid opcodes.
15:49 < dolftax> invalid -> not whitelisted by us.
15:49 <@Derick> I would strongly suggest you do not white or blacklist any opcodes that MongoDB uses, as this can change in the future without warning
15:51 <@Derick> But: mongos 3.4 balancer commands use it
15:52 <@Derick> and to illustrate that op codes get added all the time: https://derickrethans.nl/wireshark-mongo-36.html
15:53 < dolftax> So, https://docs.mongodb.com/manual/reference/mongodb-wire-protocol/ are not really a whitelist of opcodes?
15:54 <@Derick> no
15:54 <@Derick> it misses things
15:55 <@Derick> do not restrict what MongoDB can send/receive

Derick is with the MongoDB team.

@ruflin In that case, I don't think validating against a whitelist of opcodes is necessary. What do you think?

// @adriansr

ruflin commented 6 years ago

I leave it to @adriansr to comment :-)

dolftax commented 6 years ago

@adriansr What do you think?

adriansr commented 6 years ago

Hi @jaipradeesh

Sorry about taking so long to respond to this, I screwed up my github notifications.

So I understand the whole point here is to have packetbeat ignore these opcodes and not log an error every time it receives one. I don't think this is a problem; maybe Derick from MongoDB understands "whitelist" and "blacklist" differently from how we mean them. We are not blocking these opcodes from being sent, we just don't want to store an event in Elasticsearch or log an error.

Have you been able to test it and confirm that ERR Unknown operation code: is no longer printed with the code in #6440?

dolftax commented 6 years ago

Okay. So, if we don't maintain a list of supported opcodes, ERR Unknown operation code: would be printed in those cases. PR #6440 adds OP_COMMAND and OP_COMMANDREPLY, but another opcode, say OP_COMPRESSED, would still produce ERR Unknown operation code:.
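
One possible way to reconcile the two concerns, sketched under assumptions: keep a table of opcodes that are parsed, a table of opcodes that are recognized but skipped silently, and report anything else only once per distinct value instead of on every packet. The split between the two tables and all names here are illustrative, not Packetbeat's actual code; the numeric values are the ones listed in the MongoDB wire protocol documentation.

```go
package main

import "fmt"

// opClass says what a passive sniffer should do with a message.
type opClass int

const (
	opParsed  opClass = iota // decoded and published as an event
	opIgnored                // recognized, skipped without logging an error
	opUnknown                // present in neither table
)

// Illustrative split only; which opcodes are actually parsed is an assumption.
var parsedOps = map[int32]string{
	1:    "OP_REPLY",
	2001: "OP_UPDATE",
	2002: "OP_INSERT",
	2004: "OP_QUERY",
	2005: "OP_GET_MORE",
	2006: "OP_DELETE",
	2007: "OP_KILL_CURSORS",
}

var ignoredOps = map[int32]string{
	2010: "OP_COMMAND",
	2011: "OP_COMMANDREPLY",
	2012: "OP_COMPRESSED",
}

// classify looks the opcode up in both tables.
func classify(code int32) opClass {
	if _, ok := parsedOps[code]; ok {
		return opParsed
	}
	if _, ok := ignoredOps[code]; ok {
		return opIgnored
	}
	return opUnknown
}

// unknownReporter returns a message only the first time it sees a given
// opcode. It is not safe for concurrent use.
type unknownReporter struct{ seen map[int32]bool }

func newUnknownReporter() *unknownReporter {
	return &unknownReporter{seen: map[int32]bool{}}
}

func (r *unknownReporter) report(code int32) (string, bool) {
	if r.seen[code] {
		return "", false
	}
	r.seen[code] = true
	return fmt.Sprintf("unknown MongoDB opcode %d (reported once)", code), true
}

func main() {
	r := newUnknownReporter()
	for _, code := range []int32{2004, 2010, 2012, 9999, 9999} {
		switch classify(code) {
		case opParsed:
			fmt.Printf("%d: parse %s\n", code, parsedOps[code])
		case opIgnored:
			fmt.Printf("%d: skip %s quietly\n", code, ignoredOps[code])
		case opUnknown:
			if msg, first := r.report(code); first {
				fmt.Println(msg)
			}
		}
	}
}
```

With this shape, a future opcode that is missing from both tables costs one log line per value rather than one per packet.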

cwurm commented 6 years ago

MongoDB added another opcode, OP_MSG, in version 3.6. Since it seems to be already used in the wild, it would make sense to support it as well if we can. We can open a separate issue if needed.

pohzipohzi commented 6 years ago

I'm interested in taking up this issue (adding the new OP_MSG opcode), but I'm not sure how best to do so while preserving backward compatibility. OP_MSG previously existed as opCode 1000 (see this) and is already defined as such in the code.

One way I can think of is to query MongoDB for its version when packetbeat starts, but that might require introducing a new dependency (e.g. mongo-go-driver). Being able to differentiate versions could also be useful in the future, as MongoDB opcodes seem to change quite frequently. Otherwise, if we are not concerned about backward compatibility, we can simply replace the old code with the new one.

Should I also open a separate issue for this?

adriansr commented 6 years ago

@pohzipohzi to me querying mongoDB is a big no-no. Packetbeat analysis should stay passive. Isn't the protocol versioned so we can tell which version is in use?

cwurm commented 6 years ago

@pohzipohzi Looking at the MongoDB documentation it looks like the "new" OP_MSG has a value of 2013, so I suspect you could just add logic for that?

pohzipohzi commented 6 years ago

@cwurm My concern was that if MongoDB reuses one of the existing opcode values in the future, this solution might become a problem. Assuming that does not happen, this is a fine option.

@adriansr I agree that querying MongoDB is not the way to go. I have not explored protocol versioning, but I think I will stick with @cwurm's solution for now.
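
A minimal sketch of why no version detection is needed for this particular case: the old and new OP_MSG have different numeric values (1000 and 2013, per the comments above), so a dispatch on the header's opCode field can tell them apart passively. `msgKind` and its return labels are hypothetical names for illustration only:

```go
package main

import "fmt"

const (
	opMsgLegacy int32 = 1000 // the older OP_MSG already defined in the code
	opMsg       int32 = 2013 // the OP_MSG added in MongoDB 3.6
)

// msgKind (hypothetical) picks a parser label from the opCode alone,
// without asking the server for its version.
func msgKind(opCode int32) string {
	switch opCode {
	case opMsgLegacy:
		return "legacy-op-msg"
	case opMsg:
		return "op-msg"
	default:
		return "other"
	}
}

func main() {
	for _, code := range []int32{1000, 2013, 2004} {
		fmt.Printf("%d -> %s\n", code, msgKind(code))
	}
}
```

This only holds as long as MongoDB does not reassign an existing value, which is the caveat raised above.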

Thanks for the input!

tuthan commented 6 years ago

Waiting for this one. Our MongoDB monitoring uses this, and we are going to upgrade to MongoDB 4.0 on PROD this week. After that we will lose the ability to see the MongoDB traffic.

goroi commented 5 years ago

Hi, is there any update on this issue? We have upgraded to Mongo 4.0.6, and packetbeat-6.6.2 still does not support Mongo 4.

chinaxushi commented 3 years ago

The latest version of packetbeat (7.15) can now recognize MongoDB OP_MSG messages, but there seem to be some defects. When packetbeat parses the OP_MSG type, the mongodb output field is empty. More importantly, the event output field has no end time and no overall response time (duration).

Parsing output of type OP_MSG:

"mongodb": {},
"resource": "",
"event": {
  "start": "2021-10-23T08:28:16.778Z",
  "category": ["network_traffic", "network"],
  "type": ["connection", "protocol"],
  "kind": "event",
  "dataset": "mongodb"
}

Parsing output of earlier types:

"mongodb": {
  "fullCollectionName": "admin.$cmd",
  "numberToSkip": 0,
  "numberToReturn": 4294967295,
  "cursorId": 0,
  "startingFrom": 0,
  "numberReturned": 1
},
"resource": "admin.$cmd",
"event": {
  "type": ["connection", "protocol"],
  "kind": "event",
  "dataset": "mongodb",
  "duration": 139884,
  "start": "2021-07-27T08:27:27.473Z",
  "end": "2021-07-27T08:27:27.473Z",
  "category": ["network_traffic", "network"]
},
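
For illustration, here is a sketch of how an end time and duration could be derived once a response is matched to its request. It assumes the response's responseTo header field echoes the request's requestID, and that the duration is reported in nanoseconds, as in the earlier-type output above. The `tracker` type and its methods are hypothetical and do not describe how Packetbeat actually correlates messages:

```go
package main

import "fmt"

// timing holds the values that would fill event.start, event.end
// and event.duration (all in nanoseconds here).
type timing struct {
	StartNanos    int64
	EndNanos      int64
	DurationNanos int64
}

// tracker remembers when each outstanding request was seen.
type tracker struct{ pending map[int32]int64 }

func newTracker() *tracker {
	return &tracker{pending: map[int32]int64{}}
}

// onRequest records the capture timestamp of a request.
func (t *tracker) onRequest(requestID int32, tsNanos int64) {
	t.pending[requestID] = tsNanos
}

// onResponse matches a response to its request and computes the duration.
// It returns false when no matching request was recorded.
func (t *tracker) onResponse(responseTo int32, tsNanos int64) (timing, bool) {
	start, ok := t.pending[responseTo]
	if !ok {
		return timing{}, false
	}
	delete(t.pending, responseTo)
	return timing{StartNanos: start, EndNanos: tsNanos, DurationNanos: tsNanos - start}, true
}

func main() {
	t := newTracker()
	t.onRequest(42, 1_000_000)
	if tm, ok := t.onResponse(42, 1_139_884); ok {
		fmt.Printf("start=%d end=%d duration=%d\n", tm.StartNanos, tm.EndNanos, tm.DurationNanos)
	}
}
```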

botelastic[bot] commented 1 year ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

chinaxushi commented 1 year ago

^_^

Weranders commented 3 months ago

This is still very relevant. I'm surprised it doesn't have more attention. :+1:

jacbo0112 commented 3 months ago

#28858 has a detailed description of the problem.