rueckstiess / mtools

A collection of scripts to set up MongoDB test environments and parse and visualize MongoDB log files.
Apache License 2.0
1.88k stars 400 forks source link

why mloginfo result has count zero and other statistics is dash #669

Closed zhugw closed 6 years ago

zhugw commented 6 years ago

After executing mloginfo mongo.log --queries a little confused with some results

namespace operation pattern count min (ms)    max (ms)    mean (ms)    95%-ile (ms)    sum (ms)
foo.bar1 update {"guid": 1} 0    -           -           -            n/a             -
foo.bar2 update {"guid": 1} 0    -           -           -            n/a             -
foo.bar3 update {"start_tag": 1, "storage_time": 1} 5         107         150          132    n/a             663

why some count of update records could be 0 and the other statistics item is -?

In mongo.log the detail like this 2018-10-10T09:30:22.768+0000 I WRITE [conn1119708] update foo.bar query: { guid: "http://www.csteelnews.com/qypd/scyx/201810/t20181010_370882.html_0" } update: { $set: { ... }, $setOnInsert: { ... } } keysExamined:0 docsExamined:0 nMatched:1 nModified:1 upsert:1 keyUpdates:0 writeConflicts:0 numYields:0 locks:{ Global: { acquireCount: { r: 2, w: 2 } }, Database: { acquireCount: { w: 2 } }, Collection: { acquireCount: { w: 1 } }, Metadata: { acquireCount: { w: 1 } }, oplog: { acquireCount: { w: 1 } } } 186ms

stennie commented 6 years ago

@zhugw Can you confirm your specific versions of mloginfo and MongoDB server? Can you also include some additional log lines?

With mloginfo --queries (v1.5.1) I get output for your sample log line:

namespace

operation    pattern        count    min (ms)    max (ms)    mean (ms)    95%-ile (ms)    sum (ms)
()
foo.bar      update       {"guid": 1}        1         186         186          186           186.0    186
zhugw commented 6 years ago

@stennie

$ mloginfo --version
mtools version 1.3.2

$ mongo 127.0.0.1:27017
MongoDB shell version: 3.2.17

additional log lines

2018-10-09T17:46:33.481+0000 I WRITE    [conn2993640] update foo.bar query: { guid: "http://news.pconline.com.cn/1179/11796204.html_0" } update: { $set: { followers_count: 0, reposts_count: 0, comments_count: 0, praises_count: 0, read_count: 0 }, $setOnInsert: { page_url: "http://news.pconline.com.cn/1179/11796204.html", post_title: "
      LG Watch W7亮相!带有经典机械指针的智能手表
    ",  platform: "百度新闻",  text: "太平洋电脑网
                                                2018年10月04日11:55

此前就有外媒报道称,在今日LG的发布会上,除了LGV40ThinQ这款新一代旗?...",  text_length: 141 } } keysExamined:1 docsExamined:1 nMatched:1 nModified:0 keyUpdates:0 writeConflicts:0 numYields:1 locks:{ Global: { acquireCount: { r: 24, w: 24 } }, Database: { acquireCount: { w: 24 } }, Collection: { acquireCount: { w: 14 } }, Metadata: { acquireCount: { w: 10 } }, oplog: { acquireCount: { w: 10 } } } 183ms

2018-10-11T02:00:12.454+0000 I WRITE    [conn2993640] update foo.bar query: { guid: "http://zhidao.baidu.com/question/291826183.html_0" } update: { $set: { followers_count: 0, reposts_count: 0, comments_count: 3, praises_count: 71, read_count: 5425 }, $setOnInsert: { page_url: "http://zhidao.baidu.com/question/291826183.html", post_title: "什么品牌的抽油烟机又好又便宜又好用?", platform: "百度知道", text: "
格林格抽油烟机相比其它牌子还是挺不错的。我家里用的那款格林格抽油烟机就是当年我买的,真正免清洗的功...",  text_length: 85 } } keysExamined:1 docsExamined:1 nMatched:1 nModified:1 keyUpdates:0 writeConflicts:0 numYields:1 locks:{ Global: { acquireCount: { r: 82, w: 82 } }, Database: { acquireCount: { w: 82 } }, Collection: { acquireCount: { w: 46 } }, Metadata: { acquireCount: { w: 36 } }, oplog: { acquireCount: { w: 36 } } } 103ms

Does there are some special character(e.g.\n) in text field affect mloginfo ? e.g. actually the text field in mongodb

"text" : "\n\n\n格林格抽油烟机相比其它牌子还是挺不错的。我家里用的那款格林格抽油烟机就是当年我买的,真正免清洗的功能,到现在应该也用了3年左右了吧,基本上没有什么大毛病,价格挺实惠",

"text" : "中国钢铁新闻网  \t\t\t\t\n\t\t\t\t\t\t26分钟前\n\t\t\t\n日前,方大特钢打造的生态森林旅游式工厂进入冲刺阶段,各项目在加紧推进建设的同时,该公司员工充分发挥自己的聪明才智,精心构思,用灵巧的双手“扮靓”钢城。他们利用...\n\n\n\n\n"
stennie commented 6 years ago

@zhugw Log lines with embedded newlines (\n) aren't well handled as at mtools 1.5.1 so your mloginfo --queries output is unfortunate but expected. This is the same issue as #416, but we haven't had anyone keen to address it yet.

Log lines with embedded \n will be treated as multiple log lines but only the first line (which includes the leading timestamp) will match for extracting log details.

There are some assumptions about line-based parsing and matching in the current code that make this more involved than a quick fix, but if you are interested in fixing a PR would be welcome.

I'm going to close this as a dupe of #416.

Regards, Stennie

zhugw commented 6 years ago

@stennie Today I cloned the code and tried to debug the mloginfo.py tried to find where to parse the raw log file. But very frustrating, I even cannot start mloginfo.py it successfully . When I run it I got below error:

this tool can't parse input from stdin.

So how could run it successfully in pycharm?

Although I could run nosetests on test_mloginfo in terminal

 nosetests --tests=test_mloginfo

but in this case I cannot debug it