leifwalsh opened this issue 11 years ago
Under "hardware", I see nothing but I'm not running munin
"last ping" looks good, "daily ping" is empty, don't know what that is
haven't tried profile data or logs; I assume profiling would work as well as normal profiling does (which I haven't tried yet), and logging should be fine, probably not worth looking into yet
actually "db storage" appears to show something useful
also need to try this with a replica set and a sharded cluster
"db stats" tab looks mostly fine
looks like all this info comes from 'serverStatus' so we should just add to that whatever we want to display
"page faults" normally shows when mongo has to go to disk to get the data it needs to fulfill a query
For us this would be ft fetches (but we can break it down a little further than that).
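Since the agent just reads counters out of serverStatus, exposing something like ft fetches mostly means publishing the right nested field. A minimal sketch of the agent side, reading a nested counter by dotted path; the `ft`/`fetches` field names here are hypothetical illustrations, not the actual TokuMX serverStatus layout:

```python
# Sketch: pull a nested counter out of a serverStatus-style document.
# The "ft.fetches" path below is a hypothetical example, not the real
# TokuMX serverStatus schema.

def get_metric(status, path, default=0):
    """Walk a dotted path like 'ft.fetches' through nested dicts."""
    node = status
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

# A fake serverStatus document for illustration.
status = {
    "ok": 1,
    "extra_info": {"page_faults": 12},
    "ft": {"fetches": 340},  # hypothetical ft-fetch counter
}

print(get_metric(status, "extra_info.page_faults"))  # 12
print(get_metric(status, "ft.fetches"))              # 340
print(get_metric(status, "missing.section"))         # 0 (default)
```

Anything we add under serverStatus in this shape would be picked up by a lookup like this without schema changes on the agent.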
Cheers, Leif
Reply to this email directly or view it on GitHub: https://github.com/Tokutek/mongo/issues/510#issuecomment-24209640
"db storage" is just how much disk space is used on disk, I am sure you have this info.
All the other comments make sense to me.
Yeah, I think I saw db storage working eventually, it just looked like it was blank at first because I had just started up mms
Cheers, Leif
Sometimes MMS is very slow to refresh when TokuMX is under about 5k ops (though the load on the server is not that heavy). Not sure whether it's caused by the MMS agent or by TokuMX.
How often does MMS typically refresh? How slow is it on such a TokuMX instance?
Cheers, Leif
An unrelated note that should go here: we've noticed a small discrepancy between the internal performance stats reported by TokuMX and by MongoDB; specifically, in the 'start' and 'end' values of the 'oplog' section of an MMS agent ping:
...,
"oplog": {
  "start": {"$date": "[ISO timestamp]"},
  "rsStats": { ... },
  "end": {"$date": "[ISO timestamp]"}
},
...
The discrepancy is that MongoDB reports these values as BSON Timestamps instead of BSON Dates. For reference, the start/end values are populated by these two Python lines of our freely available MMS agent (blockingStats.py:224):
oplogStats["start"] = localConn[oplog].find(limit=1, sort=[("$natural", pymongo.ASCENDING)], fields={'ts': 1})[0]["ts"]
oplogStats["end"] = localConn[oplog].find(limit=1, sort=[("$natural", pymongo.DESCENDING)], fields={'ts': 1})[0]["ts"]
Given that, I think we should either change how we display these types to look like what MMS expects, while trying to preserve the information we're actually presenting (which is just advisory estimates anyway, right?), or modify the MMS agent to interpret these values differently. For option 2, maybe we can package our own agent that still works with mms.mongodb.org, or maybe we'd need MongoDB Inc.'s help and changes on the web server side. Do you know what MMS does with these values? Would it still be meaningful for us to try to provide something for them?
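For option 2, the agent-side change would essentially be normalizing the Timestamp into a wall-clock Date before reporting it. A minimal sketch under the assumption that a BSON Timestamp is a (seconds, increment) pair; we model that pair with a namedtuple here rather than depending on pymongo's `bson.Timestamp`:

```python
# Sketch of option 2: normalize a BSON-Timestamp-like value into a plain
# UTC datetime (the shape MMS renders as a $date). The Timestamp tuple
# below is a stand-in for bson.Timestamp, which stores the same pair.
from collections import namedtuple
from datetime import datetime, timezone

Timestamp = namedtuple("Timestamp", ["time", "inc"])

def timestamp_to_date(ts):
    """Drop the increment and keep only the wall-clock seconds."""
    return datetime.fromtimestamp(ts.time, tz=timezone.utc)

# A timestamp from around the time of this thread (Sep 2013).
ts = Timestamp(time=1379372627, inc=3)
print(timestamp_to_date(ts).isoformat())
```

The increment is lost in the conversion, which seems acceptable if MMS only uses these values as advisory estimates of oplog coverage.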
Cheers, Leif
It tries to refresh every minute, but sometimes it takes a couple of minutes.
Ok. I don't think I saw this when I tried it with cortisol (which generates a very high load). Would you be able to share your workload?
Cheers, Leif
5k updates per second, ~10 connections.
It might be worth documenting which metrics we should pay attention to. We use Diamond and Graphite to monitor TokuMX. If we knew what to watch out for, it would help on the operational side and help identify issues by letting us build equivalent dashboards inside Graphite.
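For a Graphite setup like that, the collector just has to flatten serverStatus into Graphite's plaintext protocol ("path value timestamp" lines). A minimal sketch, with hypothetical metric names; Diamond does this kind of flattening for real:

```python
# Sketch: flatten a serverStatus-style dict into Graphite plaintext
# protocol lines ("path value timestamp"). The "tokumx" prefix and the
# sample document are illustrative, not a real collector config.

def flatten(prefix, doc, ts, out=None):
    if out is None:
        out = []
    for key, val in sorted(doc.items()):
        path = "%s.%s" % (prefix, key)
        if isinstance(val, dict):
            flatten(path, val, ts, out)       # recurse into subsections
        elif isinstance(val, (int, float)):
            out.append("%s %s %d" % (path, val, ts))
    return out

status = {
    "opcounters": {"insert": 10, "update": 5000},
    "connections": {"current": 10},
}
for line in flatten("tokumx", status, 1379372627):
    print(line)
# tokumx.connections.current 10 1379372627
# tokumx.opcounters.insert 10 1379372627
# tokumx.opcounters.update 5000 1379372627
```

Each line could then be written to Graphite's plaintext port; which serverStatus sections are most worth graphing is exactly what the documentation should spell out.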
@byzhang we believe there were some issues with MMS doing queries for the beginning of the oplog that would take a long time if there were a bunch of deletes from trimming. In 1.4, using a partitioned oplog instead of a trimmer should have fixed this problem completely.
@ankurcha the best stuff to monitor is in db.serverStatus() and is documented in the User's Guide, but we'll work on improving the documentation of how to monitor TokuMX effectively and update this ticket.