Open funes79 opened 6 years ago
Hi @funes79,
do you have some code to show?
Running the query for expensive queries in the last 7 days returns just 2 records:
from datetime import datetime, timedelta
from cm_api.api_client import ApiResource

# cm_host holds the Cloudera Manager hostname
api = ApiResource(cm_host, username="reader", password="cmreader", version=18)
c = api.get_all_clusters()[0]
# pick the Impala service of the first cluster
for s in c.get_all_services():
    if s.type == 'IMPALA':
        impala = s

now = datetime.utcnow()
daysback = 7
start = now - timedelta(days=daysback)
end = now
print('> Scanning last %s days, from %s till %s ' % ( daysback, start, end) )
filterStr = 'memory_aggregate_peak >= 60GB'
# single call covering the whole 7-day range
queries = impala.get_impala_queries(start_time=start, end_time=end, filter_str=filterStr, limit=1000, offset=0)
for query in queries.queries:
    print '> queyrid = '+query.queryId
Result:
Scanning last 7 days, from 2018-04-14 19:14:30.128989 till 2018-04-21 19:14:30.128989
queyrid = 4b4732a957d53eff:36aed14500000000
queyrid = 284a368120d0f55f:10950c0f00000000
But looping through and calling the get_impala_queries for shorter intervals returns more results:
# one call per 24-hour window; xrange(1,7) covers the 6 most recent windows
for i in xrange(1,7):
    start = now - timedelta(days=i)
    end = now - timedelta(days=i-1)
    # note: this prints daysback (7), not i, so every line reads "7 days ago"
    print('> Scanning %s days ago, from %s till %s ' % ( daysback, start, end) )
    filterStr = 'memory_aggregate_peak >= 60GB'
    queries = impala.get_impala_queries(start_time=start, end_time=end, filter_str=filterStr, limit=1000, offset=0)
    for query in queries.queries:
        print '> queryid = '+query.queryId
Result:
Scanning 7 days ago, from 2018-04-20 19:14:30.128989 till 2018-04-21 19:14:30.128989
Scanning 7 days ago, from 2018-04-19 19:14:30.128989 till 2018-04-20 19:14:30.128989
queryid = 4b4732a957d53eff:36aed14500000000
queryid = 284a368120d0f55f:10950c0f00000000
Scanning 7 days ago, from 2018-04-18 19:14:30.128989 till 2018-04-19 19:14:30.128989
queryid = f4a9f149c0fc14d:520bc9ec00000000
queryid = 6a4b66e3cb8f1778:32ed8d7d00000000
queryid = bb488109a374db59:4eb2f8f700000000
queryid = b4963e9b6f9d1ea:9f55733800000000
.... and many more
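The per-day workaround above can be wrapped in a small helper. This is only a sketch and not part of cm_api: `split_windows` and `collect_queries` are hypothetical names, and `fetch` stands for any callable that wraps `impala.get_impala_queries` and returns a list of objects with a `queryId` attribute. Results are de-duplicated by `queryId`, on the assumption that a query might be reported in two adjacent windows.

```python
from datetime import datetime, timedelta


def split_windows(start, end, step):
    """Yield (window_start, window_end) pairs that cover start..end."""
    cur = start
    while cur < end:
        nxt = min(cur + step, end)  # clamp the last window to `end`
        yield cur, nxt
        cur = nxt


def collect_queries(fetch, start, end, step=timedelta(days=1)):
    """Call fetch(window_start, window_end) once per window.

    `fetch` is a hypothetical wrapper around get_impala_queries;
    results are keyed by queryId so duplicates are kept only once.
    """
    seen = {}
    for w_start, w_end in split_windows(start, end, step):
        for q in fetch(w_start, w_end):
            seen.setdefault(q.queryId, q)
    return list(seen.values())
```

With the objects from the snippets above, `fetch` could be something like `lambda s, e: impala.get_impala_queries(start_time=s, end_time=e, filter_str=filterStr, limit=1000, offset=0).queries`.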
Hi @funes79, OK, I get it. I will try it tomorrow.
Hi @funes79,
I tried running it and it executes normally. This is my code:
import ...

def impala_query(cluster):
    end = datetime.now()
    start = end - timedelta(days=7)
    print start, end
    for s in cluster.get_all_services():
        if s.type == 'IMPALA':
            impala = s
    q = impala.get_impala_queries(start_time=start, end_time=end, filter_str='database=xxx')
    for i in q.queries:
        print i.queryId

if __name__ == '__main__':
    try:
        cm_host = 'xx'
        api = ApiResource(cm_host, username='reader', password='cmreader', version=6)
        clusterName = api.get_all_clusters()[0]
        impala_query(clusterName)
Couldn't it be related to the fact that the filtering takes some time? In the GUI the first two rows appear immediately and then the others are fetched. So I think that if the GUI uses Solr or some other kind of index, the results are not all available immediately, but arrive in several steps.
This is typical for wiki search, where you get the first set of results along with a token, and with that token you continue to fetch more results.
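If the API does hand results back in steps, the `limit` and `offset` arguments already passed to `get_impala_queries` would be the natural way to ask for the next batch. A minimal sketch of that idea, assuming (not verified against Cloudera Manager) that an empty page means nothing is left; `fetch_all` and `fetch_page` are hypothetical names, with `fetch_page(limit, offset)` wrapping the real call and returning a list of queries:

```python
def fetch_all(fetch_page, limit=100, max_pages=1000):
    """Keep requesting pages at increasing offsets until one comes back empty.

    fetch_page(limit, offset) is a hypothetical wrapper, e.g. around
    impala.get_impala_queries(..., limit=limit, offset=offset).queries
    """
    results = []
    offset = 0
    for _ in range(max_pages):  # guard against an endless loop
        page = list(fetch_page(limit, offset))
        if not page:
            break
        results.extend(page)
        offset += len(page)  # advance by what was actually returned
    return results
```

The loop advances by the number of rows actually returned rather than by `limit`, so it would still make progress if the server answers with short pages, as in the 2-row response above. It may also be worth printing `queries.warnings` on the response, which I believe cm_api exposes next to `queries`, to see whether the server reports that it stopped scanning early.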
When I run get_impala_queries from Python it returns just 2 records, even if I use the same date range and filter as in the CM UI. When I filter in the CM UI, the first two records appear immediately, and then after a second the rest of the queries show up. Is it possible to get the next results somehow?