akardapolov / ASH-Viewer

ASH Viewer provides a graphical view of active session history data within the Oracle and PostgreSQL DB
GNU General Public License v3.0
167 stars 72 forks source link

Wait time can be -1 (WAITED SHORT TIME) or (WAITED KNOWN TIME) for a previous wait #13

Open Airell opened 6 years ago

Airell commented 6 years ago

https://github.com/akardapolov/ASH-Viewer/blob/b105b52d225ece2e7ae4ed273e8553f6e7460007/src/org/ash/database/Database10g11gSE.java#L349

I have a comment regarding WAIT_TIME which is retrieved in line 349.

Wait time can be -1, which means state = WAITED SHORT TIME (= less than 1 ms). Wait time can be any number, but the state = WAITED KNOWN TIME, which means 'Duration of the last wait is specified in the WAIT_TIME column'.

The first wait time -1 will trigger a 100% cpu for that session I think, which should be 1/100 of a second of cpu. The second wait shows the last wait time of a previous wait, but is still accounted for in a current cpu wait. If two sessions have these waits, they will both be reported as using cpu, but the are not doing anyting (much). I see this a lot for APEX_REST_PUBLIC_USER.

Session state should also be considered looking at these values: https://docs.oracle.com/cd/B28359_01/server.111/b28320/dynviews_3023.htm#r18c1-t30

Regards, Ian Hoogeboom.

akardapolov commented 6 years ago

Hi!

The first wait time -1 will trigger a 100% cpu for that session I think, which should be 1/100 of a second of cpu.

Propose to check Top Activity of EM and ASHV chart. I did it and algorithm of display data for ASHV was the same as in the EM.

Thanks, Alex.

Airell commented 6 years ago

wait_time

cpu_main

Hello Alex,

This output (wait_time.png) is directly from your query from the code. Even if these are idle events, they are picked up in line 305 of the Database10g11gSE.java as being CPU used. The -1 wait_time will be incorrectly interpreted as 100% and the 65467 wait time is not a wait time I think, the state show it's a previous wait time, not a current one. Als the Oracle doumentation states, look also at the state and preferably use WAIT_TIME_MICRO. "This column has been deprecated in favor of the columns WAIT_TIME_MICRO and STATE."

I did check with SQL Developer Instance Viewer and Toad. I don't have EM for testing this right now, but the query output, code check and the attached screenshot (cpu_main.png) of the main screen (12:15-12:20 selected) shows 'idle / no cpu' being translated to 100% cpu where it's not the case. Other DBA's from my company have recorded this glitch too.

Kind regards, Ian Hoogeboom.

On Wed, 1 Aug 2018 at 12:40, Alexander Kardapolov notifications@github.com wrote:

Hi!

The first wait time -1 will trigger a 100% cpu for that session I think, which should be 1/100 of a second of cpu.

Propose to check Top Activity of EM and ASHV chart. I did it and algorithm of display data for ASHV was the same as in the EM.

Thanks, Alex.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/akardapolov/ASH-Viewer/issues/13#issuecomment-409531905, or mute the thread https://github.com/notifications/unsubscribe-auth/AFn7KGiOUoUNVnvPj_0j6DqZs63umwTxks5uMYWOgaJpZM4VqM52 .

Airell commented 6 years ago

sql_developer_db_cpu_query

Attached the output fo the internal CPU query from SQL Developer and graph from ASH viewer. The 3 APEX_REST SID's (136, 308, 379) show almost an evenly distributed 95% of the cpu load in ASH Viewer. The DB CPU time from SQL Developer show very low CPU. To compare, the 309 SID has much higher CPU number in SQL Developer, but only 2.86% in ASH Viewer.

The database is almost in rest and no where near 8 cores of full cpu load;

top - 13:39:29 up 703 days, 14:28, 5 users, load average: 0.20, 0.29, 0.32 Tasks: 414 total, 1 running, 413 sleeping, 0 stopped, 0 zombie Cpu(s): 0.2%us, 0.2%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Regards, Ian Hoogeboom.

akardapolov commented 6 years ago

I see. The bug though.

Need to implement correction CPU used for WAIT_TIME = -1 as 1/100 of a second of cpu.. Am I right?

Thanks, Alex.

Airell commented 6 years ago

wait_time_vs_ash_viewer

Yes -1 is (less than) 1/100. This screenshot also shows. At least the -1 is a problem. I also think the (WAITED KNOWN TIME) state should not be reported. Session_state != 'WAITED KNOWN TIME' ?

There is also a -2 possible (not time wait available).

Kind regards, Ian Hoogeboom.