cloudfoundry-incubator / admin-ui

Need new main contributor - An application for viewing Cloud Foundry metrics and operations data.
Apache License 2.0

Problem accessing data #189

Closed Akhilesh-Anb closed 5 years ago

Akhilesh-Anb commented 5 years ago

Hi,

I'm getting partial data in most of the tabs. Sometimes the data appears after a refresh, and sometimes it is missing.

For the Events tab I'm getting the following error: This page requires data from services that are currently unavailable

Please find my logs:

D, [2019-08-26T11:17:04.870829 #16] DEBUG -- : [ -- ] : [ -- ] : Select for key space_quota_definitions, table space_quota_definitions: SELECT `app_instance_limit`, `app_task_limit`, `created_at`, `guid`, `id`, `instance_memory_limit`, `memory_limit`, `name`, `non_basic_services_allowed`, `organization_id`, `total_reserved_route_ports`, `total_routes`, `total_services`, `total_service_keys`, `updated_at` FROM `space_quota_definitions`
D, [2019-08-26T11:17:04.871373 #16] DEBUG -- : [ -- ] : [ -- ] : Columns removed for key space_quota_definitions, table space_quota_definitions: []
D, [2019-08-26T11:17:04.871929 #16] DEBUG -- : [ -- ] : [ -- ] : Columns available, but not consumed for key space_quota_definitions, table space_quota_definitions: []
D, [2019-08-26T11:17:04.874783 #16] DEBUG -- : [ -- ] : [ -- ] : Caching CC space_quota_definitions data. Count: 2. Retrieval time: 1.085035405 seconds
D, [2019-08-26T11:17:04.874963 #16] DEBUG -- : [ -- ] : [ -- ] : [300 second interval] Starting CC spaces discovery...
D, [2019-08-26T11:17:04.882892 #16] DEBUG -- : [ -- ] : [ -- ] : Caching view model service_brokers data. Compilation time: 6.642494777 seconds
D, [2019-08-26T11:17:04.883510 #16] DEBUG -- : [ -- ] : [ -- ] : [150 second interval] Starting view model service_instances discovery...
D, [2019-08-26T11:17:04.997304 #16] DEBUG -- : [ -- ] : [ -- ] : Select for key spaces, table spaces: SELECT `allow_ssh`, `created_at`, `guid`, `id`, `isolation_segment_guid`, `name`, `organization_id`, `space_quota_definition_id`, `updated_at` FROM `spaces`
D, [2019-08-26T11:17:04.997743 #16] DEBUG -- : [ -- ] : [ -- ] : Columns removed for key spaces, table spaces: []
D, [2019-08-26T11:17:04.997792 #16] DEBUG -- : [ -- ] : [ -- ] : Columns available, but not consumed for key spaces, table spaces: []
I, [2019-08-26T11:17:05.563012 #16]  INFO -- : [ admin ] : [ get ] : /logs_view_model
D, [2019-08-26T11:17:05.597970 #16] DEBUG -- : [ -- ] : [ -- ] : Caching CC spaces data. Count: 421. Retrieval time: 0.722906135 seconds
D, [2019-08-26T11:17:05.608240 #16] DEBUG -- : [ -- ] : [ -- ] : [300 second interval] Starting CC spaces_auditors discovery...
D, [2019-08-26T11:17:05.627998 #16] DEBUG -- : [ -- ] : [ -- ] : Select for key spaces_auditors, table spaces_auditors: SELECT `spaces_auditors_pk`, `space_id`, `user_id` FROM `spaces_auditors`
D, [2019-08-26T11:17:05.628552 #16] DEBUG -- : [ -- ] : [ -- ] : Columns removed for key spaces_auditors, table spaces_auditors: []
D, [2019-08-26T11:17:05.628596 #16] DEBUG -- : [ -- ] : [ -- ] : Columns available, but not consumed for key spaces_auditors, table spaces_auditors: []

W, [2019-08-26T11:18:54.128271 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:d0b0390c-9e9f-4599-89e6-344e1cc79f55:10.xx.xxx.136 is not responding, its status will be checked again next refresh
W, [2019-08-26T11:18:54.129630 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:dc4353f2-2086-489e-bd6d-126de32cb0d4:10.xx.xxx.18 is not responding, its status will be checked again next refresh
W, [2019-08-26T11:18:54.130577 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:8a567467-ebdf-4e88-b423-f2a6ba9c6263:10.xx.xxx.30 is not responding, its status will be checked again next refresh
W, [2019-08-26T11:18:54.130691 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:80c785c9-ff8a-41ea-be4c-619864ee4a87:10.xx.xxx.43 is not responding, its status will be checked again next refresh
W, [2019-08-26T11:18:54.130767 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:2d9cff1d-5381-4d7f-825b-04c7eede6846:10.xx.xxx.45 is not responding, its status will be checked again next refresh
W, [2019-08-26T11:18:54.130834 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:1d4148d7-fd72-408c-bdee-79166b43d6bc:10.xx.xxx.85 is not responding, its status will be checked again next refresh

I have provided the proper credentials for NATS, UAADB and CCDB, but I'm not sure why I'm still getting these issues. Could you please help me with this?

Thanks, Akhilesh Appana

rboykin commented 5 years ago

@Akhilesh-Anb I don't see anything in your logfile snippet regarding the Events tab. I see warnings for the grootfs component. This likely means that the doppler firehose client was able to get these at one time, but subsequently they were no longer available. This grootfs component will only be seen in the Components tab. Are you sure you are seeing problems in the events tab?

If the doppler-reported components are not being found, it is possible that the admin-ui-client you created is no longer able to access the doppler firehose or the configuration value doppler_rollup_interval should be increased.

Akhilesh-Anb commented 5 years ago

Hi,

I tried increasing the value of doppler_rollup_interval, but it didn't help. After repushing the application, I'm not able to get the data in other tabs as well. I'm getting data only under App Instances, Quotas, Buildpacks and Domains.

image

Getting partial data under organizations tab: image

Log:

W, [2019-08-27T05:32:31.614854 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:e926d44c-97c2-4763-9bbc-525c1765aa47:10.xx.xxx.62 is still not responding
W, [2019-08-27T05:32:31.614978 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:4979e185-2464-4f9a-8b2b-fbdc6d026fcc:10.xx.xxx.156 has been recognized as disconnected
W, [2019-08-27T05:32:31.615047 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:16f919bd-29a8-412a-99df-38821d3f49aa:10.xx.xxx.119 has been recognized as disconnected
W, [2019-08-27T05:32:31.615408 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:b0a83248-b821-4785-bd4a-c30cd1034420:10.xx.xxx.68 has been recognized as disconnected
W, [2019-08-27T05:32:31.615883 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:c424d550-cf61-4ae3-8213-de6ffaae2df9:10.xx.xxx.41 is not responding, its status will be checked again next refresh
W, [2019-08-27T05:51:42.357612 #16]  WARN -- : [ -- ] : [ -- ] : The loggregator.metron component loggregator.metron:cb576902-18ec-4289-ad05-455994dd5226:10.xx.xxx.142 has been recognized as disconnected
W, [2019-08-27T05:51:42.357732 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:9825eb4a-2de2-43c2-ba37-1a683827911b:10.92.205.50 has been recognized as disconnected
W, [2019-08-27T05:51:42.357797 #16]  WARN -- : [ -- ] : [ -- ] : The loggregator.metron component loggregator.metron:51595c33-31aa-4a95-b990-2baed67f2a29:10.xx.xxx.127 has been recognized as disconnected
W, [2019-08-27T05:51:42.357857 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:2f7fad19-bcfd-48c4-9601-cbd5957c1c16:10.xx.xxx.42 has been recognized as disconnected
W, [2019-08-27T05:51:42.357917 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:eb0fb025-7c43-4a39-8e29-1a475726f388:10.xx.xxx.70 has been recognized as disconnected
W, [2019-08-27T05:51:42.357977 #16]  WARN -- : [ -- ] : [ -- ] : The bosh-system-metrics-forwarder component bosh-system-metrics-forwarder:a6a321f4-b468-4251-b1ef-01d90bc955d7: is not responding, its status will be checked again next refresh

I searched for these components under the Components tab, and I see that all of them are in the running state.

rboykin commented 5 years ago

@Akhilesh-Anb

Thanks for the screen shots.

What level of CF are you running? You can see this at the top of the admin ui screen. Unfortunately, it is just cut off in your screenshots.

Interesting that there are no errors in your admin-ui.log, just warnings.

For the Apps tab, the code requires CCDB.apps, CCDB.droplets and CCDB.packages in order to show anything. Other tables will be used as available. This makes me wonder if your connection to the CCDB is being limited somehow. Perhaps the number of connections is being limited on the CCDB server side.

For the Orgs tab, the code requires CCDB.organizations. Other tables will be used as available.
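
The table requirements described above can be pictured with a minimal Python sketch. This is not the admin-ui code; the `REQUIRED_TABLES` mapping and the `missing_tables` helper are hypothetical names, and only the table lists come from this comment.

```python
# Required CCDB tables per tab, as described in the comment above.
# The mapping and helper are illustrative only, not admin-ui internals.
REQUIRED_TABLES = {
    "Apps": {"apps", "droplets", "packages"},
    "Orgs": {"organizations"},
}


def missing_tables(tab, cached_tables):
    """Return the required tables for a tab that have not been cached yet."""
    return sorted(REQUIRED_TABLES[tab] - set(cached_tables))


# Hypothetical snapshot of which tables have been retrieved so far.
cached = {"apps", "organizations", "spaces"}
print(missing_tables("Apps", cached))  # ['droplets', 'packages']
print(missing_tables("Orgs", cached))  # []
```

Under this reading, a tab stays empty until every one of its required tables has been retrieved at least once, which would be consistent with tabs filling in at different times.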

I have also seen time differences cause the WARNs you show above: the doppler bosh VM was running with a time in the past while the admin ui was running locally with the current time. The doppler-reading logic in the admin ui depends on time sync between the admin ui server and the doppler_logging_endpoint.

Akhilesh-Anb commented 5 years ago

Hi,

Please find the version below: image

After accessing the application, I don't see data in most tabs, and only partial data in a few tabs. But if I leave the application open for 1 or 2 hours, I see that complete data is available in all the tabs.

When I refresh the application again, all the data is missing.

Please check the screenshot below. I was getting partial data before for the Organizations tab. Now, when I leave the application open for 1 or 2 hours, I can see the data as shown below.

image

NOTE: Whenever I refresh the application, my data is missing. It's taking a lot of time to fetch the data.

Please let me know if there is something I can do to solve this issue.

rboykin commented 5 years ago

@Akhilesh-Anb

Check the following configuration values:

cloud_controller_discovery_interval
doppler_rollup_interval
nats_discovery_interval
varz_discovery_interval

These control how often the admin ui tries to access data. If you have configured these to large values, you could get a 1-2 hour lag.

These config items are documented in the README: https://github.com/cloudfoundry-incubator/admin-ui/blob/master/README.md
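
As a rough way to spot an overly large setting, here is a minimal Python sketch that flags any of the four intervals above a chosen limit. It assumes the values are plain numbers of seconds already loaded into a dict; `find_long_intervals`, the `max_seconds` threshold and the example values are hypothetical and not part of admin-ui.

```python
# Interval settings named in this thread; values are assumed to be seconds.
INTERVAL_KEYS = (
    "cloud_controller_discovery_interval",
    "doppler_rollup_interval",
    "nats_discovery_interval",
    "varz_discovery_interval",
)


def find_long_intervals(config, max_seconds=600):
    """Return {key: seconds} for interval settings above max_seconds.

    max_seconds is an arbitrary illustrative threshold, not an admin-ui default.
    Keys missing from config are skipped, since the defaults then apply.
    """
    return {
        key: config[key]
        for key in INTERVAL_KEYS
        if key in config and config[key] > max_seconds
    }


# Hypothetical values, chosen only to illustrate the check.
example = {
    "cloud_controller_discovery_interval": 3600,
    "doppler_rollup_interval": 30,
}
print(find_long_intervals(example))
```

With these made-up values, only cloud_controller_discovery_interval is reported, because an hour between discovery runs could by itself explain a lag of that size.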

Also, note that the admin ui retrieves the data from the backend CCDB, UAADB, NATS, Varz, and Doppler firehose and combines it prior to making that combined data available to the UI. If you have a lot of data to be retrieved, it can take some time. If you have something like a local bosh-lite, it can be very quick.

Note that if, by refreshing the application, you mean doing a cf push, this restarts the application and starts the backend data retrieval anew.

Akhilesh-Anb commented 5 years ago

Hi,

I used the default configuration and pushed it.

I mean refreshing the application in Chrome, not repushing it. Ideally, it takes time to retrieve the data the first time after a push, and after that it should provide the results.

But in my case, without repushing, when I just refresh the application in Chrome, my data is missing in a few tabs. When I leave the application open in Chrome for 1 hour, I get all the data. Ideally this shouldn't happen, but this is what is happening to me.

Latest screenshot: I can see that the doppler items are in the offline state. image

Thanks, Akhilesh Appana

rboykin commented 5 years ago

@Akhilesh-Anb

For the offline doppler items, you might increase the doppler_rollup_interval. If the admin ui is not re-notified of the component within doppler_rollup_interval*4 seconds, then the component is considered offline. The default doppler_rollup_interval is 30 seconds. In my bosh-lite test environment, I changed the doppler_rollup_interval to 60 seconds. I also changed it to a larger value in a production environment.
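
To make the arithmetic concrete, here is a small Python sketch of the offline rule as described above. It is not the admin-ui implementation: the function names are hypothetical, the 30-second default is taken to be that of doppler_rollup_interval, and treating "exactly at the limit" as still online is an assumption.

```python
from datetime import datetime, timedelta


def offline_after_seconds(doppler_rollup_interval=30):
    """Seconds without a re-notification before a component counts as offline."""
    return doppler_rollup_interval * 4


def is_offline(last_seen, now, doppler_rollup_interval=30):
    """True when the component was not re-notified within interval * 4 seconds."""
    elapsed = (now - last_seen).total_seconds()
    return elapsed > offline_after_seconds(doppler_rollup_interval)


# Hypothetical component last reported 150 seconds ago.
now = datetime(2019, 8, 27, 5, 32, 31)
last_seen = now - timedelta(seconds=150)

print(offline_after_seconds(30))       # 120
print(offline_after_seconds(60))       # 240
print(is_offline(last_seen, now, 30))  # True
print(is_offline(last_seen, now, 60))  # False
```

So a component that reports only every 150 seconds would show as offline with the 30-second setting but stay online with 60 seconds.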

Regarding the other empty tabs, all I can suggest is to look for errors in the admin-ui.log or in the web browser itself. If the CCDB/UAADB access is slow or limited, then that could cause the problems in the other tabs.
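
One quick way to scan the log is to count lines per severity. The Python sketch below assumes the log lines look like the ones pasted earlier in this thread; the regular expression and function names are illustrative, and the sample lines are copied from the snippets above.

```python
import re
from collections import Counter

# Matches lines shaped like those pasted in this thread, e.g.
# "W, [2019-08-27T05:32:31.614854 #16]  WARN -- : [ -- ] : [ -- ] : message"
LINE_RE = re.compile(r"^[A-Z], \[\S+ #\d+\]\s+([A-Z]+) -- : (.*)$")


def count_severities(lines):
    """Count log lines per severity label (DEBUG, INFO, WARN, ERROR, ...)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def lines_with_severity(lines, wanted=("ERROR", "FATAL")):
    """Return the lines whose severity label is one of the wanted labels."""
    selected = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(1) in wanted:
            selected.append(line.strip())
    return selected


# Sample lines copied from the log snippets earlier in this thread.
sample = [
    "I, [2019-08-26T11:17:05.563012 #16]  INFO -- : [ admin ] : [ get ] : /logs_view_model",
    "D, [2019-08-26T11:17:04.997743 #16] DEBUG -- : [ -- ] : [ -- ] : Columns removed for key spaces, table spaces: []",
    "W, [2019-08-27T05:32:31.614854 #16]  WARN -- : [ -- ] : [ -- ] : The grootfs component grootfs:e926d44c-97c2-4763-9bbc-525c1765aa47:10.xx.xxx.62 is still not responding",
]
counts = count_severities(sample)
errors = lines_with_severity(sample)
print(dict(counts))
print(errors)
```

To run it against a real file, pass an open file object instead of the sample list, for example `count_severities(open("admin-ui.log"))`, adjusting the path to wherever the log is written.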

There are also a bunch of tabs that likely won't have any records unless you use some of their resources in your CF environment.

rboykin commented 5 years ago

@Akhilesh-Anb

If you are running this as a CF app, you also might increase your memory allocation for the admin-ui app.

Akhilesh-Anb commented 5 years ago

Hi,

I tried increasing the memory. I'm getting the data now, except in the Events tab. I don't see any errors in the log related to the Events tab.

rboykin commented 5 years ago

@Akhilesh-Anb There are so many events in a CF system of any significant size. The default admin configuration limits these to the last 7 days: https://github.com/cloudfoundry-incubator/admin-ui/blob/master/config/default.yml#L14.

Try fewer than 7 days and see if this solves your problem. It is also possible that the events retrieval is just taking longer due to the magnitude of the records.

rboykin commented 5 years ago

I am closing this due to the memory increase largely solving your problem.