chef / automate

Chef Automate provides a full suite of enterprise capabilities for maintaining continuous visibility into application, infrastructure, and security automation.
https://automate.chef.io/
Apache License 2.0

The reporting list-nodes api endpoint is throwing GRPC errors well before reaching the 10k nodes-per-page limit #5040

Open chef-davin opened 3 years ago

chef-davin commented 3 years ago

Describe the bug

The API documentation at https://docs.chef.io/automate/api/#operation/ReportingService_ListNodes says that a query can return up to 10,000 nodes per page. However, we're seeing the following error when requesting more than roughly 5,000 to 7,000 nodes (the exact threshold depends on the environment):

> /usr/bin/curl -s -H "api-token: XXXXXXXXX" https://automate.example.com/api/v0/compliance/reporting/nodes/search -H "Content-Type: application/json" -X POST --data '{"filters": [{"type": "","values": [""]}],"id": "","order": "ASC","page": 1,"per_page": 8000,"sort": "","type": ""}'

{"error":"grpc: received message larger than max (4471820 vs. 4194304)","code":8,"message":"grpc: received message larger than max (4471820 vs. 4194304)","details":[]}
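The two numbers in the error are byte counts: the response was 4,471,820 bytes, and the limit of 4,194,304 bytes is exactly 4 MiB, which is the default maximum receive message size in gRPC's Go implementation. As a rough back-of-the-envelope sketch, assuming the rejected response really contained all 8,000 requested nodes and that node records are of similar size (neither is confirmed here), the number of nodes that would fit under the limit can be estimated like this:

```python
# Figures taken from the error message and the failing request.
RESPONSE_BYTES = 4_471_820   # size of the rejected response
LIMIT_BYTES = 4_194_304      # limit reported by gRPC
REQUESTED_NODES = 8_000      # per_page in the failing request (assumed fully populated)


def estimate_max_per_page(response_bytes: int, nodes: int, limit_bytes: int) -> int:
    """Estimate how many nodes fit under the limit, assuming uniform node size."""
    # Integer arithmetic: nodes * limit / response, rounded down.
    return nodes * limit_bytes // response_bytes


if __name__ == "__main__":
    assert LIMIT_BYTES == 4 * 1024 * 1024  # the limit is exactly 4 MiB
    print("overshoot (bytes):", RESPONSE_BYTES - LIMIT_BYTES)
    print("average bytes per node:", RESPONSE_BYTES // REQUESTED_NODES)
    print(
        "estimated max per_page:",
        estimate_max_per_page(RESPONSE_BYTES, REQUESTED_NODES, LIMIT_BYTES),
    )
```

The estimate is only as good as the uniform-size assumption. Environments whose nodes carry more profiles or tags would have larger records and would hit the limit at a smaller page size.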

To Reproduce

Steps to reproduce the behavior:

Note that the Automate server must actually hold as many compliance scan results as you are trying to pull. If you set the page size to 10,000 nodes per page but only have 1 node, this error won't come up. You need compliance scan results for 10k nodes on your Automate server to replicate the error (hence using chef-load).

Expected behavior

This should return a JSON list of node objects with their compliance reporting status. It would look something like this (though with many more than one node reported):

> /usr/bin/curl -s -H "api-token: XXXXXXXXXXXXXXXXXXX" https://automate.example.com/api/v0/compliance/reporting/nodes/search -H "Content-Type: application/json" -X POST --data '{"filters": [{"type": "","values": [""]}],"id": "","order": "ASC","page": 1,"per_page": 10000,"sort": "","type": ""}'

{"nodes":[{"id":"d4dba4ee-feff-4f43-8838-ef922da33e64","name":"automate-server","platform":{"name":"centos","release":"7.7.1908","full":"centos 7.7.1908"},"environment":"automate_server","latest_report":{"id":"9666917c-3d48-4795-9167-5f138d588b91","end_time":"2021-05-04T19:06:48Z","status":"failed","controls":{"total":217,"passed":{"total":141},"skipped":{"total":1},"failed":{"total":73,"minor":1,"major":0,"critical":72},"waived":{"total":2}}},"tags":[],"profiles":[{"name":"cis-centos7-level1","version":"1.1.0-5","id":"383a52cfd11f1cf1e9ee2e54ad7b05a810e20e903d550d8f9e15550b5cb6464b","status":"failed","full":"CIS CentOS Linux 7 Benchmark Level 1, v1.1.0-5"},{"name":"linux-baseline","version":"2.2.2","id":"477f53f8f6867a0f1abe7c199a569d00c9d38d8f2a8b85dbb9cc361ca435a2b6","status":"failed","full":"DevSec Linux Security Baseline, v2.2.2"}]}],"total":1,"total_passed":0,"total_failed":1,"total_skipped":0,"total_waived":0}

If responses to API queries are currently limited to 4 MB, it would be nice if that limit were a configurable setting, so that users could adjust it to suit their needs when querying the API.
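Until such a setting exists, one possible client-side workaround is to request smaller pages and merge them. The sketch below is an illustration under stated assumptions, not a documented client: `list_all_nodes` and `make_http_fetcher` are hypothetical helper names, the default `per_page` of 1000 is an arbitrary value chosen to stay well below the sizes that failed, and the loop assumes the endpoint returns a short or empty `nodes` list once the last page has been reached. The request body mirrors the curl command above.

```python
import json
import urllib.request
from typing import Callable, Dict, List


def list_all_nodes(fetch_page: Callable[[int, int], Dict], per_page: int = 1000) -> List[Dict]:
    """Collect nodes page by page until a short (or empty) page is returned."""
    nodes: List[Dict] = []
    page = 1
    while True:
        batch = fetch_page(page, per_page).get("nodes", [])
        nodes.extend(batch)
        if len(batch) < per_page:  # assumed to mean the last page was reached
            return nodes
        page += 1


def make_http_fetcher(base_url: str, api_token: str) -> Callable[[int, int], Dict]:
    """Build a fetch_page callable for the reporting nodes search endpoint."""
    url = base_url.rstrip("/") + "/api/v0/compliance/reporting/nodes/search"

    def fetch_page(page: int, per_page: int) -> Dict:
        # Same body as the curl example, with page and per_page filled in.
        body = {
            "filters": [{"type": "", "values": [""]}],
            "id": "",
            "order": "ASC",
            "page": page,
            "per_page": per_page,
            "sort": "",
            "type": "",
        }
        request = urllib.request.Request(
            url,
            data=json.dumps(body).encode("utf-8"),
            headers={"api-token": api_token, "Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return fetch_page


# Example usage (placeholder host and token, as in the curl command):
# fetch = make_http_fetcher("https://automate.example.com", "XXXXXXXXX")
# all_nodes = list_all_nodes(fetch, per_page=1000)
```

Passing the fetcher in as a callable keeps the paging loop independent of the HTTP layer, so it can be exercised against canned data before pointing it at a real server.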

Versions (please complete the following information):

Aha! Link: https://chef.aha.io/epics/SH-E-508

timdsmith72 commented 3 years ago

I'm running into this too with this API call:
/api/v0/cfgmgmt/nodes?pagination.page=1&pagination.size=4000
Any pagination.size over ~3000 results in the same gRPC error.

Version: 2 Build: 20210504084406
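The same smaller-pages approach would presumably apply to this endpoint. A minimal sketch of building the paged URLs, using only the `pagination.page` and `pagination.size` query parameters shown in the call above; the helper name `cfgmgmt_nodes_url` and the default size of 1000 are illustrative assumptions:

```python
from urllib.parse import urlencode


def cfgmgmt_nodes_url(base_url: str, page: int, size: int = 1000) -> str:
    """Build the URL for one page of the cfgmgmt nodes listing."""
    query = urlencode({"pagination.page": page, "pagination.size": size})
    return f"{base_url.rstrip('/')}/api/v0/cfgmgmt/nodes?{query}"


if __name__ == "__main__":
    # Print the URLs for the first three pages (placeholder host).
    for page in range(1, 4):
        print(cfgmgmt_nodes_url("https://automate.example.com", page))
```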