Closed zacharyDez closed 2 months ago
So, I could not reproduce the issue with 5-7 fields @andresfchamorro. Here's my quick and dirty benchmarking code:
import time
import timeit

import requests

# aoi, SUMMARY_ENDPOINT and available_fields are defined elsewhere in the script.


def fetch_summary(fields, warm_up=False):
    if warm_up:
        # Perform a warm-up request with minimal payload
        warm_up_payload = {
            "aoi": aoi,
            "spatial_join_method": "centroid",
            "fields": ["sum_pop_2020"],
            "geometry": "point"
        }
        requests.post(SUMMARY_ENDPOINT, json=warm_up_payload)
    # Request payload with the specified fields
    request_payload = {
        "aoi": aoi,
        "spatial_join_method": "centroid",
        "fields": fields,
        "geometry": "point"
    }
    response = requests.post(SUMMARY_ENDPOINT, json=request_payload)
    if response.status_code != 200:
        raise Exception(f"Failed to get summary: {response.text}")
    return response.json()


# Benchmark function with timing and optional cold start delay
def benchmark(fields, delay_before_request=0):
    if delay_before_request > 0:
        time.sleep(delay_before_request)  # Simulate a cold start by waiting
    execution_time = timeit.timeit(lambda: fetch_summary(fields), number=1)
    print(f"Time for {len(fields)} fields: {execution_time:.4f} seconds")


# Benchmark 0 to 7 fields back to back (no warm-up request, no delay)
for i in range(8):
    try:
        benchmark(available_fields[:i], delay_before_request=0)  # Warm start
    except Exception as e:
        print(f"Error: {str(e)}")
        print(f"Missed on: {i} with fields {available_fields[:i]}")
        break
And the results:
Time for 0 fields: 4.2473 seconds
Time for 1 fields: 4.3757 seconds
Time for 2 fields: 4.4787 seconds
Time for 3 fields: 4.4958 seconds
Time for 4 fields: 4.6102 seconds
Time for 5 fields: 4.7435 seconds
Time for 6 fields: 4.7888 seconds
Time for 7 fields: 5.1760 seconds
Every run varies a bit, but the response time stays between roughly 4.2 and 5.2 seconds per request, regardless of the number of fields.
It's possible that you requested data for a larger area (or something similar), which caused you to hit the Lambda size limits described in #35.
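Since each configuration above was timed only once, run-to-run noise is hard to separate from a real trend. A minimal sketch of repeating each measurement and reporting min/median/max is below; `time_repeated` is a hypothetical helper, not part of the benchmark script above, and it takes any zero-argument callable (for example `lambda: fetch_summary(fields)`).

```python
import statistics
import time
from typing import Callable, Dict


def time_repeated(func: Callable[[], object], repeats: int = 5) -> Dict[str, float]:
    """Call `func` `repeats` times and summarize the wall-clock durations.

    Hypothetical helper for illustration only.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

Comparing medians across field counts should make it easier to tell whether the small upward drift in the numbers above is real or just variance.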
@andresfchamorro could you share the exact steps you used to reproduce the issue? I want to confirm whether this is a duplicate of #35 or its own separate bug.
@zacharyDez Interesting, I was still working out of the Kenya AOI. In the context of population, to me it makes sense that someone would want the full list of demographic variables.
As long as we clearly document what the upper limit is (< 10?), I don't see this as an issue. We can always point to ways of looping requests, right?
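One way the "looping requests" idea could look on the client side is sketched below. This is only an illustration under assumptions: `chunked` and `fetch_in_batches` are hypothetical helpers, the batch size of 5 is a placeholder rather than a documented limit, and `fetch` stands in for whatever function actually calls the summary endpoint (such as `fetch_summary` above). How the per-batch responses should be merged depends on the response shape, so the sketch just returns them as a list.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def fetch_in_batches(
    fields: Sequence[str],
    fetch: Callable[[List[str]], object],
    batch_size: int = 5,  # placeholder value, not a documented limit
) -> List[object]:
    """Call `fetch` once per batch of fields and collect the responses in order."""
    return [fetch(batch) for batch in chunked(fields, batch_size)]
```

For example, `fetch_in_batches(available_fields, fetch_summary, batch_size=5)` would issue one request per group of five fields.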
@andresfchamorro; ok great. I'll close this issue as we have #35 and #37.
Describe the bug
@andresfchamorro reported during our last call that the summary endpoint has performance issues when more than 5 fields are requested.
To Reproduce
Expected behavior
Response time scales linearly with the number of fields requested.