Closed: flakey5 closed this 12 months ago
@UlisesGascon I think you have a GitHub Action for testing paths?
We need to figure out a way to measure the performance of the worker to find out if anything needs to be improved.
There are a few tools from Cloudflare we could investigate.
We could also use the Analytics Engine to generate download graphs.
I ran a directory-listing test on my personal bucket: I made ~330 folders with 27 files each, for 8910+ paths in total, and then listed it with the worker (not sure how these numbers compare to the actual dist folder).
Note: caching disabled
| Rendering | R2 cold | R2 hot | Total request duration | Our processing of the R2 response |
| --- | --- | --- | --- | --- |
| <1ms | 1075ms | 800-830ms | 1000-1400ms | <1ms |
| <1ms | 140ms | 90-100ms | 143-170ms | <1ms |
| <1ms | 800-1400ms | 700-1000ms | n/a | <1ms |
| <1ms | 100ms | 80-100ms | n/a | <1ms |
The more directories and files there are in the directory listing (possibly including files within the subdirectories?), the longer the R2 call takes to respond. Of course, the local workerd instance needs to make a remote call to R2 and back from my machine, which adds latency that would not be there with a deployed worker.
So, maybe there's a way we can optimize our calls to R2?
> So, maybe there's a way we can optimize our calls to R2?
So most of the processing time is coming from R2 calls?
Maybe the R2 calls we're doing are inefficient?
I think the slowness comes from large directories, where we need to make multiple calls to get the entire list: https://github.com/nodejs/release-cloudflare-worker/blob/35a51b189263f6fef57a0548d4f8a57ae8b776c6/src/handlers/strategies/directoryListing.ts#L120-L141
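For context, the linked code pages through R2 list results with a cursor, so a large directory costs one sequential round trip per page. A minimal sketch of that pattern, with simplified stand-in types (`R2ListResult`, `ListFn`, and `listAll` are illustrative names, not the worker's actual code):

```typescript
// Simplified stand-ins for the R2 list response and list call, modeled
// loosely on the Workers R2 API (hypothetical, not the worker's real types).
type R2ListResult = {
  objects: { key: string }[];
  delimitedPrefixes: string[];
  truncated: boolean;
  cursor?: string;
};

type ListFn = (opts: { prefix: string; cursor?: string }) => Promise<R2ListResult>;

// Follow the cursor until `truncated` is false; each page is a full,
// sequential round trip to R2, which is why deep listings add up.
async function listAll(list: ListFn, prefix: string) {
  const objects: { key: string }[] = [];
  const delimitedPrefixes: string[] = [];
  let cursor: string | undefined;
  let requests = 0;
  do {
    const page = await list({ prefix, cursor });
    objects.push(...page.objects);
    delimitedPrefixes.push(...page.delimitedPrefixes);
    cursor = page.truncated ? page.cursor : undefined;
    requests++;
  } while (cursor !== undefined);
  return { objects, delimitedPrefixes, requests };
}
```

Because each page depends on the previous page's cursor, the calls cannot be issued in parallel; reducing the page count, or caching the rendered result, is the main lever.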
My question is whether this should concern us? The page should be served from cache 99.99% of the time.
Do we have such large directories with more than 1000 entries?
Yeah, after the initial load it is faster with the cache, but I would still appreciate it if it were faster.
I feel that the bottleneck is definitely in the directory-listing cursor.
I wonder if there is a more efficient way of reading a directory's contents.
> Yeah after initial load with cache it is faster, but I would still appreciate if it was faster.
+1, the time it takes uncached is okay but not great, and I still think it's worthwhile to look into whether it can be improved.
I added a log locally to the directory listing code. For https://r2.nodejs.org/dist/:
```
request 0 , objects: 2 , delimitedPrefixes: 17 , total objects: 0 , total delimitedPrefixes: 17
request 1 , objects: 102 , delimitedPrefixes: 55 , total objects: 2 , total delimitedPrefixes: 72
request 2 , objects: 0 , delimitedPrefixes: 31 , total objects: 104 , total delimitedPrefixes: 103
request 3 , objects: 0 , delimitedPrefixes: 30 , total objects: 104 , total delimitedPrefixes: 133
request 4 , objects: 0 , delimitedPrefixes: 61 , total objects: 104 , total delimitedPrefixes: 194
request 5 , objects: 0 , delimitedPrefixes: 37 , total objects: 104 , total delimitedPrefixes: 231
request 6 , objects: 0 , delimitedPrefixes: 34 , total objects: 104 , total delimitedPrefixes: 265
request 7 , objects: 0 , delimitedPrefixes: 23 , total objects: 104 , total delimitedPrefixes: 288
request 8 , objects: 0 , delimitedPrefixes: 22 , total objects: 104 , total delimitedPrefixes: 310
request 9 , objects: 0 , delimitedPrefixes: 21 , total objects: 104 , total delimitedPrefixes: 331
request 10 , objects: 0 , delimitedPrefixes: 20 , total objects: 104 , total delimitedPrefixes: 351
request 11 , objects: 0 , delimitedPrefixes: 18 , total objects: 104 , total delimitedPrefixes: 369
request 12 , objects: 0 , delimitedPrefixes: 20 , total objects: 104 , total delimitedPrefixes: 389
request 13 , objects: 0 , delimitedPrefixes: 19 , total objects: 104 , total delimitedPrefixes: 408
request 14 , objects: 0 , delimitedPrefixes: 18 , total objects: 104 , total delimitedPrefixes: 426
request 15 , objects: 0 , delimitedPrefixes: 19 , total objects: 104 , total delimitedPrefixes: 445
request 16 , objects: 0 , delimitedPrefixes: 19 , total objects: 104 , total delimitedPrefixes: 464
request 17 , objects: 0 , delimitedPrefixes: 11 , total objects: 104 , total delimitedPrefixes: 475
```
Although https://developers.cloudflare.com/r2/api/workers/workers-api-reference/#r2listoptions says each call returns up to 1000 items, it doesn't seem to. My guess is that it recursively scans files and returns once it has encountered 1000 items, but that is just a guess.
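The log above supports that suspicion: 18 requests returned 104 objects and 475 delimited prefixes in total, i.e. roughly 32 visible entries per call, far below the documented 1000-item page size. A quick check of that arithmetic:

```typescript
// Totals taken from the final line of the log above
// (18 list calls for https://r2.nodejs.org/dist/).
const requests = 18;
const totalObjects = 104;
const totalPrefixes = 475;
const entriesPerRequest = (totalObjects + totalPrefixes) / requests;
console.log(entriesPerRequest.toFixed(1)); // ~32.2 visible entries per call, vs. a 1000-item limit
```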
> says each call returns 1000 items, it doesn't seem to.
Also tested and got similar results; I will raise it with the R2 team.
> We need to figure out a way to measure the performance of the worker to find out if anything needs to be improved.
So far from my testing we're seeing similar response times when fetching a cached file from the worker as when we fetch a cached file from the origin server.
Sometimes, though, directory listing is noticeably slow (e.g. https://r2.nodejs.org/dist/ takes ~7s uncached), which requires some further investigation.
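On the measurement side, even a small timing wrapper would let us compare cached vs. uncached requests consistently. A hedged sketch (the `timeCall` helper is a name and shape I'm proposing here, not an existing utility in the worker):

```typescript
// Hypothetical timing helper: wrap any async call (e.g. a fetch against the
// worker or against the origin server) and report how long it took in ms.
async function timeCall<T>(fn: () => Promise<T>): Promise<{ value: T; ms: number }> {
  const start = performance.now();
  const value = await fn();
  return { value, ms: performance.now() - start };
}

// Example usage (network call, so not run here):
//   const { ms } = await timeCall(() => fetch('https://r2.nodejs.org/dist/'));
```

Running this against the same URL repeatedly would separate the cold (uncached) samples from the hot (cached) ones, like the R2 cold/hot numbers reported earlier in the thread.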