@SyntaxRules-avant how large is large for you?
thanks for bumping @zenspider I lost track of this one...
I have seen the coverage page be slow to load on some very large monoliths, but since we weren't on Heroku or behind a forced timeout I haven't worried too much about it...
I am guessing the way we load the coverage data is an N+1 Redis hit for the admin view... I would likely need to break it into a process that pages through the data and loads it in smaller batches... That would take a bit of a refactor, but it would solve the issue.
In this case, since it is the admin reporting view, config.use_oneshot_lines_coverage = true will have no impact on the admin performance.
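For reference, that option lives in the standard Coverband configure block; a minimal sketch (the config file path is just the conventional one):

# config/coverband.rb
Coverband.configure do |config|
  # speeds up runtime line tracking; it does not change how
  # the admin report reads coverage data back out of Redis
  config.use_oneshot_lines_coverage = true
end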
@zenspider large in this case is ~2 million lines of code. Hitting the coverband webpage takes ~3mins.
@danmayer I like the approach you have described. Decoupling the initial page load and loading in the data in smaller batches would do it.
@SyntaxRules-avant do you have an estimate of the number of files? The N+1 is more per file than per LOC.
It is very slow for me too, and at some point it gets so slow that it times out, and we need to reset the stats.
In my case we are talking about "4491 files in total. 116074 relevant lines". The views page times out, so I don't know how many.
OK, so 4491 files isn't that many... but I could see things timing out in 30s especially on a slower redis...
I did verify that this is currently an N+1 lookup, as the core of getting the coverage data is:
files_set = files_set(local_type)
@redis.pipelined { |pipeline|
  files_set.each do |key|
    pipeline.hgetall(key)
  end
}
so I will need to rework a number of things so it can page through all the data, display the results, and calculate the final coverage percentage stats across the batches.
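A rough sketch of what that batching could look like; BATCH_SIZE and page_coverage are hypothetical names for illustration, not Coverband's actual API:

BATCH_SIZE = 250 # hypothetical page size

# page through the per-file hashes instead of pipelining every key at once
def page_coverage(redis, file_keys)
  file_keys.each_slice(BATCH_SIZE) do |batch|
    results = redis.pipelined do |pipeline|
      batch.each { |key| pipeline.hgetall(key) }
    end
    yield batch.zip(results).to_h
  end
end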
@danmayer Thanks for the extra details. The app I am running is "13,070 files in total. 446,533 relevant lines."
Note that two improvements landed in 6.0.2 that should help folks who were seeing slowness on reporting, thanks to @makicamel. Folks can give it a try and let us know how it helps. Note that one of them is opt-in, as it isn't on by default.
@danmayer I updated our staging environment to 6.0.2 and added the get_coverage_cache: true option to the HashRedisStore adapter, but the page still times out.
OK, folks I am finally making some progress on this... I have a small reproducible benchmark that shows the slowdown as additional files are added to a project. I will use it to drive some improvements and suggestions for folks.
OK, found some interesting things... while the HashRedisStore is faster in some cases and removes a rare race condition present in the traditional Redis store, it is far less performant with a large number of files when building the web report...
With a project that has 4500 Ruby files with coverage, the traditional store is twice as fast at reporting.
> COVERBAND_HASH_STORE=true COVERBAND_DISABLE_AUTO_START=true BENCHMARK_COVERBAND=true bundle exec rake coverband_benchmark
using hash store
1.042208 0.148724 1.190932 ( 3.672132)
> COVERBAND_DISABLE_AUTO_START=true BENCHMARK_COVERBAND=true bundle exec rake coverband_benchmark
0.729637 0.135307 0.864944 ( 0.988047)
OK, folks, while there are likely a few options to improve the performance, since the HashRedisStore algorithm stores and pulls data via a key per file, even with pipelining it can be slow for large projects. I am going to set up an optional reporter that pages through the data. This has some disadvantages, like not getting the full project coverage summary, but it should also be an easy way to get things usable for large projects.
Here is a video showing how it would work; let me know if this approach would work for you all @frsantos @SyntaxRules-avant @kamal
https://github.com/danmayer/coverband/assets/24925/5df32cfa-7064-49eb-9352-c0276defd875
Also, if someone can give me some latency stats around their Redis calls, that would be helpful. I am able to add arbitrary latency to my Redis benchmarks... but basing them on some real latency numbers would help me optimize this a bit more.
thanks
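For anyone wanting to gather those numbers, a rough sketch using the redis gem and Benchmark; the probe key name and sample count are arbitrary:

require 'benchmark'
require 'redis'

redis = Redis.new(url: ENV['REDIS_URL'])
timings = Array.new(100) do
  # measure a single round trip in milliseconds
  Benchmark.realtime { redis.get('coverband-latency-probe') } * 1000.0
end
puts format('GET avg %.2fms / min %.2fms / max %.2fms',
            timings.sum / timings.size, timings.min, timings.max)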
Thanks @danmayer
> OK, folks, while there are likely a few options to improve the performance, since the HashRedisStore algorithm stores and pulls data via a key per file, even with pipelining it can be slow for large projects. I am going to set up an optional reporter that pages through the data. This has some disadvantages, like not getting the full project coverage summary, but it should also be an easy way to get things usable for large projects.
The coverage summary would be useful but I think we can live without it. However, I see two problems with the paged approach: sorting and searching. Are those going to be available without a performance hit? Without those, having a list of thousands of elements is not usable at all.
> Also, if someone can give me some latency stats around their Redis calls, that would be helpful. I am able to add arbitrary latency to my Redis benchmarks... but basing them on some real latency numbers would help me optimize this a bit more.
Not sure which Redis call we are talking about here, but for me a GET call ranges from 0.7ms to 21ms with an average of 2ms, and an MGET ranges from 1ms to 70ms with an average of 4.5ms.
OK, yeah, it requires some extra work, but I could make the search work across the paging. I don't think I can do the normal table sorting, as that requires the full coverage data...
Alternatively, I could leave the search and filter as they are, so they work on everything loaded on the page, and have the summary and the file list start at zero and grow as data is loaded... so you could search and sort, but only over the data that has loaded so far...
Thanks, the Redis metrics will help me make my benchmarks match the situation you are seeing more closely.
The main feature we use is file search, for checking whether a class or method has really been used. The sorting is a nice tool to detect code that is no longer used. The summary is bells and whistles, but does not really help us detect dead code.
Also, a big problem is not just the table, but the file detail. A 4-line file takes many seconds to load, so even if the search can find it, it's very slow to show it. Right now it takes us 20s to show a 4-line file.
Paging would work fine for me as well. My use cases are similar: I'm mostly looking for dead files & functions. This seems to be a key benefit for larger code bases.
I am not using HashRedisStore, but my current version of coverband is 5.2.5. I'll work on getting it updated.
Yeah, at the moment loading a single file will still load the entire coverage report and then grab the one file... so we could definitely improve the way we load single files...
Interesting @SyntaxRules-avant, in some sense the original store scales a bit better as it doesn't have an N+1 problem, but it can also become a very large coverage blob... I do think adding gzip compression and possibly packaging as msgpack vs JSON could help with the more traditional Redis store... I can play around with that some more.
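A rough illustration of the compression idea, not how Coverband stores data today; the key name and payload shape are made up:

require 'json'
require 'redis'
require 'zlib'

redis = Redis.new(url: ENV['REDIS_URL'])
coverage = { 'app/models/user.rb' => [1, 0, nil, 3] }

# store the whole report as one compressed blob instead of raw JSON
redis.set('coverband-compressed-report', Zlib::Deflate.deflate(JSON.dump(coverage)))

# read it back and decompress before parsing
restored = JSON.parse(Zlib::Inflate.inflate(redis.get('coverband-compressed-report')))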
OK @frsantos, I have a big win on loading single files with the HashRedisStore. While the N+1 is problematic when building a full report, it made it easy to have the file-details view do a single small Redis lookup for a specific file... Benchmark running on a project with 4500 files:
# old implementation
COVERBAND_HASH_STORE=true BENCHMARK_COVERBAND=true bundle exec rake coverband_benchmark_single_file
0.611703 0.049188 0.660891 ( 1.928851)
# new single file lookup
COVERBAND_HASH_STORE=true BENCHMARK_COVERBAND=true bundle exec rake coverband_benchmark_single_file
0.009707 0.000001 0.009708 ( 0.062045)
This is on my paging branch so not merging / releasing it yet.
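Conceptually, the win is that the file-detail view can go straight to one per-file hash instead of rebuilding the whole report; the key construction below is illustrative only, not the store's real key format:

# illustrative only: one small HGETALL for the requested file
def file_details(redis, file_path)
  redis.hgetall("coverband.file.#{file_path}")
end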
OK, my branch now has data table paging backed by server-side data... I need to get it to properly pass the search term, and I have disabled the sorting feature... but I think this will get us to a good spot.
OK, closer, but even with background paging, the DOM and the data table get overwhelmed when I page in over 2K records... so now, with the basics working, I am thinking I can add server-side search and look at how to better optimize or leverage the UI paging.
Oh nice, fixed the perf issue with large row counts... Looks good: tested up to 7K files and it was still responding snappily. OK, I just need to clean up and cover a weird edge case with paging, and then I should be able to get a release candidate out for folks to test.
OK @frsantos and @SyntaxRules-avant, I have a release candidate that you all can try out to see if it lets you load your projects: 6.0.3.rc.1
OK, folks can try this out: it is in the released 6.1.0 main version, but you still need to use the config.paged_reporting = true option to enable it... Skipping paged reporting is far faster for me, but on projects where the report takes longer than 30 seconds, e.g. on Heroku, the paging option should allow projects of any size to load in the background of a tab. Thanks so much for all the feedback, support, and questions. Let me know if folks see other issues.
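For reference, a minimal sketch of the opt-in; the option name is the one quoted above, inside the standard Coverband configure block:

# config/coverband.rb
Coverband.configure do |config|
  config.paged_reporting = true # opt-in: page the web report in the background
end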
A few fixes and improvements have been released, so 6.1.2 should work for folks.
Is your feature request related to a problem? Please describe.
I'm trying Coverband on a very large application. Loading the coverage page takes 2-3 minutes. This is a little frustrating; it would be better if it were faster.
Some hosting providers, like Heroku, limit any one request to less than 30 seconds. When my app is hosted on infrastructure like that, the coverage page is unusable due to timeouts.
Describe the solution you'd like
Possible ways to speed up the page load:
Describe alternatives you've considered
Context
I have enabled config.use_oneshot_lines_coverage = true in an effort to make things faster. I am using Ruby 2.6.6.