Closed keegancsmith closed 2 months ago
[!WARNING] Test ownership level below 70% on this branch.
We aim to have ~70% of Bazel test targets to help us (dev-infra) and you get insight into the impact of tests on CI times. This is a non-blocking requirement, but keeping on top of test attribution to maintain this threshold is appreciated.
See the list of tag variable names to use here, and reach out to #discuss-dev-infra if you need to add/change this list.
See an example of how to use the variables here.
CI is legit failure. Will follow-up on a fix since looks like my "find references" failed to find some. But should still be reviewable.
Sorry if this is in your context link its coming across as broken for me.
To confirm, the main goal of this best effort v1, is that we've noticed that sending v1 telemetry has slowed down performance and that we are willing to "not wait" for v1 to send events, and whatever gets sent is sent. Would this be a fair picture of the goal? @keegancsmith
Sorry context broke after the rename. I will port this over to the new repo, but might as well discuss on here before doing that work. Here is the working link https://github.com/sourcegraph/sourcegraph-public-snapshot/pull/64481#issuecomment-2291526398
The context is this code path itself has been responsible for downtime on dotcom due to the high volume of writes (we think) conflicting with a migration trying to get a table lock. So in the other PR we are introducing a mechanism which applies backpressure on v1 store if there are too many concurrent writes. Given that, v1 writes can take a very long time and we don't want v1 not working to affect v2 writes.
I think that is a fair summary @bobheadxi? Does that make sense? If so, does this feel like a good approach?
Lets continue discussion in https://github.com/sourcegraph/sourcegraph/pull/53
In another PR addressing event_log tables being an issue, it was mentioned that we can just write to the legacy store without monitoring if it works or not. This commit implements that. Most of the code change is just prop drilling for the logger. We could simplify even further and not even log, but it feels like not being able to see the error anywhere would be hard to debug.
Test Plan: CI
Context: https://github.com/sourcegraph/sourcegraph/pull/64481#issuecomment-2291526398