cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.92k stars 3.78k forks source link

kvcoord: add `TxnCoordSender` method to refresh read spans #68051

Open erikgrinaker opened 3 years ago

erikgrinaker commented 3 years ago

To implement the Index Lookups Memory Limits and Parallelism RFC, the streamer will have to use leaf transactions to run parallel requests (as a root TxnCoordSender does not support this).

This will prevent the use of automatic span refreshing, instead propagating these errors to the client who will have to retry the transaction. We can avoid this by adding a method to the TxnCoordSender that, given a refresh timestamp, explicitly refreshes the read spans and handles errors as appropriate. The streamer will then coordinate in-flight requests, update the root txn with leaf state via UpdateRootWithLeafFinalState, and explicitly submit a refresh. The method should ensure we return appropriate info in any errors propagated to clients (i.e. the original keys that caused the refresh, as well as which keys caused the refresh to fail).

For more details, see the Hiding ReadWithinUncertaintyInterval errors section in the RFC.

/cc @cockroachdb/kv

nvanbenschoten commented 3 years ago

This is related to #24798.

erikgrinaker commented 2 years ago

@andreimatei So it turns out we already have an (unused) method TxnCoordSender.ManualRefresh() that does this:

https://github.com/cockroachdb/cockroach/blob/efefd3da263b8cc22093618f436a4d52c64261f9/pkg/kv/kvclient/kvcoord/txn_coord_sender.go#L1276-L1298

Since the streamer library will be calling TxnCoordSender.UpdateRootWithLeafFinalState to update the txn state, which would update Txn.WriteTimestamp if it's set, this may already be sufficient (with a few small tweaks to handle leaf txns, i.e. canAutoRetry). Or we could modify it to take in the refresh timestamp rather than reading it from Txn.WriteTimestamp. Am I missing anything?

andreimatei commented 2 years ago

What you say makes sense to me.

Or we could modify it to take in the refresh timestamp rather than reading it from Txn.WriteTimestamp.

Taking the timestamp to refresh to as an argument sounds good; if we don't need it, even better I'd say.

erikgrinaker commented 2 years ago

I've looked this over, and from what I can tell we have the main pieces we need already in place. I think the streamer would do something like this:

br, pErr := leafTxn.Send(ctx, ba)
if pErr != nil {
    canRefreshTxn, refreshTxn := roachpb.CanTransactionRefresh(ctx, pErr)
    if !canRefreshTxn {
        return pErr.GoError()
    }
    rootTxn.Sender().UpdateRootWithLeafFinalState(ctx, &roachpb.LeafTxnFinalState{
        Txn: *refreshTxn,
        RefreshSpans: nil, // set appropriate spans
    })
    if err := rootTxn.ManualRefresh(); err != nil {
        return err
    }
}

As such, I'm going to close this for now, but please reopen (or open a new issue) if you find that this isn't sufficient @yuzefovich.

yuzefovich commented 2 years ago

Hey @erikgrinaker, I just rebased my prototype for this on top of latest master, and roachpb.CanTransactionRefresh is gone, and I can't seem to find its replacement 😅 Could you please point me to it?

yuzefovich commented 2 years ago

Never mind, I found that it was refactored in #73557.

yuzefovich commented 2 years ago

@erikgrinaker I still need your help after all 😄

My current prototype is something like

    br, pErr := leafTxn.Send(ctx, ba)
    if pErr != nil {
        canRefreshTxn, refreshTS := roachpb.TransactionRefreshTimestamp(pErr)
        if !canRefreshTxn {
            return pErr.GoError()
        }

        leafTxnFinalState, err := leafTxn.GetLeafTxnFinalState(ctx)
        if err != nil {
            return err
        }
        refreshTxn := pErr.GetTxn()
        refreshTxn.Refresh(refreshTS)
        leafTxnFinalState.Txn = *refreshTxn
        rootTxn.Sender().UpdateRootWithLeafFinalState(ctx, leafTxnFinalState)
        if err := rootTxn.ManualRefresh(ctx); err != nil {
            return err
        }
    }

Does this look right? What happens to the old leafTxn is the refresh succeeds? Do I have to create a new leafTxn?

erikgrinaker commented 2 years ago

Hm, I'm not sure if that's going to work. Because refreshTxn.Refresh() forwards the read and write timestamps:

https://github.com/cockroachdb/cockroach/blob/01cb84707dd8a739c7b8aa0587383e8dac71507d/pkg/roachpb/data.go#L1135-L1139

But then ManualRefresh() is a noop if these timestamps are equal:

https://github.com/cockroachdb/cockroach/blob/52c2df4fb00d943328dc59df9dc210a40a556ab2/pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go#L441-L444

@nvanbenschoten Any thoughts on how to accomplish the above following #73557? Do we need to update ManualRefresh() or wire up a new method for this?

nvanbenschoten commented 2 years ago

This is interesting. I agree that refreshTxn.Refresh() is not the right thing to do for the reason you mentioned. Is it needed though? What would happen if we just did the following? Won't the leafTxn already be updated with an increased WriteTimestamp that reflects the pErr by the time we hear about it?

    br, pErr := c.Send(ctx, ba)
    if pErr != nil {
        canRefreshTxn, refreshTS := roachpb.TransactionRefreshTimestamp(pErr)
        if !canRefreshTxn {
            return pErr.GoError()
        }

        leafTxnFinalState, err := leafTxn.GetLeafTxnFinalState(ctx)
        if err != nil {
            return err
        }
        rootTxn.Sender().UpdateRootWithLeafFinalState(ctx, leafTxnFinalState)
        if err := rootTxn.ManualRefresh(ctx); err != nil {
            return err
        }
    }
yuzefovich commented 2 years ago

Is it on me to try the latest proposal from Nathan and see if it works? Or are you still thinking about how to achieve the manual refresh?

erikgrinaker commented 2 years ago

Sorry, this fell through the cracks. I would go with Nathan's suggestion for now, and I'll look this over real quick when I have a chance.

erikgrinaker commented 2 years ago

Yeah, I had a quick look, and I think that should work -- the leaf txn should have its WriteTimestamp bumped by the error, which gets propagated to the root txn here and then refreshed. But it's definitely worth writing an integration test for this.

yuzefovich commented 2 years ago

@erikgrinaker @nvanbenschoten Apologies for having to page the context back in, but I finally got to working on this, and maybe the solution proposed by Nathan doesn't quite work (or at least I'm misusing it).

In https://github.com/yuzefovich/cockroach/tree/streamer-refresh in the last two commits I'm introducing a test for this behavior as well as a fix (which is currently incomplete, but I don't think it's important). With the snippet similar to https://github.com/cockroachdb/cockroach/issues/68051#issuecomment-1013324707, when trying to re-execute the BatchRequest, I'm getting the same injected RWUI error coming from here. It seems as if the TxnCoordSender of the leafTxn is poisoned and ManualRefresh didn't fix it.

Is it the problem with the test where I inject RWUI error and it is not reflected in the refresh spans of LeafTxnFinalState? Or should I be creating a new LeafTxn once the root txn is refreshed?

nvanbenschoten commented 2 years ago

when trying to re-execute the BatchRequest, I'm getting the same injected RWUI error coming from here

Are you creating a new leaf transaction after the manual refresh? Once a leaf transaction has hit this error, I think it needs to be ingested into the root txn (through UpdateRootWithLeafFinalState) and then discarded.

Also, could you try changing this line from roachpb.NewError(...) to roachpb.NewErrorWithTxn(..., ba.Txn)?

yuzefovich commented 2 years ago

Are you creating a new leaf transaction after the manual refresh?

No, I wasn't creating a new leaf and was using the old one. After creating a fresh LeafTxn the retried BatchRequest goes through, but it still has the old timestamp which goes against my intuition. It looks like we probably should not be just ignoring refreshTS value from the snippet - how should we incorporate it? My expectation is that after ManualRefresh and creating a new LeafTxn, the new LeafTxn should be using refreshTS value as its ReadTimestamp, but it is currently not the case.

Also, could you try changing this line from roachpb.NewError(...) to roachpb.NewErrorWithTxn(..., ba.Txn)?

This doesn't seem to change anything (neither w/ nor w/o re-creating a leaf txn).

nvanbenschoten commented 2 years ago

but it still has the old timestamp which goes against my intuition

This goes against my intuition as well. I think it has to do with the NewErrorWithTxn. In cases where a transaction's timestamp is bumped due to a tscache conflict (write-read conflict), a closed timestamp conflict (write-read conflict), a write-too-old conflict (write-write conflict), and most others, the returned Txn already has its WriteTimestamp bumped. That will cause TxnCoordSender.updateStateLocked to bump the coordinator's timestamp.

We're not doing this in the test, so I think that explains things. However, I'm not sure that we do bump the write timestamp on a ReadWithinUncertaintyInterval error before performing the rest of the refresh, because a txn can't proceed with evaluation until the error is refreshed away. So the error does not carry a provisional updated Txn.

With that in mind, it does feel like we should be using the refreshTS straight from TransactionRefreshTimestamp and feeding that into a new ManualRefreshTo(context.Context, hlc.Timestamp) method which first forwards the txn's WriteTimestamp to the provided timestamp before performing the refresh.

yuzefovich commented 2 years ago

I see, I think this makes sense, thanks Nathan. It almost worked right away - I started getting

TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh - unknown error: <nil>): "unnamed" meta={id=9e4d6b4c key=/Min pri=0.00941665 epo=0 ts=1653076575.760907001,1 min=1653076575.760907000,0 seq=0} lock=false stat=PENDING rts=1653076575.760907000,0 wto=false gul=1653076576.260907000,0

during ManualRefreshTo. Apparently, txnSpanRefresher.refreshedTimestamp was empty causing this error during a refresh, so for now I added a bump to that timestamp in pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go. Could you please take a quick look at this snippet?

nvanbenschoten commented 2 years ago

I don't think that snippet is quite right. The handling of txnSpanRefresher.refreshedTimestamp is pretty funky so I see why you wrote the code that way. However, as written, I think you're bumping the refreshTxn's ReadTimestamp to the write timestamp and then using this to set refreshedTimestamp. So the refresh would be a no-op in this case.

Let me see whether I can clean up this handling of txnSpanRefresher.refreshedTimestamp so that you don't need to deal with this.

nvanbenschoten commented 2 years ago

https://github.com/cockroachdb/cockroach/pull/82649 should avoid the need to hack around the empty txnSpanRefresher.refreshedTimestamp.

github-actions[bot] commented 9 months ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to CockroachDB!