Azure / azure-cosmos-dotnet-v3

.NET SDK for Azure Cosmos DB for the core SQL API
MIT License

CFP AVAD: Need APIs to validate partitionkeys and feedranges against a list of ranges. #4483

Open philipthomas-MSFT opened 4 months ago

philipthomas-MSFT commented 4 months ago

Description

Stakeholders

Problem statement

The customer needs a way to check a feed range and/or a partition key against a given list of feed ranges. The purpose is to "bookmark" document changes as (FeedRange(minInclusive, maxExclusive), LSN) pairs, so that the customer can validate whether the "bookmark" for a document change was already processed in a previous feed iteration. Each feed iteration returns a feed range, partition key, and LSN, either as part of the ChangeFeedProcessorContext type or on the changed document. The addition of the feed range is covered here. Validation needs to be done by both partition key and feed range.

The PR (4566) exposes two new IsSubset API methods, which check either a partition key or a child feed range against a parent feed range.
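To make the two checks concrete, here is a minimal sketch of the containment logic in Python. It is not the SDK's implementation: it assumes a feed range is a half-open interval [minInclusive, maxExclusive) of effective partition key (EPK) hex strings that can be compared as plain strings, and it takes the EPK of a partition key as a given input rather than computing the hash. The function names `is_subset_range` and `is_subset_epk` are hypothetical.

```python
from typing import Tuple

# Hypothetical stand-in for a feed range: (min_inclusive, max_exclusive) EPK strings.
Range = Tuple[str, str]


def is_subset_range(parent: Range, child: Range) -> bool:
    """True if the child range lies entirely inside the parent range."""
    parent_min, parent_max = parent
    child_min, child_max = child
    # Assumption: EPK bounds order correctly under plain string comparison.
    return parent_min <= child_min and child_max <= parent_max


def is_subset_epk(parent: Range, epk: str) -> bool:
    """True if a single EPK value falls inside the parent range."""
    parent_min, parent_max = parent
    # The lower bound is inclusive, the upper bound is exclusive.
    return parent_min <= epk < parent_max


if __name__ == "__main__":
    print(is_subset_range(("", "AA"), ("0A", "AA")))
    print(is_subset_epk(("0A", "AA"), "0B"))
```

In the real SDK the partition key would first have to be mapped to its EPK, which depends on the container's partition key definition; that step is deliberately left out here.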

"Bookmark" sample:

[
    {
        "Range": {
            "min": "",
            "max": "05C1DFFFFFFFFC"
        },
        "LSN": "0"
    }
]
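As an illustration of how such a bookmark list might be consumed, the sketch below parses the sample into (min, max, LSN) tuples. The field names come from the sample above; the helper name `parse_bookmarks` and the choice to convert the LSN string to an integer are assumptions.

```python
import json
from typing import List, Tuple

# (min_inclusive, max_exclusive, lsn)
Bookmark = Tuple[str, str, int]

SAMPLE = """
[
    {
        "Range": {
            "min": "",
            "max": "05C1DFFFFFFFFC"
        },
        "LSN": "0"
    }
]
"""


def parse_bookmarks(text: str) -> List[Bookmark]:
    """Turn the JSON bookmark list into (min, max, lsn) tuples."""
    bookmarks = []
    for item in json.loads(text):
        rng = item["Range"]
        # Assumption: the LSN is stored as a decimal string, as in the sample.
        bookmarks.append((rng["min"], rng["max"], int(item["LSN"])))
    return bookmarks


if __name__ == "__main__":
    print(parse_bookmarks(SAMPLE))
```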

[Figure: playfab_flow]

Original rough information

{
    "id": 1,
    "billing": [],
    "bookmarks": [
      { feedRange: "''-'AA'", lsn: 5 },
      ....
    ]
}

Changes[] to process
Filter for TitleId 1

Sort changes of TitleId1 by LSN and identify highest/last LSN --> 7

Add bookmark for FeedRange of current physical partition + highestLSNForTitleId1

{
    "id": 1,
    "billing": [],
    "bookmarks": [
      { feedRange: "''-'AA'", lsn: 5 },
      { feedRange: "'0A'-'AA'", lsn: 7 },
      --> '' - '0A': 5
    ]
}
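One possible reading of the `--> '' - '0A': 5` annotation is that adding a bookmark for a sub-range at a higher LSN leaves the rest of the older range at its previous LSN. The sketch below normalizes the list that way. It is an assumption about the intended bookkeeping, not something the issue specifies; `add_bookmark` is a hypothetical helper, and EPK bounds are again compared as plain strings.

```python
from typing import List, Tuple

# (min_inclusive, max_exclusive, lsn)
Bookmark = Tuple[str, str, int]


def add_bookmark(bookmarks: List[Bookmark], new_min: str, new_max: str,
                 new_lsn: int) -> List[Bookmark]:
    """Add a bookmark and trim older overlapping bookmarks around it."""
    result: List[Bookmark] = []
    for lo, hi, lsn in bookmarks:
        no_overlap = hi <= new_min or new_max <= lo
        if no_overlap or lsn >= new_lsn:
            # Keep bookmarks that are untouched or already at/after the new LSN.
            result.append((lo, hi, lsn))
            continue
        if lo < new_min:
            # Part of the old range left of the new one keeps its old LSN.
            result.append((lo, new_min, lsn))
        if new_max < hi:
            # Part of the old range right of the new one keeps its old LSN.
            result.append((new_max, hi, lsn))
    result.append((new_min, new_max, new_lsn))
    return sorted(result)


if __name__ == "__main__":
    print(add_bookmark([("", "AA", 5)], "0A", "AA", 7))
```

Simply appending the new bookmark and resolving overlaps at lookup time (by taking the highest overlapping LSN) would be an equally valid design; the trimmed form just keeps the list smaller.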

------------

Check for exactly once processing
Find Changes by TitleId1
Read Title doc for TitleId1
For each change {
Identify full PK
Check whether LSN of change with PK xyz has been processed yet
}

boolean hasChangeBeenProcessed(PartitionKey pk, long lsnOfChange, List<(FeedRange, long)> titleBookmarks) {
--> lsnOfChange 9
--> PK has EPK '0B' --> changes processed until LSN 7
--> highestOverlappingLsn = highest LSN among the bookmarks whose FeedRange contains the PK's EPK (here 7)

return lsnOfChange <= highestOverlappingLsn
}
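The check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SDK code: bookmarks are (min, max, LSN) tuples over half-open EPK ranges compared as plain strings, the EPK of the partition key is passed in directly, and a change with no overlapping bookmark is treated as not yet processed.

```python
from typing import List, Tuple

# (min_inclusive, max_exclusive, lsn)
Bookmark = Tuple[str, str, int]


def has_change_been_processed(epk: str, lsn_of_change: int,
                              bookmarks: List[Bookmark]) -> bool:
    """True if some bookmark covering the EPK is at or past the change's LSN."""
    overlapping_lsns = [lsn for lo, hi, lsn in bookmarks if lo <= epk < hi]
    if not overlapping_lsns:
        # Assumption: no covering bookmark means nothing was processed yet.
        return False
    return lsn_of_change <= max(overlapping_lsns)


if __name__ == "__main__":
    title_bookmarks = [("", "AA", 5), ("0A", "AA", 7)]
    # EPK '0B' is covered up to LSN 7, so a change at LSN 9 is still pending.
    print(has_change_been_processed("0B", 9, title_bookmarks))
```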

Task<List<FeedRange>> FindOverlappingRangesAsync(PartitionKey pk, List<FeedRange> feedRanges)
Task<List<FeedRange>> FindOverlappingRangesAsync(FeedRange feedRange, List<FeedRange> feedRanges)
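A rough sketch of what the two overloads would compute, written as synchronous Python for brevity. The names and the tuple representation are hypothetical. Ranges are assumed to be half-open [min, max) EPK intervals compared as plain strings, and the partition key overload takes an already computed EPK.

```python
from typing import List, Tuple

# (min_inclusive, max_exclusive)
Range = Tuple[str, str]


def find_overlapping_ranges_for_epk(epk: str, feed_ranges: List[Range]) -> List[Range]:
    """Ranges that contain the given EPK."""
    return [(lo, hi) for lo, hi in feed_ranges if lo <= epk < hi]


def find_overlapping_ranges_for_range(target: Range,
                                      feed_ranges: List[Range]) -> List[Range]:
    """Ranges that share at least one EPK value with the target range."""
    target_lo, target_hi = target
    # Two half-open intervals overlap when each starts before the other ends.
    return [(lo, hi) for lo, hi in feed_ranges if lo < target_hi and target_lo < hi]


if __name__ == "__main__":
    ranges = [("", "0A"), ("0A", "AA"), ("AA", "FF")]
    print(find_overlapping_ranges_for_epk("0B", ranges))
    print(find_overlapping_ranges_for_range(("05", "0C"), ranges))
```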