Closed girarda closed 1 year ago
grooming notes:
https://www.loom.com/share/4d8dd2388e4d4a30a9f912a2a57a16df
AIPredictionEvent
AccountChangeEvent
AccountContactRoleChangeEvent
ActivityHistory
AggregateResult
ApiAnomalyEvent
ApiEventStream
AsgnRsrcApptSchdEvent
AssetChangeEvent
AssetTokenEvent
AssignedResourceChangeEvent
AsyncOperationEvent
AsyncOperationStatus
AttachedContentDocument
AuthorizationFormConsentChangeEvent
BatchApexErrorEvent
BriefcaseAssignmentChangeEvent
BriefcaseDefinitionChangeEvent
BulkApiResultEvent
CampaignChangeEvent
CampaignMemberChangeEvent
CampaignMemberStatusChangeEvent
CaseChangeEvent
CombinedAttachment
CommSubscriptionConsentChangeEvent
ConcurLongRunApexErrEvent
ContactChangeEvent
ContactPointAddressChangeEvent
ContactPointConsentChangeEvent
ContactPointEmailChangeEvent
ContactPointPhoneChangeEvent
ContactPointTypeConsentChangeEvent
ContentBody
ContentDocumentChangeEvent
ContentDocumentLinkChangeEvent
ContentVersionChangeEvent
ContractChangeEvent
ContractLineItemChangeEvent
CredentialStuffingEvent
DataObjectDataChgEvent
EmailMessageChangeEvent
EmailStatus
EmailTemplateChangeEvent
EntitlementChangeEvent
EventChangeEvent
EventRelationChangeEvent
EventRelayConfigChangeEvent
FeedLike
FeedSignal
FeedTrackedChange
FileEvent
FinanceBalanceSnapshotChangeEvent
FinanceTransactionChangeEvent
FlowExecutionErrorEvent
FlowOrchestrationEvent
FolderedContentDocument
IndividualChangeEvent
LeadChangeEvent
LightningUriEventStream
ListEmailChangeEvent
ListViewEventStream
LiveChatTranscriptChangeEvent
LocationChangeEvent
LoginAsEventStream
LoginEventStream
LogoutEventStream
LookedUpFromActivity
MacroChangeEvent
MacroInstructionChangeEvent
Name
NoteAndAttachment
OpenActivity
OperatingHoursChangeEvent
OpportunityChangeEvent
OpportunityContactRoleChangeEvent
OrderChangeEvent
OrderItemChangeEvent
OrgLifecycleNotification
OwnedContentDocument
PartyConsentChangeEvent
PendingOrdSumProcEvent
PendingOrderSummaryChangeEvent
PermissionSetEvent
PlatformStatusAlertEvent
Pricebook2ChangeEvent
PricebookEntryChangeEvent
ProcessExceptionEvent
ProcessInstanceHistory
Product2ChangeEvent
ProductAttributeChangeEvent
ProductCatalogChangeEvent
ProductCategoryChangeEvent
ProductCategoryProductChangeEvent
QuickTextChangeEvent
QuoteTemplateRichTextData
RecommendationChangeEvent
RemoteKeyCalloutEvent
ReportAnomalyEvent
ReportEventStream
ResourceAbsenceChangeEvent
ResourcePreferenceChangeEvent
ReturnOrderChangeEvent
ReturnOrderLineItemChangeEvent
ServiceAppointmentChangeEvent
ServiceContractChangeEvent
ServiceResourceChangeEvent
ServiceResourceSkillChangeEvent
ServiceTerritoryChangeEvent
ServiceTerritoryMemberChangeEvent
SessionHijackingEvent
ShiftChangeEvent
SkillChangeEvent
SkillRequirementChangeEvent
SvcApptSchdEvent
TaskChangeEvent
Test_Custom_Object__ChangeEvent
TimeSlotChangeEvent
UriEventStream
UserChangeEvent
WebStoreBuyerGroupChangeEvent
WebStoreCatalogChangeEvent
WebStoreChangeEvent
WorkTypeChangeEvent
data__ChangeEvent
pi__AsyncRequest_Settings__ChangeEvent
pi__AsyncRequest__ChangeEvent
pi__Category_Contact_Score__ChangeEvent
pi__Category_Lead_Score__ChangeEvent
pi__Demo_Settings__ChangeEvent
pi__EngageCampaignRecipient__ChangeEvent
pi__LDFilter__ChangeEvent
pi__ObjectChangeLog__ChangeEvent
pi__PardotTask__ChangeEvent
pi__Pardot_Scoring_Category__ChangeEvent
pi__Partner_Settings__ChangeEvent
pi__Trigger_Settings__ChangeEvent
Is this an issue? Are we comfortable with assuming Concurrent CDK will work without testing on these?
Notes:
tmp% grep " records from " salesforce_concurrent.jsonl
{"type": "LOG", "log": {"level": "INFO", "message": "Read 42 records from Account stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 1210 records from ActiveFeatureLicenseMetric stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 3254 records from ActivePermSetLicenseMetric stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 4536 records from ActiveProfileMetric stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 19 records from AppDefinition stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 25 records from Asset stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 399 records from FormulaFunctionAllowedType stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 1924 records from ObjectPermissions stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 5281 records from PermissionSetTabSetting stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 3 records from LeadHistory stream"}}
tmp% grep " records from " salesforce_before.jsonl
{"type": "LOG", "log": {"level": "INFO", "message": "Read 42 records from Account stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 1210 records from ActiveFeatureLicenseMetric stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 3254 records from ActivePermSetLicenseMetric stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 4536 records from ActiveProfileMetric stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 19 records from AppDefinition stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 25 records from Asset stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 399 records from FormulaFunctionAllowedType stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 1924 records from ObjectPermissions stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 5281 records from PermissionSetTabSetting stream"}}
{"type": "LOG", "log": {"level": "INFO", "message": "Read 3 records from LeadHistory stream"}}
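The two grep outputs above were compared by eye. A minimal sketch of how the same check could be automated is below; it assumes only the `Read N records from X stream` log message format visible in the output, and the helper names `record_counts` and `diff_counts` are hypothetical, not part of the connector or the CDK.

```python
import json
import re
from typing import Dict, Iterable, Optional, Tuple

# Matches the log message format shown in the grep output above.
_PATTERN = re.compile(r"Read (\d+) records from (\S+) stream")


def record_counts(lines: Iterable[str]) -> Dict[str, int]:
    """Map stream name -> record count for every matching LOG message."""
    counts: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if not isinstance(message, dict) or message.get("type") != "LOG":
            continue
        text = (message.get("log") or {}).get("message", "")
        match = _PATTERN.search(text)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def diff_counts(
    before: Dict[str, int], after: Dict[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return only the streams whose counts differ (None = stream missing)."""
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
```

Feeding the lines of `salesforce_before.jsonl` and `salesforce_concurrent.jsonl` through `record_counts` and passing both results to `diff_counts` should return an empty dict when the two runs read the same number of records per stream. This only compares counts, not record contents.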
* Streams relying on bulk/jobs are sleeping. We could still end up with all our workers sleeping, so there would still be room for improvement, but I don't think it should be a focus for us.
Additional acceptance criteria that came up during grooming today:
Note that all the streams that are not queryable seem to be instantiated the same way (see SourceSalesforce.generate_streams). The streams we have in our test environment cover only 3 of the 4 possible stream types for Salesforce, i.e. IncrementalRestSalesforceStream, BulkSalesforceStream and BulkIncrementalSalesforceStream. RestSalesforceStream is not covered by either config.json or config_bulk.json. Hence we may lack a bit of visibility here.
As for rate limiting, Salesforce uses a rolling 24-hour window with a base number of allowed requests that depends on whether it is Developer Edition or Salesforce Edition. On top of that, Salesforce will increase the number of requests on the Salesforce Edition based on how many licenses you have and what type of licenses they are. For example, one "Customer Community Plus" and one "External Identity 25,000 SKU" will allow you to perform 70,200 requests on top of the 100,000 from the Salesforce Edition. source.
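To make the rolling-window idea concrete, here is a minimal client-side sketch. It is an assumption about how one could track a budget locally, not how Salesforce enforces the limit, and the class name `RollingRequestBudget` is hypothetical. With the figures from the example above, the limit would be 100,000 + 70,200 = 170,200 requests per 24 hours.

```python
import time
from collections import deque
from typing import Deque, Optional

WINDOW_SECONDS = 24 * 60 * 60  # rolling 24-hour window


class RollingRequestBudget:
    """Hypothetical local tracker for a rolling 24-hour request limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._timestamps: Deque[float] = deque()  # oldest request first

    def _evict(self, now: float) -> None:
        # Drop requests that have fallen out of the 24-hour window.
        while self._timestamps and now - self._timestamps[0] >= WINDOW_SECONDS:
            self._timestamps.popleft()

    def remaining(self, now: Optional[float] = None) -> int:
        now = time.time() if now is None else now
        self._evict(now)
        return self.limit - len(self._timestamps)

    def try_acquire(self, now: Optional[float] = None) -> bool:
        """Record one request if budget is left; return False otherwise."""
        now = time.time() if now is None else now
        if self.remaining(now) <= 0:
            return False
        self._timestamps.append(now)
        return True


# Figures taken from the example above: base allowance plus license add-on.
EXAMPLE_LIMIT = 100_000 + 70_200
```

Because the window is rolling rather than reset at a fixed time, budget frees up gradually as old requests age out. A sync that exhausts the limit is therefore not unblocked all at once.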
Random thoughts on rate limiting:
What
The concurrent CDK is being developed using source-stripe as the first connector. Stripe is a good first use case because it is straightforward.
We know we also want to speed up the Salesforce connector, which is significantly more complicated.
I hypothesize that the structure of the concurrent CDK will allow the Salesforce connector to leverage it, but we should derisk this with a PoC.
How
We don’t need to implement custom partitions for the Salesforce connector. It should be enough to use the legacy adapter and wrap the streams with StreamFacade.create_from_legacy_stream.
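The wrapping approach can be illustrated with stand-in classes. This sketch does not import the CDK and does not reproduce the real signature of `StreamFacade.create_from_legacy_stream`. `LegacyStreamSketch`, `FacadeSketch` and `read_concurrently` are hypothetical names that only show the shape of the idea: existing synchronous streams stay unchanged, and a thin wrapper lets a worker pool read them in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, Iterable, List


class LegacyStreamSketch:
    """Hypothetical stand-in for an existing synchronous stream."""

    def __init__(self, name: str, records: List[Dict[str, Any]]) -> None:
        self.name = name
        self._records = records

    def read_records(self) -> Iterable[Dict[str, Any]]:
        yield from self._records


class FacadeSketch:
    """Hypothetical wrapper; NOT the CDK's StreamFacade."""

    def __init__(self, legacy: LegacyStreamSketch) -> None:
        self._legacy = legacy

    @classmethod
    def create_from_legacy_stream(cls, legacy: LegacyStreamSketch) -> "FacadeSketch":
        # No custom partitioning: the whole legacy stream is one unit of work.
        return cls(legacy)

    @property
    def name(self) -> str:
        return self._legacy.name

    def read_all(self) -> List[Dict[str, Any]]:
        return list(self._legacy.read_records())


def read_concurrently(
    streams: List[LegacyStreamSketch], max_workers: int = 2
) -> Dict[str, List[Dict[str, Any]]]:
    """Wrap each legacy stream and read all of them on a thread pool."""
    facades = [FacadeSketch.create_from_legacy_stream(s) for s in streams]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {f.name: pool.submit(f.read_all) for f in facades}
        return {name: future.result() for name, future in futures.items()}
```

In this sketch each stream is a single unit of work, so a stream that spends most of its time waiting (for example on a bulk job) holds its worker for the whole wait.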
Acceptance criteria
Either: