SFDO-Tooling / CumulusCI

Python framework for building portable automation for Salesforce projects
http://cumulusci.readthedocs.io
BSD 3-Clause "New" or "Revised" License
363 stars 242 forks source link

Polymorphic Lookup Data Extraction #3741

Closed aditya-balachander closed 9 months ago

aditya-balachander commented 9 months ago

Epic: 250 - Polymorphic data extraction

Polymorphic Lookups are references that can point to multiple SObjects. Consider the Event SObject, where the WhoId field is a reference to both Contact and Lead. Previously, CumulusCI didn't handle these polymorphic lookup fields during extraction and loading. The goal of the Epic was to include this support in CumulusCI.

Old Format: (mapping.yml)

Insert Contact:
    sf_object: Contact
    api: rest
    fields:
        - LastName
    lookups:
        AccountId: Account

New Format: (mapping.yml)

Insert Event:
    sf_object: Event
    api: rest
    fields:
        - Subject
    lookups:
        WhoId:
            - Contact
            - Lead

The Epic spanned over multiple Work Items:

  1. W-14692643 Identify the reference entity
  2. W-14692643 Store the reference entity name in dataset
  3. W-14736696 Resolve the reference while loading the data

To accommodate polymorphic lookups for the data extraction and data loading, format of Id’s is changed while extraction. While loading to allow polymorphic fields multiple sf_id table were converted into single sf_id table. That is after the insertion, the sf_ids are queried for the lookups and all these entries are stored in a single sf_id table named cumulusci_id_table. So, in the processing it is again a single to outer join for the polymorphic lookup field.

File Name Test Method Name Description
cumulusci/tasks/bulkdata/tests/test_load.py test_query_db__joins_polymorphic_lookups Test if correct query is being generated for polymorphic lookup fields
cumulusci/tasks/bulkdata/tests/test_load.py test_generate_contact_id_map_for_person_accounts Test if correct id field is generated for load with both old format and new format
cumulusci/tasks/bulkdata/tests/test_mapping_parser.py test_infer_and_validate_lookups__table_doesnt_exist Test that during load, the mapping file contains all the sobjects mentioned in the lookups
cumulusci/tasks/bulkdata/tests/test_mapping_parser.py test_infer_and_validate_lookups__incorrect_order Test that during load, the mapping file has the sobjects in the correct order such that an sobject mentioned in lookup does not come before the sobject is mentioned directly
cumulusci/tasks/bulkdata/tests/test_mapping_parser.py test_infer_and_validate_lookups__after Test that during load, after steps are getting generated correctly for lookups
cumulusci/tasks/bulkdata/tests/test_mapping_parser.py test_infer_and_validate_lookups__invalid_reference Test that during load, the sobject mentioned in the lookup is the actual reference sobject as per org schema
  1. W-14736698 Polymorphic data extraction and testing
File Name Test Method Name Description
cumulusci/tasks/bulkdata/tests/test_extract.py test_run__poly__polymorphic_lookups Test Extract for polymorphic lookups (mapping file: cumulusci/tasks/bulkdata/tests/mapping_poly.yml)
cumulusci/tasks/bulkdata/tests/test_extract.py test_run__poly__wrong_mapping Test Extract throws an error for a mapping file with a missing table (mapping file: cumulusci/tasks/bulkdata/tests/mapping_poly_wrong.yml)
cumulusci/tasks/bulkdata/tests/test_extract.py test_run__poly__incomplete_mapping Test Extract throws an error for a mapping file with an incomplete number of reference tables. For example, if the org has both Contact and Lead records and you only specify Contact in your mapping file, it throws an error saying to mention all references for the field (mapping file: cumulusci/tasks/bulkdata/tests/mapping_poly_incomplete.yml)
cumulusci/tasks/bulkdata/tests/test_mapping_parser.py test_get_extract_field_list Test that the fields come in the correct order, i.e., Id first, then the other fields, then the lookup fields for the get_extract_field_list function
cumulusci/tasks/bulkdata/tests/test_utils.py test_create_table_modern_id_mapping Assert that the local id column of the sobject table is of type Unicode instead of Integer (previous)
  1. W-14863785 Support for polymorphic lookups during generation of mapping file
File Name Test Method Name Description
cumulusci/core/tests/test_datasets_e2e.py test_datasets_read_explicit_extract_declaration Given extract file and loading rules with polymorphic lookups, see if the correct mapping file is generated
cumulusci/tasks/bulkdata/extract_dataset_utils/tests/test_synthesize_extract_declarations.py test_required_lookups__pulled_in__polymorphic_lookups Given extract file with only sobject Event and field WhoId, see if Contact and Lead sobjects (which are references) are pulled in before extraction or load
cumulusci/tasks/bulkdata/generate_mapping_utils/tests/test_generate_extract_mapping_from_declarations.py test_generate_mapping_from_declarations__polymorphic_lookups Given the ExtractDeclaration for sobjects with polymorphic lookups, see if the correct extract mapping file is getting generated
cumulusci/tasks/bulkdata/generate_mapping_utils/tests/test_generate_load_mapping_from_declarations.py test_generate_load_mapping_from_declarations__polymorphic_lookups Given the ExtractDeclaration for sobjects with polymorphic lookups, see if the correct load mapping file is getting generated
cumulusci/tasks/bulkdata/generate_mapping_utils/tests/test_mapping_generator_post_processes.py test_add_after_statements__polymorphic_lookups Given multiple MappingStep with sobjects having polymorphic lookups, see if the correct after statement is getting generated. The after statement for polymorphic fields should be after the last reference sobject
cumulusci/tasks/bulkdata/tests/test_generatemapping.py test_build_mapping__polymorphic_lookups Given an org schema with polymorphic lookup fields, see if the correct mapping file is generated using the GenerateMapping class.
cumulusci/tasks/bulkdata/tests/test_mapping_parser.py test_validate_and_inject_mapping_removes_lookups_with_drop_missing__polymorphic_partial_present Ensure that if polymorphic lookups with only partially present tables are there, it does not remove the lookup. For example, in your mapping file, you include Contact but not Lead and you mention the WhoId field (of sobject Event) with reference to both Contact and Lead, ensure that it does not remove the lookup field
cumulusci/tasks/bulkdata/tests/test_mapping_parser.py test_validate_and_inject_mapping_removes_lookups_with_drop_missing__polymorphic_none_present Ensure that if polymorphic lookups with none of the tables are there, it removes the lookup. For example, in your mapping file, you don't have both Contact and Lead and you mention the WhoId field (of sobject Event) with reference to both Contact and Lead, ensure that it removes the lookup field.
  1. W-14863821 Volume Testing of Polymorphic lookups

Quip Document

Test Result
Org with 15K Accounts, 11K Case records and 20K Event record in polymorphic relationship with Account and Case on WhatId field Time for Extract: 30s. Time for Load: 470s
Tested against multiple polymorphic fields that is for Event when it contains both WhoId and whatId. Where WhoId is linked with contact and lead, WhatId references to account and case. Org contains 45k records of Account, Contact, Case, Lead and Event jointly. Time for Extract: 48s. Time for Load: 275