Closed (forus closed 6 months ago)
Bulk Load Functionality with Overwrite Option: Evaluate whether the bulkLoad function needs to be disabled when the --overwrite-existing flag is on, since record removal occurs directly anyway.
Incremental Load Adjustments for Extra Mutation Data: Determine how incremental load should work when data such as gene panels, filtered mutations, namespaces, or Swiss-Prot information is included in mutation uploads.
Transaction Boundary Modifications: Reassess and possibly redefine transaction boundaries for incremental uploads to enhance efficiency and integrity. Note that these boundaries have remained unmodified in the current Proof of Concept (POC).
POC Expansion: Extend the POC to incorporate additional data types for a more comprehensive analysis:
What is done in this PR:
Enhancements to the DAO Layer: Introduced new methods in the Data Access Object (DAO) layer to enable:
Reversion of Method Deprecations: Reinstated previously deprecated methods for retrieving mutations.
Data Import Extension: --overwrite-existing flag for the ImportClinicalData and ImportProfileData scripts, allowing for the re-upload of entries if they already exist in the database.
Python scripts Extension
Testing Enhancements: Added integration tests to verify the incremental upload functionality for sample and mutation data.
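To make the intended semantics of the --overwrite-existing flag concrete, here is a minimal sketch in Python against an in-memory SQLite database. It is not the code from this PR: the `sample` table, the `import_samples` function, and the `overwrite_existing` parameter are hypothetical names chosen for illustration, and wrapping each upload in a single transaction is an assumption of the sketch rather than how the POC draws its transaction boundaries.

```python
import sqlite3


def import_samples(conn, rows, overwrite_existing=False):
    """Insert (sample_id, cancer_type) rows into a hypothetical `sample` table."""
    with conn:  # one transaction: commit on success, roll back on any exception
        for sample_id, cancer_type in rows:
            exists = conn.execute(
                "SELECT 1 FROM sample WHERE sample_id = ?", (sample_id,)
            ).fetchone() is not None
            if exists:
                if not overwrite_existing:
                    raise ValueError(f"sample {sample_id!r} already exists")
                # Remove the old entry directly, then fall through to re-insert it.
                conn.execute("DELETE FROM sample WHERE sample_id = ?", (sample_id,))
            conn.execute(
                "INSERT INTO sample (sample_id, cancer_type) VALUES (?, ?)",
                (sample_id, cancer_type),
            )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample (sample_id TEXT PRIMARY KEY, cancer_type TEXT)")
    import_samples(conn, [("S1", "brca"), ("S2", "luad")])
    # Second upload: S2 already exists and is replaced, S3 is new.
    import_samples(conn, [("S2", "lusc"), ("S3", "coad")], overwrite_existing=True)
    return conn.execute(
        "SELECT sample_id, cancer_type FROM sample ORDER BY sample_id"
    ).fetchall()
```

Without the flag, the second upload would fail on S2 and the whole batch would be rolled back, so a partially applied upload never becomes visible.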
Demo (you can download high-quality video here):
study_es_0_inc folder data description (green = new entries; yellow = updated entries; light blue = existing in db):
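The colour legend amounts to a three-way comparison between the incoming data and what the database already holds. The sketch below shows one way to express that comparison. It assumes "existing" means an entry that is already present with identical values, which is an interpretation of the legend and not something stated in this PR; `classify_entries` and the dictionary layout are hypothetical.

```python
def classify_entries(existing, incoming):
    """Label each incoming key as 'new', 'updated', or 'existing'.

    `existing` and `incoming` map an entry ID to its values.
    """
    labels = {}
    for key, value in incoming.items():
        if key not in existing:
            labels[key] = "new"        # green in the legend
        elif existing[key] != value:
            labels[key] = "updated"    # yellow in the legend
        else:
            labels[key] = "existing"   # light blue (assumed: unchanged)
    return labels
```

For example, with `existing = {"S1": "a", "S2": "b"}` and `incoming = {"S1": "a", "S2": "c", "S3": "d"}`, S1 is labelled existing, S2 updated, and S3 new.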