ACED-IDP / gen3_util

Collection of command line tools to interact with a Gen3 instance
MIT License

Support Bundle Submission DELETE commands on client/server for metadata #83

Closed matthewpeterkort closed 6 days ago

matthewpeterkort commented 5 months ago

The following is a design write-up for client and server features to be added to our ETL stack to support RESTful, Bundle-style submission of DELETE commands.

G3t // Client

rm command: when an rm command is executed, create a bundle / job that also deletes the document reference associated with the file in the bucket, along with the indexd record, so that rm doesn't leave orphaned metadata.

reset command: add a reset command that gathers all ids from the metadata in the META/ directory and writes them out as DELETE entries in bundle format.
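A minimal sketch of what the reset step could look like, not the g3t implementation. It assumes META/ holds FHIR NDJSON files (one resource per line, each with `resourceType` and `id`) and that "bundle format" means a FHIR Bundle whose entries carry `request.method` and `request.url`. The function name `build_delete_bundle` is hypothetical.

```python
import json
import pathlib


def build_delete_bundle(meta_dir):
    """Return a Bundle dict with one DELETE entry per resource found under meta_dir.

    Assumption: every *.ndjson file in meta_dir has one FHIR resource per line.
    """
    entries = []
    for path in sorted(pathlib.Path(meta_dir).glob("*.ndjson")):
        with path.open() as fp:
            for line in fp:
                if not line.strip():
                    continue  # tolerate blank lines
                resource = json.loads(line)
                entries.append({
                    "request": {
                        "method": "DELETE",
                        # FHIR-style relative url: ResourceType/id
                        "url": f"{resource['resourceType']}/{resource['id']}",
                    }
                })
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}
```

Calling `build_delete_bundle("META")` would return a dict that can be serialized with `json.dumps` and written out as the bundle to push.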

In addition, some older code that deletes indexd records and bucket files for the project should also be implemented so that the project is fully cleaned from the Gen3 instance. This code has already been written in previous g3t versions.

"Are you sure?" prompt: whenever a user has generated a bundle and is about to push it to the server, add an extra confirmation prompt so that they don't accidentally delete data from the server that they didn't intend to delete.
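One way the confirmation could work, sketched under the assumption that the bundle is a dict with FHIR-style `entry[].request.method` fields. `confirm_delete` and its `ask` parameter are hypothetical names; `ask` defaults to the built-in `input` and is injectable so the check can be tested without a terminal.

```python
def confirm_delete(bundle, ask=input):
    """Return True only if the user explicitly agrees to push the DELETE entries."""
    count = sum(
        1
        for entry in bundle.get("entry", [])
        if entry.get("request", {}).get("method") == "DELETE"
    )
    if count == 0:
        return True  # nothing destructive, no prompt needed
    answer = ask(
        f"This bundle will delete {count} record(s) from the server. "
        "Are you sure? [y/N] "
    )
    # Anything other than an explicit yes is treated as "no".
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" on empty input means an accidental Enter does not trigger a delete.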

etl_pod // Submission

bulk_delete_record_by_id: a bulk record-deleting function, for both Postgres peregrine and Elasticsearch, that accepts a project_id and a list of ids and deletes every record in the list that also belongs to that project.
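For the Elasticsearch half, the "in the list and in the project" constraint could be expressed as a `bool` filter combining an `ids` query with a `term` query, suitable as the body of a delete-by-query request. This is only a sketch: the `project_id` field name in the index, the helper names, and the batch size are assumptions, and the Postgres side is not shown.

```python
def chunked(items, size):
    """Yield successive slices of items with at most size elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_by_query_bodies(project_id, ids, batch_size=1000):
    """Build one delete-by-query body per batch of ids.

    A document matches only if its _id is in the batch AND its
    project_id field (assumed name) equals the given project.
    """
    unique_ids = list(dict.fromkeys(ids))  # de-duplicate, keep order
    return [
        {
            "query": {
                "bool": {
                    "filter": [
                        {"ids": {"values": batch}},
                        {"term": {"project_id": project_id}},
                    ]
                }
            }
        }
        for batch in chunked(unique_ids, batch_size)
    ]
```

Filtering on the project inside the query, rather than trusting the id list, keeps a bundle from one project from deleting records that belong to another.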

parse_bundle: function logic that, for each commit, checks each "bundle.ndjson" and parses the DELETE ids out into a list that can be passed to the bulk delete submission functions.

NOTE: later data loading functions search for the .ndjson extension, so the file should be either renamed or removed from the working directory on the server after the bundle logic / functions have run. Otherwise, reading files from that same directory to add data to the DBs will produce errors, since the bundle is not the regular FHIR NDJSON that is expected.
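The parsing and renaming steps could be combined roughly as follows. This sketch assumes each line of bundle.ndjson is a FHIR-style Bundle whose DELETE entries have a `request.url` of the form `ResourceType/id`; the `.processed` suffix is an arbitrary choice made here, not something the design specifies.

```python
import json
import pathlib


def parse_bundle(bundle_path):
    """Extract DELETE ids from a bundle file, then rename it out of the way.

    Returns (delete_ids, new_path). After the rename the file no longer ends
    in .ndjson, so later loaders that glob for *.ndjson will skip it.
    """
    bundle_path = pathlib.Path(bundle_path)
    delete_ids = []
    with bundle_path.open() as fp:
        for line in fp:
            if not line.strip():
                continue  # tolerate blank lines
            bundle = json.loads(line)
            for entry in bundle.get("entry", []):
                request = entry.get("request", {})
                if request.get("method") == "DELETE":
                    # "Patient/abc" -> "abc"
                    delete_ids.append(request["url"].rsplit("/", 1)[-1])
    processed = bundle_path.with_name(bundle_path.name + ".processed")
    bundle_path.rename(processed)
    return delete_ids, processed
```

The returned id list is what would be handed, together with the project_id, to the bulk delete functions.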

matthewpeterkort commented 6 days ago

This has been implemented here: https://github.com/ACED-IDP/gen3_util/pull/94