databio / bbconf

Configuration package for bedbase project
https://pypi.org/project/bbconf/
BSD 2-Clause "Simplified" License

Add delete method (bed and bedset) #37

Closed khoroshevskyi closed 2 months ago

khoroshevskyi commented 6 months ago

At present, there is no functionality to delete a bed file together with all of its related files on S3. This is needed for both bed and bedset.

nsheff commented 3 months ago

More details on how to do this:

https://github.com/databio/bedboss/issues/26#issuecomment-1948959955

I propose no new package/object, but do this:

Use boto3 to upload from within bedboss. Then, once the upload has completed successfully, insert metadata about it with report using pipestat, so the database knows which files were uploaded and where they are (or maybe it's just True if the upload was successful). I could see this being a JSON blob with information about all files that were transferred to s3/b2.
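
A minimal sketch of that upload-then-report flow, under stated assumptions: `upload_and_report`, the `s3_files` key, and the shape of the reported blob are hypothetical names chosen for illustration, not part of bedboss or bbconf. `s3_client` is expected to behave like a boto3 S3 client (only `upload_file` is used), and `report` is any callable that would wrap the pipestat report call.

```python
from typing import Callable, Dict, List


def upload_and_report(
    s3_client,
    report: Callable[[str, dict], None],
    bucket: str,
    record_id: str,
    files: Dict[str, str],
) -> List[dict]:
    """Upload local files to S3, then record what was uploaded.

    `files` maps local path -> S3 key. All names here are illustrative.
    """
    uploaded = []
    for local_path, key in files.items():
        # A boto3 S3 client raises on failure, so reaching the next
        # line means this file was transferred.
        s3_client.upload_file(local_path, bucket, key)
        uploaded.append({"bucket": bucket, "key": key})

    # Report only after every upload succeeded, so the database never
    # lists files that are not actually on S3.
    report(record_id, {"s3_files": uploaded})
    return uploaded
```

With real services, `s3_client` would be `boto3.client("s3")` and `report` a small wrapper around the pipestat manager's `report` method; the exact keyword arguments of that method depend on the installed pipestat version and should be checked there.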

Then, write a function or class that can remove an entry from the database. It would query the info from the main database, then

* if the entry is present in qdrant, remove it from there (and update the database), then

* if s3 files are present, remove them and update the database

* remove the entry from the pephub allbeds (bedboss output pep).

* finally, if those succeeded, remove the entry from the bedbase pipestat database.
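
The ordering of the steps above could be sketched roughly as follows. This is only an illustration: `delete_bed_record`, the callables it receives, and the `in_qdrant` / `s3_files` record fields are all hypothetical stand-ins, not the bbconf API. The real qdrant, S3, pephub, and pipestat calls would live behind these callables.

```python
from typing import Callable, List, Optional


def delete_bed_record(
    record_id: str,
    fetch_record: Callable[[str], Optional[dict]],
    update_record: Callable[[str, dict], None],
    remove_from_qdrant: Callable[[str], None],
    remove_s3_files: Callable[[List[dict]], None],
    remove_from_pep: Callable[[str], None],
    remove_record: Callable[[str], None],
) -> bool:
    """Remove a record from every backend, main database last.

    Returns False if the record does not exist. Any exception raised by
    a step propagates, so later steps (including the final database
    removal) are skipped when an earlier one fails.
    """
    record = fetch_record(record_id)
    if record is None:
        return False

    # 1. Vector store: remove the entry, then note that in the database.
    if record.get("in_qdrant"):
        remove_from_qdrant(record_id)
        update_record(record_id, {"in_qdrant": False})

    # 2. Object storage: remove the files, then clear the stored list.
    if record.get("s3_files"):
        remove_s3_files(record["s3_files"])
        update_record(record_id, {"s3_files": []})

    # 3. Remove the entry from the PEP that lists all bed files.
    remove_from_pep(record_id)

    # 4. Only now drop the row from the main database.
    remove_record(record_id)
    return True
```

Updating the database after each backend step means a failed run leaves a record that still describes what remains to be cleaned up, so the deletion can be retried.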

khoroshevskyi commented 2 months ago

Fixed in 0.5.0.