AllenNeuralDynamics / aind-data-asset-indexer

MIT License

Update S3 based on DocDB #40

Closed helen-m-lin closed 4 months ago

helen-m-lin commented 5 months ago

User story

This is part of the AIND Metadata Update POC. As a service admin, I want user updates to metadata in DocDB to also be reflected in S3, so I can ensure the data stays in sync.

The new AIND metadata update process will allow users to update metadata directly in DocDB. We can leverage the existing AIND Data Asset Indexer to make the appropriate downstream changes to S3.

Acceptance criteria

Sprint Ready Checklist

Notes

The new DocDB to S3 workflow should only run in Dev as part of the POC.

mekhlakapoor commented 4 months ago
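# Check how to retrieve changes from DocDB.
# Assumes `collection` (pymongo), `s3_client` (boto3), and `bucket_name` are already defined.
import json

# Open a change stream; 'updateLookup' includes the full updated document in each update event.
cursor = collection.watch(full_document='updateLookup')
for change in cursor:
    updated_record = change['fullDocument']
    s3_key = f'{updated_record["_id"]}.json'

    # Update corresponding data in S3
    s3_client.put_object(Bucket=bucket_name, Key=s3_key, Body=json.dumps(updated_record))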

As a check, update a record in MongoDB Compass and confirm that the change event from the cursor looks as expected. Feel free to change the code as needed; this is just a basic idea of how we can do this.
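
A slightly more defensive version of the same idea, as a rough sketch only. The function names `s3_key_for` and `sync_changes_to_s3` are hypothetical and are not taken from the indexer codebase. The `<_id>.json` key scheme is carried over from the snippet above and may not match the real bucket layout. The sketch skips change events that carry no `fullDocument` (delete events, for example) and uses `default=str` so that non-JSON-native values such as an ObjectId or a datetime do not make `json.dumps` raise.

```python
import json
from typing import Any, Iterable, Mapping


def s3_key_for(record: Mapping[str, Any]) -> str:
    """Build the S3 key from the record's _id (assumed scheme: '<_id>.json')."""
    return f"{record['_id']}.json"


def sync_changes_to_s3(
    changes: Iterable[Mapping[str, Any]], s3_client: Any, bucket_name: str
) -> int:
    """Write the full document of each change event to S3.

    `changes` is any iterable of change events, e.g. a pymongo change stream
    opened with full_document='updateLookup'. `s3_client` is anything with a
    boto3-style put_object(Bucket=..., Key=..., Body=...) method.
    Returns the number of objects written.
    """
    written = 0
    for change in changes:
        record = change.get("fullDocument")
        if record is None:
            # No document to write (e.g. a delete event); skip it.
            continue
        s3_client.put_object(
            Bucket=bucket_name,
            Key=s3_key_for(record),
            # default=str stringifies values json cannot encode natively.
            Body=json.dumps(record, default=str).encode("utf-8"),
        )
        written += 1
    return written
```

With real clients this would be called roughly as `sync_changes_to_s3(collection.watch(full_document='updateLookup'), s3_client, bucket_name)`. Taking the stream and client as arguments also makes it easy to test against a list of fake events before pointing it at Dev.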