Closed (dyf closed this 3 weeks ago)
Index only these buckets:
Dev account bucket: codeocean-s3datasetsbucket-eg0euwi4ez6z
Prod account bucket: codeocean-s3datasetsbucket-1u41qdg42ur9
Have a job that runs on a schedule (let's start with every two hours) that scans the bucket, pulls information from the Code Ocean index using aind-codeocean-api and the service-account-token, builds a metadata record, and pushes that record to the DocDb index. It should check the DocDb index first and skip any assets that have already been processed.
We can run this in a separate ECS container if it's easy.
The indexer needs to include assets that are processed results in the Code Ocean datasets bucket. This is needed so that science teams can analyze data as soon as it is processed, regardless of whether we are capturing it to an external bucket.
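The scan, filter, build, and push steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: `AssetRecord`, `index_new_assets`, and the four injected callables are hypothetical names, and the real calls to S3, aind-codeocean-api, and DocDb would be wired in behind them. The sketch also assumes each asset sits under a top-level prefix named after its asset ID, which should be verified against the real bucket layout. Scheduling (every two hours) is assumed to be handled outside this function, for example by whatever triggers the ECS task.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass(frozen=True)
class AssetRecord:
    """Hypothetical minimal metadata record; the real DocDb schema may differ."""
    asset_id: str
    name: str
    location: str


def index_new_assets(
    bucket: str,
    list_asset_ids: Callable[[str], Iterable[str]],     # scans the bucket
    fetch_indexed_ids: Callable[[], Set[str]],          # IDs already in DocDb
    fetch_codeocean_info: Callable[[str], Dict],        # Code Ocean lookup
    push_record: Callable[[AssetRecord], None],         # writes to DocDb
) -> List[AssetRecord]:
    """Index every asset in the bucket that DocDb does not already contain."""
    # Check the DocDb index first so processed assets are skipped.
    already_indexed = fetch_indexed_ids()
    pushed: List[AssetRecord] = []
    for asset_id in list_asset_ids(bucket):
        if asset_id in already_indexed:
            continue
        info = fetch_codeocean_info(asset_id)
        record = AssetRecord(
            asset_id=asset_id,
            name=info.get("name", ""),
            # Assumption: one top-level prefix per asset ID.
            location=f"s3://{bucket}/{asset_id}",
        )
        push_record(record)
        pushed.append(record)
    return pushed
```

Passing the external services in as callables keeps the filtering logic testable without credentials. No filter on asset type is applied here, so processed results in the bucket would be indexed alongside raw assets.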
Acceptance Criteria