AllenNeuralDynamics / aind-data-asset-indexer

MIT License
Nonexistent assets appearing in docdb #31

Closed: dyf closed this issue 4 months ago

dyf commented 5 months ago

Describe the bug

The SmartSPIM report shows the following records: [image]

These were for runs that have since been deleted from S3, and as far as I can tell they also do not exist in Code Ocean.

To Reproduce

Steps to reproduce the behavior:

  1. Go to the SmartSPIM dashboard
  2. Filter to subject_id 695464.
  3. See all the zombie assets.

Expected behavior

Only real data should appear in docdb :).
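
For illustration only, the check implied by this report can be sketched as a small filter: given the records held in docdb and some way of asking whether a record's storage location still exists, return the records that point at nothing. The thread does not show how the indexer actually does this, so everything here is an assumption: the `location` field name, the `find_stale_records` helper, and the in-memory set standing in for a real S3 lookup are all hypothetical.

```python
from typing import Callable, Iterable


def find_stale_records(
    records: Iterable[dict],
    location_exists: Callable[[str], bool],
) -> list[dict]:
    """Return the records whose storage location no longer exists.

    `location_exists` is injected so the same logic can run against a
    real object store or, as below, a plain set of known locations.
    """
    return [r for r in records if not location_exists(r["location"])]


# Hypothetical sample data: two index records, only one of which
# still has data behind it.
records = [
    {"name": "asset_a", "location": "s3://example-bucket/asset_a"},
    {"name": "asset_b", "location": "s3://example-bucket/asset_b"},
]
existing_locations = {"s3://example-bucket/asset_a"}

stale = find_stale_records(records, existing_locations.__contains__)
stale_names = [r["name"] for r in stale]
```

With the sample data above, `stale_names` contains only `asset_b`, the record whose location is missing from the set of existing locations.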

jtyoung84 commented 4 months ago

This should be fixed. I ran that query and it returned the following results:

SmartSPIM_695464_2023-10-18_20-30-30_stitched_2023-11-01_00-47-53 20ea225b-00d8-4dd1-bee2-9aa7d148d9c2
SmartSPIM_695464_2023-10-18_20-30-30 450b3234-f48e-4c5d-9c56-981a62d9b191
SmartSPIM_695464_2023-10-18_20-30-30_test_dataset 1c9dcb3b-f83d-4e19-b94d-8db543183f9f
SmartSPIM_695464_2023-10-18_20-30-30_stitched_2024-02-02_09-34-53 defb5ced-66d7-41be-8fa0-623654dc0f3e
SmartSPIM_695464_2023-10-18_20-30-30_stitched_2024-01-10_12-49-29 e3d7e89f-b1f6-43c3-bb69-cb8bcdde3e00

The test_dataset is the only one that didn't show up via the Code Ocean UI, but it is returned via the API as:

 {'created': 1703876825,
  'description': '',
  'files': 25096,
  'id': '1c9dcb3b-f83d-4e19-b94d-8db543183f9f',
  'last_used': 0,
  'name': 'SmartSPIM_695464_2023-10-18_20-30-30_test_dataset',
  'size': 35504676572,
  'sourceBucket': {'bucket': 'aind-msma-morphology-data',
                   'origin': 'aws',
                   'prefix': 'test_data/SmartSPIM/SmartSPIM_695464_2023-10-18_20-30-30/'},
  'state': 'ready',
  'tags': ['smartspim', 'test', 'raw'],
  'type': 'dataset'}
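
To make a record like this easier to read, here is a minimal sketch that pulls out the fields relevant to this issue. It rests on assumptions the thread does not confirm: that `created` is a Unix timestamp in seconds, that `size` is in bytes, and that an `origin` of `aws` means the `bucket` and `prefix` describe an S3 location. `summarize_asset` is a hypothetical helper, not part of the Code Ocean API or the indexer.

```python
from datetime import datetime, timezone


def summarize_asset(asset: dict) -> dict:
    """Condense a data asset record into a few human-readable fields."""
    source = asset.get("sourceBucket") or {}
    source_uri = None
    # Assumption: origin 'aws' means bucket/prefix point at S3.
    if source.get("origin") == "aws" and source.get("bucket"):
        source_uri = f"s3://{source['bucket']}/{source.get('prefix', '')}"
    # Assumption: 'created' is seconds since the Unix epoch.
    created = datetime.fromtimestamp(asset["created"], tz=timezone.utc)
    return {
        "id": asset["id"],
        "name": asset["name"],
        "state": asset.get("state"),
        "created_utc": created.isoformat(),
        # Assumption: 'size' is a byte count; report decimal gigabytes.
        "size_gb": round(asset["size"] / 1e9, 1),
        "source_uri": source_uri,
    }


# Subset of the record quoted above.
asset = {
    "created": 1703876825,
    "id": "1c9dcb3b-f83d-4e19-b94d-8db543183f9f",
    "name": "SmartSPIM_695464_2023-10-18_20-30-30_test_dataset",
    "size": 35504676572,
    "sourceBucket": {
        "bucket": "aind-msma-morphology-data",
        "origin": "aws",
        "prefix": "test_data/SmartSPIM/SmartSPIM_695464_2023-10-18_20-30-30/",
    },
    "state": "ready",
}

summary = summarize_asset(asset)
```

Under those assumptions, the summary shows that the test dataset was registered in late December 2023 and points at a `test_data/` prefix rather than at the original raw upload, which may be why it does not appear alongside the other assets in the UI.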