AllenNeuralDynamics / aind-data-asset-indexer

MIT License

Check if s3 prefix format matches expected regex #63

Closed helen-m-lin closed 3 months ago

helen-m-lin commented 4 months ago

User story

As a user, I want to see only metadata records from valid s3 prefixes according to a certain format, so that invalid folders are ignored.

Note that in the lambda function, invalid s3 prefixes are already ignored.

Acceptance criteria

A valid s3 prefix should be in format: {modality}_{id}_{acq_datetime}

Sprint Ready Checklist

Notes

We can check prefixes against the DATA regex from aind-data-schema:
DATA = f"^(?P<label>.+?)_(?P<c_date>{RegexParts.DATE.value})_(?P<c_time>{RegexParts.TIME.value})$"
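A rough sketch of what that check could look like, without importing aind-data-schema. The `DATE_PART` and `TIME_PART` patterns are assumed stand-ins for `RegexParts.DATE.value` and `RegexParts.TIME.value` (the real values should come from aind-data-schema), and `is_valid_prefix` is a hypothetical helper name, not something that exists in the indexer.

```python
import re

# ASSUMPTION: stand-ins for RegexParts.DATE.value / RegexParts.TIME.value.
# The real patterns live in aind-data-schema and may differ.
DATE_PART = r"\d{4}-\d{2}-\d{2}"
TIME_PART = r"\d{2}-\d{2}-\d{2}"

# Same shape as the DATA pattern quoted above.
DATA = re.compile(
    rf"^(?P<label>.+?)_(?P<c_date>{DATE_PART})_(?P<c_time>{TIME_PART})$"
)


def is_valid_prefix(prefix: str) -> bool:
    """Hypothetical helper: True if the s3 prefix matches the expected format."""
    # Ignore leading/trailing slashes so "name/" and "name" are treated alike.
    return DATA.match(prefix.strip("/")) is not None


# Example prefixes made up for illustration only.
prefixes = ["ecephys_123456_2023-01-02_03-04-05/", "scratch_folder/"]
valid = [p for p in prefixes if is_valid_prefix(p)]
```

Because `label` is a lazy `.+?`, it absorbs everything before the date, so `{modality}_{id}` ends up together in that one group rather than being validated separately.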

helen-m-lin commented 3 months ago

About 250 invalid prefixes in s3 (mostly in aind-ophys-data) are filtered out by the indexer, so they would not show up in docdb.