NASA-PDS / registry-sweepers

Scripts that run regularly on the registry database, to clean and consolidate information
Apache License 2.0

Malformed product docs break ancestry sweeper #20

Status: Closed (alexdunnjpl closed 1 year ago)

alexdunnjpl commented 1 year ago

Checked for duplicates

No - I haven't checked

🐛 Describe the bug

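The whole ancestry sweeper needs to be checked and error handling added for malformed documents, such as product docs missing the product_lidvid field (see the KeyError in the log below).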

From CloudWatch (2023-05-31T11:58:24.132-07:00):

Traceback (most recent call last):
  File "/usr/local/bin/sweepers_driver.py", line 100, in <module>
    run_ancestry(cross_cluster_remotes=cross_cluster_remotes)
  File "/usr/local/lib/python3.10/site-packages/pds/registrysweepers/ancestry/__init__.py", line 47, in run
    nonaggregate_records = get_nonaggregate_ancestry_records(host, collection_records, registry_mock_query_f)
  File "/usr/local/lib/python3.10/site-packages/pds/registrysweepers/ancestry/generation.py", line 137, in get_nonaggregate_ancestry_records
    nonaggregate_lidvids = [PdsLidVid.from_string(s) for s in doc["_source"]["product_lidvid"]]
KeyError: 'product_lidvid'
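
One possible shape for the error handling, as a minimal sketch rather than the actual fix: validate each doc before reading product_lidvid, log and skip the malformed ones, and let the sweep continue. The helper name iter_product_lidvids is hypothetical and not part of registry-sweepers; the only thing taken from the traceback is the doc["_source"]["product_lidvid"] access. Parsing with PdsLidVid.from_string is left to the caller so the sketch stays self-contained.

```python
import logging
from typing import Any, Dict, Iterable, Iterator, List

log = logging.getLogger(__name__)


def iter_product_lidvids(docs: Iterable[Dict[str, Any]]) -> Iterator[List[str]]:
    """Yield the product_lidvid list of each well-formed doc.

    Hypothetical helper (not part of registry-sweepers). Docs that lack
    "_source" or "product_lidvid", or whose value is not a list, are
    logged and skipped instead of raising KeyError/TypeError.
    """
    for doc in docs:
        source = doc.get("_source") if isinstance(doc, dict) else None
        lidvids = source.get("product_lidvid") if isinstance(source, dict) else None

        if not isinstance(lidvids, list):
            # Malformed document: record which one, then carry on with the rest.
            doc_id = doc.get("_id", "<unknown>") if isinstance(doc, dict) else "<unknown>"
            log.warning("Skipping malformed doc %s: missing or invalid product_lidvid", doc_id)
            continue

        yield lidvids


if __name__ == "__main__":
    sample_docs = [
        {"_id": "good", "_source": {"product_lidvid": ["urn:nasa:pds:example::1.0"]}},
        {"_id": "no-field", "_source": {}},  # would have raised KeyError: 'product_lidvid'
        {"_id": "no-source"},
    ]
    for lidvids in iter_product_lidvids(sample_docs):
        print(lidvids)
```

Skipping with a warning keeps one bad document from aborting the whole run; whether malformed docs should instead be collected and reported at the end is a design choice this sketch does not settle.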

🕵️ Expected behavior

I expected [...]

📜 To Reproduce

1. 2. 3. ...

🖥 Environment Info

📚 Version of Software Used

No response

🩺 Test Data / Additional context

No response

🦄 Related requirements

🦄 #xyz

⚙️ Engineering Details

No response