Open · jordanpadams opened this issue 6 months ago
@alexdunnjpl @tloubrieu-jpl any idea why this query is not working? We have noticed that several attempts to run deep-archive have failed and produced incorrect data products. This has been blocked on several occasions before, and we thought we had fixed those issues, so we're not sure what happened.
@jordanpadams do you have an example product (full URL to the document preferred) that should be appearing in this query?
@alexdunnjpl here is an opensearch query with the associated data products: https://search-geo-prod-6iz6lwiw6luyffpsq52ndsrtbu.us-west-2.es.amazonaws.com/_dashboards/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15y,to:now))&_a=(columns:!(lid,'ops:Tracking_Meta%2Fops:archive_status'),filters:!(),index:'04de9280-9067-11ed-aa4d-b9457fec4322',interval:auto,query:(language:kuery,query:'lid:urn%5C:nasa%5C:pds%5C:msl_gt_diagenesis_supplement%5C:data*'),sort:!())
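Decoded, the KQL query embedded in that dashboard link is:

```
lid:urn\:nasa\:pds\:msl_gt_diagenesis_supplement\:data*
```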
| Time | ops:Tracking_Meta/ops:archive_status | _id |
|---|---|---|
| Mar 19, 2024 @ 07:51:22.348 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:veins::1.0 |
| Mar 19, 2024 @ 07:51:22.270 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:target_classification::1.0 |
| Mar 19, 2024 @ 07:51:22.170 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodule_rich_bedrock::1.0 |
| Mar 19, 2024 @ 07:51:22.070 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodules::1.0 |
| Mar 19, 2024 @ 07:51:21.985 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:local_rmsep_sigma20_win50_n20::1.0 |
| Mar 19, 2024 @ 07:51:21.947 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:dark_strata::1.0 |
| Mar 19, 2024 @ 07:51:21.862 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data:cements::1.0 |
| Mar 19, 2024 @ 07:51:19.577 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data::1.1 |
| Nov 2, 2022 @ 10:55:10.771 | archived | urn:nasa:pds:msl_gt_diagenesis_supplement:data::1.0 |
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:cements::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodule_rich_bedrock::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:nodules::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:dark_strata::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:local_rmsep_sigma20_win50_n20::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:target_classification::1.0
https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:veins::1.0
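For the record, a minimal sketch of reproducing the dashboard query programmatically, assuming direct OpenSearch access with basic auth and an index named `registry` (host, index name, and credentials below are placeholders; only the `lid` and `ops:Tracking_Meta/ops:archive_status` fields are taken from the results above):

```python
# Sketch only: run the dashboard's KQL query as an OpenSearch query_string search.
# Host, index name, and credentials are placeholders/assumptions.
import requests

OPENSEARCH_URL = "https://<geo-prod-opensearch-host>"  # placeholder host
INDEX = "registry"                                     # assumed index name

query = {
    "size": 50,
    "_source": ["lid", "ops:Tracking_Meta/ops:archive_status"],
    "query": {
        "query_string": {
            "query": r"lid:urn\:nasa\:pds\:msl_gt_diagenesis_supplement\:data*"
        }
    },
}

resp = requests.get(
    f"{OPENSEARCH_URL}/{INDEX}/_search",
    json=query,
    auth=("<user>", "<password>"),  # placeholder credentials
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    src = hit["_source"]
    print(hit["_id"], src.get("ops:Tracking_Meta/ops:archive_status"))
```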
@jordanpadams taking https://pds.nasa.gov/api/search/1/products/urn:nasa:pds:msl_gt_diagenesis_supplement:data:cements::1.0 as an example, there is no sweepers metadata present in the document.
Has sweepers been running on whichever OpenSearch node hosts the relevant product documents?
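A quick way to check a single product document for sweeper-written metadata via the public search API; note that the `ops:Provenance/` prefix used in the check below is an assumption about what the sweepers write, not something confirmed in this thread:

```python
# Sketch: fetch one registry document through the search API and list any
# properties that look like sweeper-written provenance/ancestry metadata.
# The "ops:Provenance/" prefix is an assumed naming convention.
import requests

LIDVID = "urn:nasa:pds:msl_gt_diagenesis_supplement:data:cements::1.0"
url = f"https://pds.nasa.gov/api/search/1/products/{LIDVID}"

resp = requests.get(url, headers={"Accept": "application/json"}, timeout=30)
resp.raise_for_status()
doc = resp.json()

props = doc.get("properties", doc)  # fall back to the raw document if no "properties" key
sweeper_fields = {k: v for k, v in props.items() if k.startswith("ops:Provenance/")}
print(sweeper_fields or "no sweeper metadata found on this document")
```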
@alexdunnjpl I have no idea...
Plan to run local sweepers against GEO. Currently blocked by GEO node getting hammered by MCP migration.
possibly due to ReadErrors being encountered
@sjoshi-jpl is there any record available of if/when the geo-prod sweepers jobs started failing?
Initial assumption about load was incorrect: there is a block of very large documents in GEO, resulting in some requests taking more than an order of magnitude longer than others, specifically in repairkit (which does not pull document subsets).
Currently resolving by dropping the repairkit page size to 500 and increasing the timeout to 180 s.
At 500 docs/page, the maximum observed request time was 1m42s.
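A sketch of the kind of mitigation described above (smaller page size plus a longer client timeout) using opensearch-py scroll paging; host, index, and credentials are placeholders, and this is not the actual repairkit code:

```python
# Sketch: page through documents 500 at a time with a 180 s client timeout,
# mirroring the mitigation described above. Not the real repairkit implementation.
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=["https://<geo-prod-opensearch-host>"],  # placeholder
    http_auth=("<user>", "<password>"),            # placeholder
    timeout=180,                                   # raised to tolerate very large documents
)

PAGE_SIZE = 500
resp = client.search(
    index="registry",                # assumed index name
    body={"query": {"match_all": {}}},
    size=PAGE_SIZE,
    scroll="5m",
)
scroll_id = resp["_scroll_id"]

while resp["hits"]["hits"]:
    for hit in resp["hits"]["hits"]:
        pass  # per-document processing (repairkit-style) would go here
    resp = client.scroll(scroll_id=scroll_id, scroll="5m")
    scroll_id = resp["_scroll_id"]

client.clear_scroll(scroll_id=scroll_id)
```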
Note to self - solved but not yet implemented, pending discussion with @tloubrieu-jpl
@alexdunnjpl @tloubrieu-jpl where are we at with this? It looks like this may be resolved?
@jordanpadams I need to loop back to it with @tloubrieu-jpl to decide on how we want to tweak the timeout parameters to resolve the issue.
I've run the sweepers locally to resolve GEO's missing ancestry metadata, and there's a good chance the sweepers are now running against GEO (the massive docs have been dealt with and are no longer fetched by the ancestry sweeper), but the root cause remains outstanding.
If it's important to close this out let me know - should be a quick thing, I've just been laser-focused on the migration and have been ignoring everything else.
@alexdunnjpl 👍 all good. just checking.
This is going to be worked on after the migration to MCP is completed.
@alexdunnjpl @sjoshi-jpl we said we would work on this after the migration to MCP. Where are we with the sweepers running on the nodes? Thanks.
@tloubrieu-jpl this is probably a perfunctory close at this point once it's able to be retested - I'll defer to Sagar on status but it'll become clear once the sweeper is running on GEO.
Checked for duplicates
No - I haven't checked
🐛 Describe the bug
When I attempt a `members` query on a collection that should work, it does not.
🕵️ Expected behavior
I expected it to work.
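For reference, a members request of the kind described above would look roughly like this against the search API (the `/members` path suffix is an assumption about the collection-membership endpoint; the collection LIDVID is taken from the results earlier in this thread):

```python
# Sketch: a collection members query against the public search API.
# The "/members" suffix is an assumed endpoint path, not confirmed in this issue.
import requests

COLLECTION = "urn:nasa:pds:msl_gt_diagenesis_supplement:data::1.1"
url = f"https://pds.nasa.gov/api/search/1/products/{COLLECTION}/members"

resp = requests.get(url, headers={"Accept": "application/json"}, timeout=30)
resp.raise_for_status()
print(resp.json())
```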
📜 To Reproduce
🖥 Environment Info
Chrome / macOS
📚 Version of Software Used
Latest deployed
🩺 Test Data / Additional context
No response
🦄 Related requirements
This is blocking:
⚙️ Engineering Details
No response
🎉 Integration & Test
No response