NASA-IMPACT / veda-data-pipelines

data transformation - ingestion - publication pipelines to support VEDA

nightlights-hd-3bands item indexing question #136

Closed anayeaye closed 2 years ago

anayeaye commented 2 years ago

What

There seems to be an indexing problem for the nightlights-hd-3bands items. I noticed that the collection summaries object is empty, and when I tried to compute it manually I found that I couldn't select any items for this collection id in postgres. This is super strange because the simple get items search does work: https://staging-stac.delta-backend.xyz/collections/nightlights-hd-3bands/items.

Notes

I can find the collection by id, but I can't find its items by collection_id or by the collection property in "content" (which is not the intended way to search, but I am having trouble selecting the 3-band nightlights items in postgres).

select * from collections where id = 'nightlights-hd-3bands'; -- 1 collection
select * from items where collection_id = 'nightlights-hd-3bands'; -- no results
select * from items where "content"->>'collection' = 'nightlights-hd-3bands'; -- no results

AC

Items indexed and update_default_summaries(nightlights-hd-3bands) produces a collection summary.

anayeaye commented 2 years ago

@abarciauskas-bgse @slesaad @xhagrg It turns out this is happening for multiple collections. I did a quick comparison in the staging database of all collection ids against the distinct list of collection_ids in the items table (here's a spreadsheet). I think there might be a couple of collection records we want to delete, but I don't know why the others are not represented in the items table. Can you all have a look and let me know what you think?

Two things seem possible

Collection ids not in the items table

blue-tarp-planetscope
grdi-v2-raster
HLSS30
IS2SITMOGR4
nightlights-hd-3bands
OMI_trno2
OMSO2PCA
OMSO2PCA_COG
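
For anyone repeating the comparison described above, here is a minimal sketch of one way to list collection ids that have no rows in the items table, using an anti-join. It is not the query that was actually run against staging. It assumes simplified `collections(id)` and `items(id, collection_id)` tables, uses an in-memory SQLite database as a stand-in for postgres, and the sample rows (including `example-indexed`) are made up for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the staging postgres database (assumption);
# the real tables have more columns than this simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE collections (id TEXT PRIMARY KEY);
    CREATE TABLE items (id TEXT, collection_id TEXT);
    -- Made-up sample rows, for illustration only.
    INSERT INTO collections VALUES
        ('nightlights-hd-3bands'), ('HLSS30'), ('example-indexed');
    INSERT INTO items VALUES
        ('item-1', 'example-indexed'), ('item-2', 'example-indexed');
    """
)

# Anti-join: keep only collections with no matching row in items.
MISSING_SQL = """
    SELECT c.id
    FROM collections AS c
    LEFT JOIN items AS i ON i.collection_id = c.id
    WHERE i.collection_id IS NULL
    ORDER BY c.id
"""

missing = [row[0] for row in conn.execute(MISSING_SQL)]
print(missing)
```

The same `LEFT JOIN ... WHERE ... IS NULL` statement should carry over to postgres largely unchanged, provided the table and column names match the deployed schema.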

anayeaye commented 2 years ago

I was able to run the summary function for all collections this AM. Issue resolved. Thanks!