DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0

Support for HCA duos_id #6196

Open dsotirho-ucsc opened 5 months ago

dsotirho-ucsc commented 5 months ago

Factored out of #6131 because the current metadata does not yet contain this field.

Index project.duos_id and expose it in all /index/… responses.

achave11-ucsc commented 5 months ago

Assignee to periodically check for availability of duos_id in snapshots for the pilot catalog, or the dcpXY catalogs once managed access snapshots are available in prod.

bvizzier-ucsc commented 1 week ago

@dsotirho-ucsc When was the last time this was checked?

Let me know if it isn't there and I'll escalate it.

dsotirho-ucsc commented 1 week ago

@bvizzier-ucsc This was last checked in dcp42, but I somehow missed the managed access snapshot. I just re-checked and confirmed that the duos_id field is present in the snapshot for project 35d5b057.

achave11-ucsc commented 1 week ago

@hannes-ucsc: "Spike to confirm with CC that it is sufficient to return this property in the hits part of the response (no filtering, no sorting, no faceting)"

dsotirho-ucsc commented 1 week ago

Received confirmation from CC that adding the duos_id field to the hits part of the response would be sufficient.

nadove-ucsc commented 1 week ago

@hannes-ucsc: "At least for MA snapshots, the HCA metadata project entity has a DUOS Id property that we will use. For non-MA projects the DUOS Id may be absent, but getting it from TDR as we do for AnVIL would be much more complicated, as it would require a special bundle type, for which we have no support in HCA."
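For illustration only, the behavior discussed above (return duos_id in the hits part of the response, with no filtering, sorting or faceting, and tolerate its absence for non-MA projects) could be sketched roughly as follows. This is not Azul's actual code: the function names, the flat dict shape of the project entity, and the `project_id` key are all hypothetical. Only the `duos_id` field name comes from this issue.

```python
from typing import Any, Dict, Mapping, Optional


def extract_duos_id(project: Mapping[str, Any]) -> Optional[str]:
    """
    Return the DUOS ID of a project metadata entity, or None if absent.

    Assumption: the entity is a flat mapping with an optional `duos_id`
    key. The real HCA metadata layout may differ.
    """
    duos_id = project.get('duos_id')
    # Treat an empty string the same as a missing property
    return duos_id if duos_id else None


def project_hit(project: Mapping[str, Any]) -> Dict[str, Any]:
    """
    Build a hypothetical project portion of a single hit.

    The DUOS ID is only copied into the hit. Nothing here registers it
    as a facet, filter or sort key.
    """
    return {
        'project_id': project['project_id'],  # hypothetical key name
        'duos_id': extract_duos_id(project),  # None for non-MA projects
    }
```

With this shape, a project from a managed access snapshot would carry its DUOS ID in the hit, while a project without the property would yield `None` instead of raising an error.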