isamplesorg / isamples_inabox

Provides functionality intermediate to a collection and central

Copy pre-calculated vocabulary terms for SESAR #367

Open datadavev opened 2 months ago

datadavev commented 2 months ago

The SESAR iSB instance deployed on henry has the computed values for the controlled vocabularies.

The task here is to update the records on the SESAR iSB instance operated by SESAR (Columbia) with the pre-calculated values, avoiding the very resource-intensive re-calculation of those terms.

The basic process is:

  1. export the SESAR records from iSamples Central
  2. iterate through the exported records and, for each one, update the corresponding record on the Columbia SESAR iSB with the exported controlled vocabulary values

Afterwards, the Columbia SESAR iSB can use the model server API exposed by iSamples Central to get vocabulary terms for new or updated content being added to SESAR (throughput will need to be evaluated - it may be too slow).

dannymandel commented 2 months ago

I think the export step really should just be:

curl  "https://central.isample.xyz/isamples_central/export/create?q=source:SESAR&export_format=jsonl"

And then follow the rest of the documentation at https://github.com/isamplesorg/isamples_inabox/blob/develop/docs/export_service.md

dannymandel commented 2 months ago

Then you'll parse out the result fields for

"has_specimen_category", "has_material_category", "has_context_category"

and hook them back up to your existing records.
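
For illustration, here is a rough Python sketch of that parse-and-reattach step. It assumes the downloaded export is one JSON object per line, that each object carries its identifier in a `sample_identifier` field, and that the existing records are available as a dict keyed by that identifier. None of that is confirmed in this thread, and both function names are hypothetical, so adjust to whatever the export actually contains and to how the Columbia iSB stores its records.

```python
import json

# The three controlled vocabulary fields named above.
VOCAB_FIELDS = (
    "has_specimen_category",
    "has_material_category",
    "has_context_category",
)


def extract_vocab_terms(lines, id_field="sample_identifier"):
    """Map record identifier -> the vocabulary fields present in that record.

    `lines` is any iterable of JSONL strings (e.g. an open file).
    `id_field` is an assumption about where the identifier lives.
    """
    terms = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        record_id = record.get(id_field)
        if record_id is None:
            continue  # nothing to match this record against
        terms[record_id] = {f: record[f] for f in VOCAB_FIELDS if f in record}
    return terms


def apply_vocab_terms(existing, terms):
    """Copy the exported values onto matching existing records.

    `existing` is a stand-in for the local store: a dict of
    identifier -> record dict. Returns how many records were updated.
    """
    updated = 0
    for record_id, values in terms.items():
        target = existing.get(record_id)
        if target is None or not values:
            continue  # no local record, or no vocabulary values to copy
        target.update(values)
        updated += 1
    return updated
```

Usage would be along the lines of `terms = extract_vocab_terms(open("sesar_export.jsonl"))` followed by `apply_vocab_terms(local_records, terms)`, where the file name and `local_records` are placeholders. In a real run the `target.update(values)` call would be replaced by whatever write the Columbia iSB database needs.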