Legacy Registry Software components leveraging Apache Solr. Includes Legacy Harvest Tool, Registry Manager, PDS3 Catalog Tool, and Search Core library. These components provide the capabilities for loading PDS3 and PDS4 data into the Legacy Solr Registry, driving the PDS keyword search.
Apache License 2.0
As a user, I want to sync ESA PSA products from the Search API #135
...so that I can have the ESA PSA products available through the Solr search
Additional Details
I think the easiest route is a separate Python script that downloads the XML to a local archive, followed by a harvest run to load it into the legacy registry.
Acceptance Criteria
Given a Registry Search API loaded with ESA PSA context products
When I perform pds-sync-api --node-name psa --download-path /path/to/download/XML
Then I expect a Python script to download the XML files to --download-path (if they do not already exist)
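A minimal sketch of how the command line in the acceptance criteria could be parsed, assuming the script is written with Python's standard argparse module. Only the program name and the two flags come from this ticket; the function name build_parser and the help strings are illustrative assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags named in the acceptance criteria."""
    parser = argparse.ArgumentParser(
        prog="pds-sync-api",
        description="Download product labels from the Search API for a later harvest run.",
    )
    # Node whose products should be synced, e.g. "psa".
    parser.add_argument("--node-name", required=True, help="node to sync, e.g. psa")
    # Local archive directory; argparse converts the string to a Path.
    parser.add_argument(
        "--download-path",
        required=True,
        type=Path,
        help="directory the XML labels are downloaded into",
    )
    return parser


# Example: parse the invocation from the acceptance criteria.
args = build_parser().parse_args(
    ["--node-name", "psa", "--download-path", "/path/to/download/XML"]
)
print(args.node_name, args.download_path)
```

Making both flags required keeps the script from silently syncing the wrong node or writing into the current directory.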
Engineering Details
query Search API for all PSA context products, bundles, collections
paginate through the results
check if the LIDVID has already been loaded into the Registry or not
if not, check if the XML is already in --download-path (using file name and ops:Label_File_Info.ops:md5_checksum)
if the file does not exist, download to --download-path
execute harvest on those XML files
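The steps above can be sketched roughly as follows. This is only an outline under stated assumptions: the Search API query, the Registry lookup, and the HTTP download are passed in as callables (fetch_page, already_loaded, download) rather than implemented, and the product dictionary keys (lidvid, file_name, md5, url) are hypothetical stand-ins for whatever fields the real API response carries, such as ops:Label_File_Info.ops:md5_checksum.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, Iterator, List


def paginate(fetch_page: Callable[[int, int], list], page_size: int = 100) -> Iterator[dict]:
    """Yield every product by requesting pages until a short or empty page arrives."""
    start = 0
    while True:
        page = fetch_page(start, page_size)  # hypothetical: wraps the Search API query
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return
        start += len(page)


def md5_of(path: Path) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(target: Path, expected_md5: str) -> bool:
    """True unless the file already exists locally with the expected checksum."""
    return not (target.is_file() and md5_of(target) == expected_md5.lower())


def sync(
    products: Iterable[dict],
    already_loaded: Callable[[str], bool],
    download: Callable[[str, Path], None],
    download_path: Path,
) -> List[Path]:
    """Return the label files that still need to be harvested, downloading as needed."""
    to_harvest = []
    for product in products:
        # Skip products whose LIDVID is already in the Registry.
        if already_loaded(product["lidvid"]):
            continue
        target = download_path / product["file_name"]
        # Only download when the local copy is missing or its checksum differs.
        if needs_download(target, product["md5"]):
            download(product["url"], target)
        to_harvest.append(target)
    return to_harvest
```

The list returned by sync is what a follow-up step would hand to harvest. Keeping the network and Registry calls behind callables also makes the skip logic easy to test without a live API.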
This will be a two-part ticket since we will then need a bash script to be added to this repo to actually execute harvest on the downloaded data.
@nutjob4life not sure of the best place to put this script. It can either go somewhere here (with a requirements.txt), in our operations repo (which contains a bunch of ad-hoc scripts), or ?
Checked for duplicates
Yes - I've already checked
User Persona(s)
Data User
I&T
No response