Closed tloubrieu-jpl closed 9 months ago
@alexdunnjpl @al-niessner @sjoshi-jpl, I implemented the code to copy from the legacy registry on SOLR into a new index, legacy_registry, on OpenSearch.
I would like that script to run only on the EN_PROD domain. Do you think it is OK to add an if statement in the sweepers_driver here: https://github.com/NASA-PDS/registry-sweepers/blob/1a92b530c9e7a29d0b79e7afbfbc559dae4f3d0c/docker/sweepers_driver.py#L110 or do you have another idea?
Thanks
@tloubrieu-jpl if this is a temporary/non-long-term task, I'd recommend running it separately (i.e. separate task and schedule) rather than as part of registry-sweepers. Is this even something that needs to be run periodically (and is it coded in such a way that it won't burn a bunch of compute time unnecessarily)?
If it should be part of sweepers, I'd suggest implementing an argparse option --enable-legacy-solr-import-sweeper or similar, defaulting to False, which is used as the condition for an if block running that sweeper in the driver. That way @sjoshi-jpl can just add that option to the invoked docker command for the en-prod task definition.
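For illustration, a minimal sketch of that opt-in flag, assuming the driver uses argparse. The sweeper names and the select_sweepers helper are hypothetical placeholders, not the actual contents of sweepers_driver.py:

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of a sweepers driver CLI")
    # store_true makes the flag default to False unless it is passed explicitly
    parser.add_argument(
        "--enable-legacy-solr-import-sweeper",
        action="store_true",
        default=False,
        help="Also run the legacy SOLR import sweeper (intended for en-prod only)",
    )
    return parser


def select_sweepers(argv: List[str]) -> List[str]:
    """Return the names of the sweepers that would run for the given CLI args."""
    args = build_parser().parse_args(argv)
    # Placeholder names standing in for the sweepers the driver always runs
    sweepers = ["provenance", "ancestry"]
    # argparse converts the dashes in the flag name to underscores
    if args.enable_legacy_solr_import_sweeper:
        sweepers.append("legacy-solr-import")
    return sweepers
```

With no arguments only the default sweepers are selected; passing --enable-legacy-solr-import-sweeper adds the legacy import, so only the en-prod task definition needs to change.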
Thanks @alexdunnjpl, I guess it is a temporary task that might last for one or two years.
It should run periodically, though maybe not as often as the other registry-sweeper tasks.
I will add the option as you suggested.
@tloubrieu-jpl in this context, a couple of years is plenty to consider it non-temporary.
Given what you've said about it not running as often as other sweeper tasks, I'd suggest instead creating a second driver script (copy the existing one and give it a more-specific name) which is just for running your solr legacy script.
There is no way to decouple the cadences while using the same driver script.
You would need to double-check the Dockerfile to ensure that the script is copied into the image (probably have it copy all *.py in that directory, for future-proofing).
If we need extra configurability and/or prefer use of a single driver script, we could have all individual sweepers be opt-in via CLI flags, though existing task definitions for provenance/ancestry would need to be updated to use them.
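A rough sketch of the single-driver, all-opt-in variant, assuming one --enable-<name>-sweeper flag per sweeper. The SWEEPERS registry, its entries, and parse_enabled are hypothetical names for illustration only:

```python
import argparse
from typing import Callable, Dict, List

# Hypothetical registry: sweeper name -> callable that runs it.
# The lambdas are stand-ins for the real sweeper entry points.
SWEEPERS: Dict[str, Callable[[], str]] = {
    "provenance": lambda: "provenance done",
    "ancestry": lambda: "ancestry done",
    "legacy-solr-import": lambda: "legacy-solr-import done",
}


def parse_enabled(argv: List[str]) -> List[str]:
    """Return the sweepers explicitly enabled on the command line, in registry order."""
    parser = argparse.ArgumentParser(description="Opt-in sweepers driver sketch")
    for name in SWEEPERS:
        # Every sweeper is off unless its flag is passed
        parser.add_argument(
            f"--enable-{name}-sweeper",
            action="store_true",
            dest=name.replace("-", "_"),
        )
    args = vars(parser.parse_args(argv))
    return [name for name in SWEEPERS if args[name.replace("-", "_")]]


def run(argv: List[str]) -> List[str]:
    """Run each enabled sweeper and collect its result."""
    return [SWEEPERS[name]() for name in parse_enabled(argv)]
```

Under this scheme each scheduled task passes only the flags it needs, which is why the existing provenance/ancestry task definitions would have to be updated: with no flags, nothing runs.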
@tloubrieu-jpl here is the ECS Schedule override we spoke about. We can test it after you're done making changes.
Schedule: EN-PROD Overrides:
{
"containerOverrides": [{
"name":"pds-en-prod-registry-sweeper-container",
"command":["--enable-legacy-solr-import-sweeper"]
}]
}
💡 Description