NASA-PDS / registry-legacy-solr

Legacy Registry Software components leveraging Apache Solr. Includes Legacy Harvest Tool, Registry Manager, PDS3 Catalog Tool, and Search Core library. These components provide the capabilities for loading PDS3 and PDS4 data into the Legacy Solr Registry, driving the PDS keyword search.
Apache License 2.0

Catalog does not create field 'package_id' #96

Closed by c-suh 1 year ago

c-suh commented 1 year ago

Checked for duplicates

Yes - I've already checked

🐛 Describe the bug

When I run registry-mgr on the Solr XML that catalog outputs, it fails with an error because none of the docs has a package_id field.

🕵️ Expected behavior

I expected each doc to have a package_id field so that registry-mgr can index PDS3 data (similar to what harvest does for PDS4 data).
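To make the expectation concrete, here is a minimal sketch of a check that counts the docs in a Solr XML string that lack a package_id field. It assumes the catalog output uses the standard Solr update layout (`<add>` containing `<doc>` elements with `<field name="...">` children). The class and method names are hypothetical and not part of registry-mgr or catalog.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper; assumes the standard Solr <add><doc><field> layout. */
final class PackageIdCheck {

    /** Returns how many <doc> elements have no <field name="package_id">. */
    static int countDocsMissingPackageId(String solrXml) {
        Document dom;
        try {
            dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(solrXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            // Wrap parser errors so callers do not need to handle checked exceptions.
            throw new IllegalArgumentException("Could not parse Solr XML", e);
        }

        NodeList docs = dom.getElementsByTagName("doc");
        int missing = 0;
        for (int i = 0; i < docs.getLength(); i++) {
            Element doc = (Element) docs.item(i);
            NodeList fields = doc.getElementsByTagName("field");
            boolean found = false;
            for (int j = 0; j < fields.getLength(); j++) {
                Element field = (Element) fields.item(j);
                if ("package_id".equals(field.getAttribute("name"))) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing++;
            }
        }
        return missing;
    }
}
```

A non-zero result for a catalog output file would correspond to the situation reported here.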

📜 To Reproduce

  1. Download PDS3 data, e.g. JNOJNC_0024 data.
  2. Run ./catalog --mode ingest --doc-config $CATALOG_HOME/search-conf/defaults/ --output-dir $REGISTRY_DATA_HOME/pds3/solr-docs/ --report-file $REGISTRY_DATA_HOME/pds3/log/JNOJNC_0024.log --target $REGISTRY_DATA_HOME/pds3/JNOJNC_0024
  3. Go to where registry-mgr-legacy is deployed.
  4. Run ./registry-mgr <catalog's output-dir>

📚 Version of Software Used

2.1.0-SNAPSHOT

nutjob4life commented 1 year ago

@c-suh @jordanpadams: what's the format of the package_id? Is it provided on the command-line? Is it derived from the data set ID? Can I use a random UUID?

c-suh commented 1 year ago

@nutjob4life I looked at the Solr doc XML files created by harvest, and the package_id fields all had the same UUID. If I recall correctly, Jordan said that it is an identifier for that ingestion batch.

nutjob4life commented 1 year ago

@c-suh ah, okay … so if they're per-batch, that suggests I can use an ad hoc UUID or provide a command-line option for the user to specify one.

nutjob4life commented 1 year ago

From the tag-up on 2023-09-26: the package ID should be generated by the tool, not specified on the command line.
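A minimal sketch of that decision, assuming one random UUID per run that is written into every doc of the batch. This is not the actual DocWriter.java change. The class name, the `identifier` field, and the rendering methods are hypothetical, and the output assumes the standard Solr `<add>`/`<doc>`/`<field>` layout.

```java
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch of per-batch package IDs; not the real DocWriter. */
final class PackageIdSketch {

    // Generated once per instance, so every doc in one batch shares it.
    private final String packageId = UUID.randomUUID().toString();

    String packageId() {
        return packageId;
    }

    /** Renders one <doc>; the "identifier" field is illustrative only. */
    String renderDoc(String identifier) {
        return "<doc>\n"
                + "  <field name=\"identifier\">" + escape(identifier) + "</field>\n"
                + "  <field name=\"package_id\">" + packageId + "</field>\n"
                + "</doc>\n";
    }

    /** Renders a whole batch wrapped in a Solr <add> element. */
    String renderBatch(List<String> identifiers) {
        StringBuilder sb = new StringBuilder("<add>\n");
        for (String id : identifiers) {
            sb.append(renderDoc(id));
        }
        return sb.append("</add>\n").toString();
    }

    // Minimal escaping for element text content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because the ID is fixed when the object is created, a new run (a new instance) gets a new ID, which matches the per-batch behavior described for harvest earlier in the thread.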

nutjob4life commented 1 year ago

Undoing all changes to

And making just one change to src/main/java/gov/nasa/pds/citool/search/DocWriter.java

nutjob4life commented 1 year ago

Also suppressing stack traces from Solr communication in RegistryClientSolr.java.
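The actual change to RegistryClientSolr.java is not shown in this thread. As a rough illustration of the idea, the sketch below reduces an exception to a one-line summary of its root cause, which a caller could log instead of the full stack trace. The class and method names are hypothetical.

```java
/** Hypothetical helper; not the real RegistryClientSolr code. */
final class QuietSolrErrors {

    /** Returns "<RootCauseClass>: <message>" without any stack trace. */
    static String summarize(Throwable t) {
        Throwable root = t;
        // Walk down the cause chain, guarding against self-referential causes.
        while (root.getCause() != null && root.getCause() != root) {
            root = root.getCause();
        }
        String msg = root.getMessage();
        return root.getClass().getSimpleName() + (msg == null ? "" : ": " + msg);
    }
}
```

A caller might then write something like `System.err.println("Solr request failed: " + QuietSolrErrors.summarize(e));` in its catch block rather than calling `e.printStackTrace()`.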