NASA-PDS / harvest

Standalone Harvest client application providing the functionality for capturing and indexing product metadata into the PDS Registry system (https://github.com/nasa-pds/registry).
https://nasa-pds.github.io/registry

I want to update the OpenSearch schema regardless of the number of fields to be updated #190

Open tloubrieu-jpl opened 3 hours ago

tloubrieu-jpl commented 3 hours ago

Checked for duplicates

No - I haven't checked

πŸ› Describe the bug

From Dan Scholes (GEO node): When I harvested one bundle, I got this error message:

[INFO] Updating Elasticsearch schema.
[ERROR] Request failed: [illegal_argument_exception] Limit of total fields [1000] has been exceeded

After this error, the bundle that I am trying to load into the registry is not loaded; see the log:

[SUMMARY] Reading configuration from E:\opt\configs24b\urn-nasa-pds-a17fuvs.xml
[SUMMARY] Output directory: /tmp/harvest/out

Except for the first time it tried:

[ERROR] Request failed: [illegal_argument_exception] Limit of total fields [1000] has been exceeded
[INFO] Wrote 4454 product(s)
[SUMMARY] Summary:
[SUMMARY] Skipped files: 0
[SUMMARY] Loaded files: 4454
[SUMMARY]   Product_Bundle: 1
[SUMMARY]   Product_Collection: 6
[SUMMARY]   Product_Observational: 4447
[SUMMARY] Failed files: 4121
[SUMMARY] Package ID: 106a3013-01d8-4cad-a71c-76ec8f250099
[SUMMARY] Reading configuration from E:\opt\configs24b\clem1-gravity-topo-v1.xml
[SUMMARY] Output directory: /tmp/harvest/out

Full log: processingLogsBatch2.txt.zip

πŸ•΅οΈ Expected behavior

I expected no limit on updating the OpenSearch schema when I use harvest.

📜 To Reproduce

No response

🖥 Environment Info

📚 Version of Software Used

4.0.1

🩺 Test Data / Additional context

No response

🦄 Related requirements

🦄 #xyz

βš™οΈ Engineering Details

No response

🎉 Integration & Test

No response

jordanpadams commented 3 hours ago

@tloubrieu-jpl is there a workaround for this or is this a critical bug?

al-niessner commented 2 hours ago

@tloubrieu-jpl

Can I get the bundle and harvest config file? How big (bytes) is it including folders etc?

Searched the Java code (harvest, registry-mgr, and registry-common) and there is no 'total fields' string. For now, that means the limit is enforced on the serverless side, not by our code. We may need to introduce paging of writes where it was not needed previously.
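
For context, the message "Limit of total fields [1000] has been exceeded" matches the Elasticsearch/OpenSearch index setting `index.mapping.total_fields.limit`, whose default is 1000. Below is a minimal sketch of the settings-update request that would raise it. It only builds the request and does not send it. The host `http://localhost:9200`, the index name `registry`, and the new limit `2000` are assumptions for illustration, and whether the serverless deployment allows changing this setting has not been verified here.

```python
import json
import urllib.request


def build_field_limit_request(base_url, index, limit):
    """Build (but do not send) a PUT <index>/_settings request that sets
    index.mapping.total_fields.limit to `limit`."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Flat-key form of the setting, as accepted by the update-settings API.
    body = json.dumps({"index.mapping.total_fields.limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical endpoint, index name, and limit; adjust for the real registry.
    req = build_field_limit_request("http://localhost:9200", "registry", 2000)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending it would be urllib.request.urlopen(req), which needs a reachable
    # cluster and whatever authentication that cluster requires.
```

Raising the limit only treats the symptom. If the schema keeps growing with every bundle, the number of fields being added is still worth investigating.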