Closed max-zilla closed 10 months ago
To test, upload extracted metadata containing an array of JSON objects and search for keys in those objects.
Tested with:
```json
{
  "@context": [
    "https://clowder.ncsa.illinois.edu/contexts/metadata.jsonld",
    {
      "Predictions": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions",
      "class_name": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions.class_name",
      "class_description": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions.class_description",
      "score": "http://clowder.ncsa.illinois.edu/metadata/ncsa.tensorflow-parallel-dataset-image-classification#Predictions.score"
    }
  ],
  "agent": {
    "@type": "cat:extractor",
    "name": "ncsa.tensorflow-parallel-dataset-image-classification",
    "extractor_id": "https://clowder.ncsa.illinois.edu/clowder/extractors/ncsa.tensorflow-parallel-dataset-image-classification/2.3"
  },
  "content": {
    "Predictions": [
      {
        "class_name": "n01682714",
        "class_prediction": "American_chameleon",
        "score": 0.7607384
      },
      {
        "class_name": "n01693334",
        "class_prediction": "green_lizard",
        "score": 0.21042463
      },
      {
        "class_name": "n01687978",
        "class_prediction": "agama",
        "score": 0.016864877
      }
    ]
  }
}
```
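To make "search for keys in the objects" concrete, here is a minimal sketch that lists the dotted key paths found inside the test payload's `content`. The helper name `key_paths` is hypothetical, and the dotted-path naming is an assumption modelled on the `Predictions.class_name` style used in the `@context` URLs; Clowder's actual index field names may differ.

```python
def key_paths(value, prefix=""):
    """Yield a dotted key path for every leaf value in a nested JSON value.

    Illustrative only: this is not Clowder code. Array elements share the
    path of the array itself, since every object in the array exposes the
    same searchable keys.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from key_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for item in value:
            yield from key_paths(item, prefix)
    else:
        yield prefix


content = {
    "Predictions": [
        {"class_name": "n01682714", "class_prediction": "American_chameleon", "score": 0.7607384},
        {"class_name": "n01693334", "class_prediction": "green_lizard", "score": 0.21042463},
    ]
}

searchable_keys = sorted(set(key_paths(content)))
print(searchable_keys)
```

After reindexing, a search against any of these keys would be expected to match the uploaded metadata.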
Description
This is a proposed fix for a bug discovered in Clowder's process for indexing extractor metadata into Elasticsearch. In some cases where arrays were used, the previous code would inadvertently cast nested JSON objects to long JSON strings. This PR modifies the indexer to retain the JSON structure. Features like ES type inference (double vs. string, for example) are maintained.
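The difference in behavior can be illustrated with a small sketch. This is not the Clowder indexer (which is not shown in this PR description); both function names are hypothetical and only model the two behaviors described: serializing an array to one long string versus keeping its nested structure.

```python
import json


def index_value_stringified(value):
    """Model of the old behavior (assumption): arrays collapse to a JSON string."""
    if isinstance(value, dict):
        return {k: index_value_stringified(v) for k, v in value.items()}
    if isinstance(value, list):
        # The whole array, including nested objects, becomes one string,
        # so keys such as "score" are no longer separate fields.
        return json.dumps(value)
    return value


def index_value_structured(value):
    """Model of the fixed behavior (assumption): nested structure is retained."""
    if isinstance(value, dict):
        return {k: index_value_structured(v) for k, v in value.items()}
    if isinstance(value, list):
        return [index_value_structured(v) for v in value]
    # Leaf values keep their native type, so a float stays a float and the
    # search engine can still infer double vs. string.
    return value


content = {"Predictions": [{"class_name": "n01682714", "score": 0.7607384}]}

old_doc = index_value_stringified(content)
new_doc = index_value_structured(content)

print(type(old_doc["Predictions"]).__name__)
print(type(new_doc["Predictions"][0]["score"]).__name__)
```

In the stringified form, `Predictions` is a single string field. In the structured form, each object's keys remain individually addressable and typed.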
Affected instances would need to do the following to refresh/correct the search index:

1. `POST /api/deleteindex`
2. `POST /api/reindex` (this does not delete the index first, so the deletion must be done manually)
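The two calls above could be scripted roughly as follows. This is a sketch under assumptions: the base URL is a placeholder, and passing the API key in an `X-API-Key` header is assumed here rather than taken from this PR, so adjust the authentication to whatever your instance expects. Only the two endpoint paths come from the description.

```python
import urllib.request

# Order matters: the index must be deleted before reindexing.
REFRESH_PATHS = ("/api/deleteindex", "/api/reindex")


def build_refresh_requests(base_url, api_key=None):
    """Build the POST requests, in order, without sending them."""
    requests = []
    for path in REFRESH_PATHS:
        url = base_url.rstrip("/") + path
        # An empty body with method="POST" issues a bodiless POST.
        request = urllib.request.Request(url, data=b"", method="POST")
        if api_key:
            # Assumed auth mechanism; replace if your instance differs.
            request.add_header("X-API-Key", api_key)
        requests.append(request)
    return requests


def refresh_index(base_url, api_key=None, opener=urllib.request.urlopen):
    """Send both requests in order and return the HTTP status codes."""
    statuses = []
    for request in build_refresh_requests(base_url, api_key):
        with opener(request) as response:
            statuses.append(response.status)
    return statuses


# Placeholder host for illustration; nothing is sent here.
planned = build_refresh_requests("https://clowder.example.org/clowder/")
for request in planned:
    print(request.get_method(), request.full_url)
```

Calling `refresh_index(...)` against a real instance performs the delete followed by the reindex. Any HTTP error raised by `urlopen` stops the sequence before the second call.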