Open yash025 opened 1 year ago
This is similar to https://github.com/opensearch-project/opensearch-java/issues/297 and there are some hints on how to do that in https://github.com/opensearch-project/opensearch-java/issues/297#issuecomment-1362157933. I don't have working code to share, though. Let's try and work through it? Maybe @owaiskazi19 or @Xtansia have an example?
Elasticsearch-java has since added a withJson method, so I think we ultimately do want to write new IndexRequest.Builder().id("1").index(index).withJson(json).build().
Finally, maybe make a sample à la https://github.com/dblock/opensearch-java-client-demo in Scala, so we have something to start with?
Thanks @dblock for the quick response. I tried it based on the hints, and I was able to add a document and query it.
Here's a demo project in Scala: https://github.com/yash025/opensearch-scala-demo
Awesome, thanks @yash025. So you have a document type and you serialize it to JSON with CirceToJava? Originally you wanted to make raw JSON work, and I think we'd still be interested in that. Same for Java.
Based on this test case https://github.com/opensearch-project/opensearch-java/blob/main/java-client/src/test/java/org/opensearch/client/opensearch/json/JsonDataTest.java#L51-L63 I think the correct way to get from a JSON string to a JsonData (in Java) is something like:
JsonpMapper mapper = openSearchClient._transport().jsonpMapper();
JsonParser parser = mapper.jsonProvider().createParser(new StringReader(jsonString));
JsonData data = JsonData.from(parser, mapper);
I think this is a feature request to add json (or withJson) everywhere we support document. Does anyone want to give it a try?
@dblock I will try to squeeze in some time this week and try this.
@dblock Question: what should TDocument be for withJson? Should we create a class similar to CirceToJava, something like RawJson, so that whenever someone wants to use raw JSON as the document, that class is the TDocument, i.e. the request type is IndexRequest[RawJson]? I don't see much use for withJson in the Java world. I also checked withJson in the latest version of the Elasticsearch Java client, and it won't work for complex JSON: they've written a simple JSON mapper that tries to deserialize the input back to TDocument, ignoring unknown fields. For complex (multi-nested) JSON, the user needs to specify the parser and mapper explicitly.
@yash025 I am not sure, but I'm thinking really from the POV of a developer who has a bunch of documents/queries and just wants to make them, without stuffing the JSON into well-defined structures. This is particularly useful in IndexRequest.Builder().id("1").index(index).withJson(json).build() because the document being indexed can really be any JSON, and similarly would be useful in search, but I agree that it's probably not more useful than that. In that case I think your suggestion works!
Hey @yash025! You can refer to https://github.com/opensearch-project/opensearch-java/issues/257, which has sample code in Java to create an index. Currently, we don't have withJson support, but you can pass the mapping file for the index similar to:
private String getAnomalyDetectorMappings() throws IOException {
    URL url = AnomalyDetectionIndices.class.getClassLoader().getResource(ANOMALY_DETECTORS_INDEX_MAPPING_FILE);
    return Resources.toString(url, Charsets.UTF_8);
}
Hi @owaiskazi19, thanks. I've found a workaround, mentioned in the comment above, that works for me.
@dblock So should I go ahead and add a class similar to CirceToJava, something like RawJSON, for anyone who wants to use raw JSON?
For example:
IndexRequest.Builder[RawJSON]().id("1").index(index).document(new RawJSON().withJsonStr(<jsonString>)).build()
I think document(new RawJSON().withJsonStr(<jsonString>)) is really ugly and should be wrapped as jsonDocument() or .withJson instead of .document. WDYT?
Yes, that would look nicer, but we need to provide .withJson() or jsonDocument() only when the IndexRequest is of type RawJSON, right? How do we handle a call such as IndexRequest[<any other class object>].index(index).jsonDocument()?
🤔 @yash025 I am not sure. Give it a try? Let's look at code?
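One possible answer to the question above, offered as a sketch and not as a design for opensearch-java: keep document(T) generic, and expose the raw-JSON entry point as a static factory that pins the type parameter to the raw JSON wrapper. That way jsonDocument cannot be reached from a builder of any other document type. Every name here (JsonDocumentSketch, RawJson, Request, Builder, jsonDocument) is hypothetical and only mirrors the shape of IndexRequest.Builder; nothing below calls the real client.

```java
import java.util.Objects;

// Hypothetical stand-ins only; none of these types exist in opensearch-java.
class JsonDocumentSketch {

    // Marks a string as already-serialized JSON.
    static final class RawJson {
        private final String json;

        RawJson(String json) {
            this.json = Objects.requireNonNull(json, "json");
        }

        String json() {
            return json;
        }
    }

    // Simplified stand-in for IndexRequest<TDocument>.
    static final class Request<T> {
        final String index;
        final String id;
        final T document;

        Request(String index, String id, T document) {
            this.index = Objects.requireNonNull(index, "index");
            this.id = id;
            this.document = Objects.requireNonNull(document, "document");
        }
    }

    // Generic builder: document(T) accepts whatever T the caller chose.
    static final class Builder<T> {
        private String index;
        private String id;
        private T document;

        Builder<T> index(String index) { this.index = index; return this; }
        Builder<T> id(String id) { this.id = id; return this; }
        Builder<T> document(T document) { this.document = document; return this; }
        Request<T> build() { return new Request<>(index, id, document); }
    }

    // The only raw-JSON entry point. It returns Builder<RawJson>, so the
    // compiler rules out combining it with another document type.
    static Builder<RawJson> jsonDocument(String json) {
        return new Builder<RawJson>().document(new RawJson(json));
    }

    public static void main(String[] args) {
        Request<RawJson> request = jsonDocument("{\"title\":\"hello\"}")
            .index("my-index")
            .id("1")
            .build();
        System.out.println(request.index + "/" + request.id + " " + request.document.json());
    }
}
```

The trade-off in this shape is that the JSON is supplied first and the remaining fields are chained afterwards. A Scala version could instead put jsonDocument behind an extension method that is only available for Builder[RawJSON].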
What is the bug?
When we push the document as a JSON string using IndexRequest, the API fails with the error below. The same request works if we pass the document as a Java map.
Caused by: org.opensearch.client.opensearch._types.OpenSearchException: Request failed: [mapper_parsing_exception] failed to parse
    at org.opensearch.client.transport.aws.AwsSdk2Transport.parseResponse(AwsSdk2Transport.java:530)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.executeSync(AwsSdk2Transport.java:438)
    at org.opensearch.client.transport.aws.AwsSdk2Transport.performRequest(AwsSdk2Transport.java:241)
    at org.opensearch.client.opensearch.OpenSearchClient.index(OpenSearchClient.java:764)
Are there any working examples where a JSON string is pushed as a document instead of Java POJOs?
I'm trying this in Scala.
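A likely explanation for the difference, though not one confirmed in this thread: a general-purpose JSON mapper serializes a java.lang.String as a JSON string value, so the request body becomes one quoted string and not a JSON object, and the server cannot parse it as a document. The sketch below illustrates this with a deliberately tiny, hypothetical serializer (toJson is not the client's mapper and handles only a few types).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StringVersusObjectBody {

    // Hypothetical, simplified stand-in for a JSON mapper.
    // Supports only null, String, Number, Boolean and Map.
    static String toJson(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            return quote((String) value);
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        if (value instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            boolean first = true;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                if (!first) {
                    sb.append(',');
                }
                first = false;
                sb.append(quote(String.valueOf(e.getKey())))
                  .append(':')
                  .append(toJson(e.getValue()));
            }
            return sb.append('}').toString();
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    // Escapes only double quotes and backslashes; enough for this illustration.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> asMap = new LinkedHashMap<>();
        asMap.put("title", "hello");
        String asString = "{\"title\":\"hello\"}";

        System.out.println(toJson(asMap));    // body is a JSON object
        System.out.println(toJson(asString)); // body is a single JSON string literal
    }
}
```

If this is what happens in the client, the fix is to hand the mapper something it serializes as an object (a map, a POJO, or parsed JSON) and not the raw string.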
The code below works, where the document is passed as a Java map to the same index