spring-projects / spring-data-elasticsearch

Provides support to increase developer productivity in Java when using Elasticsearch. Uses familiar Spring concepts such as template classes for core API usage and lightweight repository-style data access.
https://spring.io/projects/spring-data-elasticsearch/
Apache License 2.0

Unable to set _id in bulk index #2961

Closed naidusana closed 1 month ago

naidusana commented 1 month ago

I tried to bulk index a batch of raw JSON records into Elasticsearch and needed to set custom _id values for them. Indexing documents one at a time works: calling "IndexQueryBuilder().withId(some_id_value)" and then the single-document index method stores the document under the given _id. The "bulkIndex" method, however, ignores the _id value set this way.

Here's the code that ignores the ".withId" call:

package ;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.IndexOperations;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;
import org.springframework.data.elasticsearch.core.query.IndexQuery;
import org.springframework.data.elasticsearch.core.query.IndexQueryBuilder;
import org.springframework.stereotype.Service;

@Service
public class ESService {

@Autowired
private ElasticsearchOperations esOperations;

public void index(String baseName, Map<Integer, String> jsonDocuments, String indexName, Long exp_time) {

    IndexCoordinates indexCoordinates = IndexCoordinates.of(indexName);

    IndexOperations indexOps = esOperations.indexOps(indexCoordinates);
    if(!indexOps.exists()) {
        indexOps.create();
        try {
            Thread.sleep(1000);
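        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe the interruption
            Thread.currentThread().interrupt();
            e.printStackTrace();
        }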
    }

    List<IndexQuery> indexQueries = jsonDocuments.keySet().stream()
        .map(id -> new IndexQueryBuilder()
            .withSource(jsonDocuments.get(id))
            .withId(id.toString()) // HERE IS THE IGNORED CALL
            .withIndex(indexName)
            .build())
        .collect(Collectors.toList());

    try {
        esOperations.bulkIndex(indexQueries, indexCoordinates);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

}

Spring Boot: 3.0.13, spring-data-elasticsearch: 5.0.12

sothawo commented 1 month ago

This was fixed in #2862, which was released in March 2024 in versions 5.1.10, 5.2.4 and 5.3.0. The version you are using is from November 2023, and 5.0.x is no longer maintained.
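
As a rough illustration of the version list above, the following sketch checks whether a given spring-data-elasticsearch version string is at or beyond one of the releases named as containing the fix. The class `BulkIdFixVersions` and the method `containsFix` are hypothetical helpers, not part of the library. The sketch assumes plain `major.minor.patch` version strings and assumes that releases after 5.3.0 also carry the fix.

```java
public final class BulkIdFixVersions {

    private BulkIdFixVersions() {
    }

    /**
     * Returns true if the given version is 5.1.10 or later in the 5.1 line,
     * 5.2.4 or later in the 5.2 line, or 5.3.0 and newer (assumption: later
     * releases keep the fix). Expects a plain "major.minor.patch" string.
     */
    public static boolean containsFix(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected major.minor.patch, got: " + version);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int patch = Integer.parseInt(parts[2]);

        if (major != 5) {
            // Anything newer than 5.x is assumed to include the fix; older majors never received it.
            return major > 5;
        }
        if (minor >= 3) {
            return true;           // 5.3.0 and later
        }
        if (minor == 2) {
            return patch >= 4;     // 5.2.4 and later
        }
        if (minor == 1) {
            return patch >= 10;    // 5.1.10 and later
        }
        return false;              // 5.0.x is out of maintenance and has no fix release
    }
}
```

With this helper, `containsFix("5.0.12")` is false for the version from the report, while `containsFix("5.1.10")` is true.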

naidusana commented 1 month ago

@sothawo the fix is working with 5.1.10. Thanks for your help!
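
For anyone checking the behavior after upgrading: in the Elasticsearch Bulk API, each document is preceded by an action line, and a custom id appears there as `_id`. The sketch below builds such a request body by hand from the same kind of `Map<Integer, String>` used in the report, only to show what the ids should look like on the wire. It does not use or reproduce Spring Data Elasticsearch internals. The class `BulkBodySketch` and the method `bulkBody` are hypothetical, and the sketch assumes that the index name needs no JSON escaping and that every document is single-line JSON.

```java
import java.util.Map;

public final class BulkBodySketch {

    private BulkBodySketch() {
    }

    /**
     * Builds a newline-delimited JSON body in the Bulk API format:
     * one "index" action line carrying _index and _id, followed by the
     * document source, each terminated by a newline.
     */
    public static String bulkBody(String indexName, Map<Integer, String> jsonDocuments) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<Integer, String> entry : jsonDocuments.entrySet()) {
            // Action line: this is where the custom _id is carried
            body.append("{\"index\":{\"_index\":\"")
                .append(indexName)
                .append("\",\"_id\":\"")
                .append(entry.getKey())
                .append("\"}}\n");
            // Source line: the raw JSON document, assumed to be on a single line
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }
}
```

If the documents indexed through `bulkIndex` end up with auto-generated ids instead of the keys of the map, the `_id` was not passed along, which is the behavior described in this issue.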