spring-projects / spring-data-elasticsearch

Provide support to increase developer productivity in Java when using Elasticsearch. Uses familiar Spring concepts such as template classes for core API usage and lightweight repository-style data access.
https://spring.io/projects/spring-data-elasticsearch/
Apache License 2.0

Spring Data ElasticSearch: Issue with Concurrent Requests and Dynamic Index Setting #2954

Closed: Rajkumar-Ramesh closed this issue 1 month ago

Rajkumar-Ramesh commented 1 month ago

Description:

We have a model called Activity whose index name is set dynamically based on applicationId. In a multi-threaded Java environment, the index name can change while an earlier request is still executing. Specifically, if two requests are processed simultaneously, the second request may overwrite the index name before the first request completes, so the first request fetches data from, or stores data in, the wrong index.

@Data
@Document(indexName = "#{@dynamicAliasConfiguration.getActivityAlias()}", createIndex = false, writeTypeHint = WriteTypeHint.FALSE)
public class Activity {
    @Id private String uniqueIdentifier;
    private String applicationId;
    private String componentId;
    private String accountId;
}

public interface ActivityRepository extends ElasticsearchRepository<Activity, String> {
    @Query("{\"ids\": {\"values\": [\"?0\"] }}")
    Optional<List<Activity>> getById(String id);
}

@Component
@Data
public class DynamicAliasConfiguration {
    private String activityAlias;
}

@RestController
@RequestMapping("/v3/activity")
@RequiredArgsConstructor
public class ActivityQueryController {
    private final DynamicAliasConfiguration dynamicAliasConfiguration;
    private final ActivityRepository activityRepository;

    @GetMapping("/id/{id}")
    public Activity getById(@PathVariable String id, @NotEmpty @RequestParam String applicationId) {
        dynamicAliasConfiguration.setActivityAlias(ActivityIndexAliasProvider.getActivityReadAlias(applicationId));
        return activityRepository.getById(id).orElseThrow(() -> new NotFoundException("Activity not found with id :: " + id));
    }
}

plugins {
    id 'java'
    id 'org.springframework.boot' version '3.2.5'
    id 'io.spring.dependency-management' version '1.1.4'
}

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(17)
    }
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-data-elasticsearch'
    implementation 'org.springframework.boot:spring-boot-starter-web'
}

Attempted Solution:

I tried modifying the repository method to accept IndexCoordinates, but it did not resolve the issue:

@Query("{\"ids\": {\"values\": [\"?0\"] }}")
Optional<List<Activity>> getById(String id, IndexCoordinates indexCoordinates);

@GetMapping("/id/{id}")
public Activity getById(@PathVariable String id, @NotEmpty @RequestParam String applicationId) {
    return activityRepository.getById(id, ActivityIndexAliasProvider.getActivityReadAlias(applicationId))
            .orElseThrow(() -> new NotFoundException("Activity not found with id :: " + id));
}

Is there a recommended approach for handling dynamic index names in a multi-threaded environment to avoid such conflicts? Any guidance or best practices would be greatly appreciated.

sothawo commented 1 month ago

Your problem is out of scope for Spring Data Elasticsearch. You have a Spring bean, the @Component DynamicAliasConfiguration, which exists exactly once in your application context. Manipulating this one instance from multiple threads (the incoming requests) leads to errors, as you have already learned.
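
To make the failure mode concrete, here is a minimal plain-Java sketch (no Spring involved) of two requests touching one alias holder. The class `SharedAliasRace`, the nested `AliasHolder`, and the alias values `activity-app-a` / `activity-app-b` are hypothetical stand-ins, not code from the issue. The latches only force the unlucky interleaving so that the result is reproducible.

```java
import java.util.concurrent.CountDownLatch;

public class SharedAliasRace {

    // Hypothetical stand-in for the DynamicAliasConfiguration bean.
    static final class AliasHolder {
        private volatile String alias;
        void setAlias(String alias) { this.alias = alias; }
        String getAlias() { return alias; }
    }

    /**
     * Plays "request A" on the calling thread and "request B" on a second thread,
     * forcing B to write its alias between A's write and A's read.
     * Returns the alias that request A ends up reading.
     */
    public static String aliasSeenByRequestA(boolean shareHolder) {
        AliasHolder holderA = new AliasHolder();
        // shareHolder == true mimics a singleton bean: both requests use one instance.
        AliasHolder holderB = shareHolder ? holderA : new AliasHolder();

        CountDownLatch aHasWritten = new CountDownLatch(1);
        CountDownLatch bHasWritten = new CountDownLatch(1);

        Thread requestB = new Thread(() -> {
            try {
                aHasWritten.await();
                holderB.setAlias("activity-app-b");
                bHasWritten.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        requestB.start();

        try {
            holderA.setAlias("activity-app-a"); // the controller sets the alias
            aHasWritten.countDown();
            bHasWritten.await();                // request B runs in between
            String seen = holderA.getAlias();   // the index name is resolved here
            requestB.join();
            return seen;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(aliasSeenByRequestA(true));  // prints "activity-app-b"
        System.out.println(aliasSeenByRequestA(false)); // prints "activity-app-a"
    }
}
```

With one shared holder, request A reads the alias written by request B. When each request has its own holder, A reads back its own value.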

You should be able to solve this problem by properly scoping that bean (see https://docs.spring.io/spring-framework/reference/core/beans/factory-scopes.html); request or session scope should be the right one in your setup.
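
A minimal sketch of what request scoping could look like for the bean from the question, assuming a servlet-based Spring MVC application (as implied by spring-boot-starter-web) and Lombok on the classpath. It has not been verified against this project. `@RequestScope` is Spring's shortcut for `@Scope("request")` with a class-based scoped proxy, so the controller can keep injecting the bean as before.

```java
import lombok.Data;
import org.springframework.stereotype.Component;
import org.springframework.web.context.annotation.RequestScope;

// One instance per HTTP request instead of one per application context.
// The bean name stays "dynamicAliasConfiguration", so the SpEL expression in
// @Document(indexName = "#{@dynamicAliasConfiguration.getActivityAlias()}")
// still resolves, now through a scoped proxy.
@Component
@RequestScope
@Data
public class DynamicAliasConfiguration {
    private String activityAlias;
}
```

Note that a request-scoped bean is only reachable on a thread that is bound to the current request. If the repository call is handed off to another thread (for example an async executor), the lookup would be expected to fail unless the request context is propagated.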