spring-projects / spring-data-mongodb

Provides support to increase developer productivity in Java when using MongoDB. Uses familiar Spring concepts such as template classes for core API usage and lightweight repository-style data access.
https://spring.io/projects/spring-data-mongodb/
Apache License 2.0

Issue referencing or using field value set with SetWindowFieldsOperation shift #4745

Closed alex-ionescu-teamextension closed 1 week ago

alex-ionescu-teamextension commented 1 month ago

Hello,

I am using spring-data-mongodb-4.2.5.jar, and my use case is a shift to get the next or previous value of a timestamp field using $setWindowFields. The problem with the Java approach is that the resulting field is not referenced correctly in the aggregation pipeline, which leads to a null value when doing a $dateDiff.

This is how I am doing the $setWindowFields / $dateDiff:

SetWindowFieldsOperation timestampWindowFields = SetWindowFieldsOperation.builder()
        .partitionByField("metaData.deviceId")
        .sortBy(Sort.by(Sort.Direction.ASC, "timestamp"))
        .output(DocumentOperators.valueOf("timestamp").shift(-1).defaultTo(-1))
        .as("previous")
        .build();
DateOperators.DateDiff dateDiff = DateOperators.zonedDateOf("timestamp", DateOperators.Timezone.valueOf("America/Chicago"))
        .diffValueOf("previous", DateOperators.TemporalUnit.from(ChronoUnit.SECONDS));
SetOperation dateDiffOperation = set("timeDifference").toValue(dateDiff);

And this is how the operator ends up in the aggregation pipeline output: [image]

Any idea why the "endDate" field is referenced like this? Is there a way to overcome this issue and get the correct result? How could I do a dateDiff between computed / shifted fields?

Thanks in advance, Alex

christophstrobl commented 1 month ago

Thank you @alex-ionescu-teamextension for getting in touch. Please do not use images but take the time to provide a complete minimal sample (something that we can unzip or git clone, build, and deploy) that reproduces the problem.

alex-ionescu-teamextension commented 1 month ago

Of course, @christophstrobl.

Given the below documents in a collection named "telemetry-debug":

[{
  "timestamp": {
    "$date": "2024-05-29T03:25:15.511Z"
  },
  "metaData": {
    "deviceId": "7FCTGAAA9PN023984"
  },
  "_id": {
    "$oid": "669639886b2b5d28d51866c3"
  }
},
{
  "timestamp": {
    "$date": "2024-05-29T03:25:15.651Z"
  },
  "metaData": {
    "deviceId": "7FCTGAAA9PN023984"
  },
  "_id": {
    "$oid": "669639886b2b5d28d51866d8"
  }
}]

Then running the following aggregation should correctly populate the "timeDifference" field for the second document, given that both "timestamp" and "previous" are available:

@Test
public void testDateDiff() {
    SetWindowFieldsOperation timestampWindowFields = SetWindowFieldsOperation.builder()
            .partitionByField("metaData.deviceId")
            .sortBy(Sort.by(Sort.Direction.ASC, "timestamp"))
            .output(DocumentOperators.valueOf("timestamp").shift(-1).defaultTo(-1))
            .as("previous")
            .build();

    DateOperators.DateDiff dateDiff = DateOperators.zonedDateOf("timestamp", DateOperators.Timezone.valueOf("America/Chicago"))
            .diffValueOf("previous", DateOperators.TemporalUnit.from(ChronoUnit.SECONDS));
    SetOperation dateDiffOperation = set("timeDifference").toValue(dateDiff);

    Aggregation aggregation = newAggregation(timestampWindowFields, dateDiffOperation)
            .withOptions(new AggregationOptions(true, false, 1));

    AggregationResults<String> results = mongoTemplate.aggregate(aggregation, "telemetry-debug", String.class);
    Document rawResults = results.getRawResults();

    assertFalse(rawResults.isEmpty());
}

Adding a dummy field with a value like "new Date()" works as expected, and the $dateDiff operator populates the field correctly. However, when running the above code, the field reference for "previous" appears in the aggregation pipeline as "$_id.previous", so I am guessing something goes wrong when the target field is looked up.
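For reference, here is a rough plain-Java sketch (no MongoDB or Spring Data involved) of what the two stages are expected to compute for the sample documents above. The class `ShiftSketch` and the method `shiftBack` are hypothetical names used only for illustration. It uses null instead of the -1 default for the first document, and it measures from "previous" to "timestamp"; the sign in the real pipeline depends on which field ends up as startDate and which as endDate.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration; not part of Spring Data MongoDB.
class ShiftSketch {

    // Mimics $shift with by = -1 within a single partition: after sorting ascending,
    // entry i is paired with the value of entry i - 1. The first entry has no
    // predecessor and gets null here (the pipeline in the post defaults to -1).
    static List<Instant> shiftBack(List<Instant> timestamps) {
        List<Instant> sorted = new ArrayList<>(timestamps);
        Collections.sort(sorted);
        List<Instant> previous = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            previous.add(i == 0 ? null : sorted.get(i - 1));
        }
        return previous;
    }

    public static void main(String[] args) {
        // The two sample timestamps for device 7FCTGAAA9PN023984, already in ascending order.
        List<Instant> timestamps = List.of(
                Instant.parse("2024-05-29T03:25:15.511Z"),
                Instant.parse("2024-05-29T03:25:15.651Z"));

        List<Instant> previous = shiftBack(timestamps);

        Instant current = timestamps.get(1);
        Instant prev = previous.get(1);

        System.out.println(previous.get(0)); // null: the first document has no predecessor
        System.out.println(ChronoUnit.MILLIS.between(prev, current)); // 140
        System.out.println(ChronoUnit.SECONDS.between(prev, current)); // 0
    }
}
```

So for this sample data a correct "timeDifference" in seconds would be 0 rather than null, since both timestamps fall within the same second. Note that $dateDiff counts unit boundaries crossed while ChronoUnit.between truncates to whole units; the two agree here but can differ in general.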

Let me know if I can provide more details.

christophstrobl commented 1 month ago

Thank you @alex-ionescu-teamextension - I see the $_id. prefix now; it results from how SetWindowFieldsOperation exposes the output to the next aggregation stage.