flapdoodle-oss / de.flapdoodle.embed.mongo

...will provide a platform neutral way for running mongodb in unittests.
Apache License 2.0

Aggregation match with regex doesn't work after unwind / group #441

Closed brenthaertlein closed 1 year ago

brenthaertlein commented 1 year ago

When composing an aggregation with Spring Data MongoDB, it appears that including unwind / group stages in the pipeline causes a subsequent match with a regex to return no results.

Given these Gradle dependencies:

plugins {
    id("org.springframework.boot") version "2.7.6"
    kotlin("jvm") version "1.6.21"
    kotlin("plugin.spring") version "1.6.21"
}

dependencies {
    implementation("org.springframework.boot:spring-boot-starter-data-mongodb")
    testImplementation("de.flapdoodle.embed:de.flapdoodle.embed.mongo")
}

Which results in using these versions:

de.flapdoodle.embed:de.flapdoodle.embed.package:1.0.10
de.flapdoodle.embed:de.flapdoodle.embed.mongo:3.4.11
de.flapdoodle.embed:de.flapdoodle.embed.process:3.1.15

Given this application configuration:

spring:
  mongodb:
    embedded:
      version: 5.0.12 #same behavior with 4.2.22

Given some class like:

@Document
class Example {
  @Id
  ObjectId id;
  String name;
  List<String> elements;

  // getters & setters
}

Here is some pseudocode, but I can set up an example project next week if it is necessary.

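import java.util.List;
import java.util.regex.Pattern;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.stereotype.Component;

@Component
class Service {
  @Autowired MongoTemplate mongoTemplate;

  // Sorts each document's elements, then keeps documents whose name matches the regex.
  public List<Example> search(String name) {
    return mongoTemplate.aggregate(
      Aggregation.newAggregation(
        Example.class,
        Aggregation.unwind("elements"),
        Aggregation.sort(Sort.Direction.ASC, "elements"),
        // Rebuild one document per _id with the now-sorted elements
        Aggregation.group("_id").push("elements").as("elements").first("name").as("name"),
        Aggregation.match(Criteria.where("name").regex(Pattern.compile(name, Pattern.CASE_INSENSITIVE)))
      ),
      Example.class
    ).getMappedResults();
  }
}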

In the above example, I would expect the $match stage to regex match against name which should be a top level property of each document in the pipeline after $group.

When I run my server and MongoDB locally, I get the behavior I expect. However, our integration tests for these types of queries failed after I converted from mongoTemplate.find to mongoTemplate.aggregate. Initially I was very perplexed, and found that internally Spring Data Mongo seems to render the regex as $regularExpression, which is the MongoDB Extended JSON (v2) representation of a BSON regular expression rather than a query operator.

However, I found that simply by removing the $group stage and the preceding stages, the $match stage does work in the integration tests. Unfortunately, we need to $unwind, $sort and $group because we don't have access to $sortArray which was introduced in MongoDB 5.2.

The above code generates this pipeline, but returns no results even when it should:

[
    {
      "$unwind": "$elements"
    },
    {
      "$sort": {
        "elements": 1
      }
    },
    {
      "$group": {
        "_id": "$_id",
        "elements": {
          "$push": "$elements"
        },
        "name": {
          "$first": "$name"
        }
      }
    },
    {
      "$match": {
        "name": {
          "$regularExpression": {
            "pattern": "Jacksonville",
            "options": "i"
          }
        }
      }
    }
  ]

When the $group and preceding stages are removed, we get this pipeline which works as expected:

[
    {
      "$match": {
        "name": {
          "$regularExpression": {
            "pattern": "Jacksonville",
            "options": "i"
          }
        }
      }
    }
  ]

Let me know what you think and happy holidays!

brenthaertlein commented 1 year ago

Never mind, this was entirely my bad! I just happened to run into a case in my tests where $unwind on an empty array produces no output documents, and I conflated that with an "embedded test behavior" issue.
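
For anyone landing here with the same symptom, the following is a minimal plain-Java sketch of the $unwind behavior described above. It is a simulation, not MongoDB or Spring Data code: the class UnwindDemo, the method unwind, and the sample data are hypothetical names chosen for illustration. The boolean flag mirrors MongoDB's preserveNullAndEmptyArrays option for $unwind, which Spring Data MongoDB exposes, as far as I know, through the Aggregation.unwind(String, boolean) overload.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UnwindDemo {

    // Simulates $unwind on an "elements" array, keyed here by document name.
    // Each output row is {name, element}. A document whose array is null or
    // empty produces no rows unless preserveNullAndEmptyArrays is true, in
    // which case it produces a single row with a null element.
    public static List<String[]> unwind(Map<String, List<String>> elementsByName,
                                        boolean preserveNullAndEmptyArrays) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, List<String>> doc : elementsByName.entrySet()) {
            List<String> elements = doc.getValue();
            if (elements == null || elements.isEmpty()) {
                if (preserveNullAndEmptyArrays) {
                    rows.add(new String[] {doc.getKey(), null});
                }
                continue; // by default the document is dropped from the pipeline
            }
            for (String element : elements) {
                rows.add(new String[] {doc.getKey(), element});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical test data: "Jacksonville" has an empty elements array.
        Map<String, List<String>> docs = new LinkedHashMap<>();
        docs.put("Jacksonville", Collections.<String>emptyList());
        docs.put("Orlando", Arrays.asList("b", "a"));

        System.out.println(unwind(docs, false).size()); // prints 2
        System.out.println(unwind(docs, true).size());  // prints 3
    }
}
```

With the default behavior, "Jacksonville" never reaches the later $group and $match stages, so a regex on its name has nothing to match. Whether preserving empty arrays is the right fix depends on how the later stages should treat such documents.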

michaelmosmann commented 1 year ago

@brenthaertlein no problem :) Sometimes it is helpful to explain the problem to another person just to solve it yourself :)