GraphQLGuide / apollo-datasource-mongodb

Apollo data source for MongoDB
MIT License
285 stars 64 forks source link

Two queries by nested fields return duplicated result #79

Open nomshar opened 3 years ago

nomshar commented 3 years ago

So there is a document { "top-field": "abc", "nested-fields": { "field1": "value", "field2": "" } }

I'm sending two queries: const itemsA = this.findByFields({ "nested-fields.field1": "value"}); const itemsB = this.findByFields({ "nested-fields.field2": "value"});

itemsB contains data, but shouldn't.

The issue, as I realized, comes from cache.js orderDocs and getNestedValue method. It processes correctly filters like { nested-fields: { field1: "value" }}, but cannot process concatenated fields.

lorensr commented 3 years ago

Thanks for reporting! What version? This may have been fixed in 0.5.2

nomshar commented 3 years ago

Thanks for reporting! What version? This may have been fixed in 0.5.2

"_id": "apollo-datasource-mongodb@0.5.2"

lorensr commented 3 years ago

I'm unable to reproduce—I added a passing test case that I think does what you said?

https://github.com/GraphQLGuide/apollo-datasource-mongodb/blob/08e1c076739867a9e73a707adc66767a7e8525d7/src/__tests__/datasource.test.js#L132-L136

If you could modify/add a test case that demonstrates the bug, that would be a big help ☺️

potpe commented 2 years ago

Hello, I believe we hit the same problem as described above where I work, but I was now able to reproduce it (against version 0.5.4) , write some rough test cases for it and also have a potential fix.

Please check diff with my forked version here: https://github.com/GraphQLGuide/apollo-datasource-mongodb/compare/master...potpe:apollo-datasource-mongodb:filtering-with-nested-fields

The issue only happens when underlying DataLoader library batches multiple invocations of findByFields method together and when fields provided are nested fields.

Consider following documents to be present in MongoDb:

      {
        name: 'Bob',  
        nested: { field1: 'value1', field2: 'value1' }
      },
      {
        name: 'Charlie',
        nested: { field1: 'value2', field2: 'value2' }
      },
      {
        name: 'Dan',
        nested: { field1: 'value1', field2: 'value2' }
      },

findByFields invocations as follows:

findByFields({ 'nested.field1': 'value1', 'nested.field2': 'value1' })
findByFields({ 'nested.field1': 'value2', 'nested.field2': 'value2' })
findByFields({ 'nested.field1': 'value3', 'nested.field2': 'value3' })

These get optimized into a single Mongo query like this:

filter:  {
      'nested.field1': { '$in': [ 'value1', 'value2', 'value3' ] },
      'nested.field2': { '$in': [ 'value1', 'value2', 'value3' ] }
}

Results from Mongo are returned and are matching all documents at this stage:

results:  [
      {
        name: 'Bob',
        nested: {  field1: 'value1', field2: 'value1'}
      },
      {
        name: 'Charlie',
        nested: { field1: 'value2', field2: 'value2' }
      },
      {
        name: 'Dan',
        nested: { field1: 'value1', field2: 'value2' }
      }
    ]

orderDocs method attempts to do mapping between docs from results above, to fieldsArray, which is an array with original fields passed into findByFields method:

fieldsArray [
      { 'nested.field1': [ 'value1' ], 'nested.field2': [ 'value1' ] },
      { 'nested.field1': [ 'value2' ], 'nested.field2': [ 'value2' ] },
      { 'nested.field1': [ 'value3' ], 'nested.field2': [ 'value3' ] }
    ]

2 problems happen in orderDocs function:

a) call to getNestedValues method fails to extract 'value1' from object { 'nested.field1': [ 'value1' ], 'nested.field2': [ 'value1' ] }, given fieldName of nested.field1. Instead undefined is returned.

Due to this, all docs from the result are attached to all 3 items in fieldsArray, effectively causing duplication.

orderedDocs:  [
      [
        { name: 'Bob', nested: [Object] },
        { name: 'Charlie', nested: [Object] },
        {  name: 'Dan', nested: [Object] }
      ],
      [
        { name: 'Bob', nested: [Object] },
        { name: 'Charlie', nested: [Object] },
        {  name: 'Dan', nested: [Object] }
      ],
      [
        { name: 'Bob', nested: [Object] },
        { name: 'Charlie', nested: [Object] },
        {  name: 'Dan', nested: [Object] }
      ],
    ]

b) even if the above call returned correct value, call to const docValue = doc[fieldName]; also fails to retrieve value from the doc, when fieldName is nested.field1. This means that some of the results would not filtered out correctly.

Branch in my forked version is trying to fix these 2 issues.

Expected content of orderedDocs would be as follows:

orderedDocs:  [
      [ 
         { name: 'Bob', nested: { field1: 'value1', field2: 'value1' } }  --matches filter { 'nested.field1': [ 'value1' ], 'nested.field2': [ 'value1' ] },
      ],
      [ 
          { name: 'Charlie', nested: { field1: 'value2', field2: 'value2' } } --matches filter{ 'nested.field1': [ 'value2' ], 'nested.field2': [ 'value2' ] },
      ],
      [
     -- no document should match filter { 'nested.field1': [ 'value3' ], 'nested.field2': [ 'value3' ] },
      ]
    ]

Hopefully the above makes sense, but let me know if anything needed any further clarification.

potpe commented 1 year ago

Hi @lorensr !

Could you please look at the above findings and the proposed fix? If the fix made sense to you, I could create a pull request..

Thank you!

lorensr commented 1 year ago

A PR would be great, thanks!