typegoose / mongodb-memory-server

Manage & spin up mongodb server binaries with zero (or slight) configuration for tests.
https://typegoose.github.io/mongodb-memory-server/
MIT License

GridFS streams not working correctly #839

Closed framp closed 8 months ago

framp commented 8 months ago

Versions

package: mongodb-memory-server

What is the Problem?

When using GridFS, the server closes unexpectedly.

Log:

connection 1 to 127.0.0.1:40865 closed
      ---
      stack: |
        Connection.onClose (node_modules/mongodb/src/cmap/connection.ts:338:13)
        Socket.<anonymous> (node_modules/mongodb/src/cmap/connection.ts:243:42)
      at:
        line: 338
        column: 13
        file: node_modules/mongodb/src/cmap/connection.ts
        function: Connection.onClose
      type: MongoNetworkError
      tapCaught: uncaughtException
      test: my test
      source: |2
            for (const op of this[kQueue].values()) {
              op.cb(new MongoNetworkError(message));
        ------------^
            }
      ...

I tried to pinpoint the exact piece of code which triggers it but I couldn't reproduce easily (logs below).

I noticed though that GridFS streams seem to behave differently compared to a normal mongodb server.

Normally, once a file has been uploaded to GridFS the close event is emitted on the upload stream and you can instantly download the file. When I do that with mongodb-memory-server I cannot find the file unless I wait for 6ms.

Code Example

const { MongoClient, ObjectId, GridFSBucket } = require("mongodb");
const { MongoMemoryServer } = require("mongodb-memory-server");
const Readable = require("stream").Readable;

(async () => {
    const mongod = await MongoMemoryServer.create({
        binary: { version: "4.4.18" }
    });

    const db = (await MongoClient.connect(mongod.getUri('data-test'))).db();

    const name = "test";
    const data = new GridFSBucket(db, { bucketName: "data" });

    const dataCursor = data.find({
        _id: { $in: [name] }
    });
    for await (const { _id } of dataCursor) {
        console.log({_id})
        await data.delete(_id);
    }

    console.log(1);

    const payload = { ok: 200 };

    const stream = new Readable({ emitClose: true });
    stream.push(Buffer.from(JSON.stringify(payload), 'utf8'));
    stream.push(null);
    stream.ended = true;
    stream.pipe(
        data.openUploadStreamWithId(name, `${name}.json`, {
            chunkSizeBytes: 8388608
        })
    );
    await new Promise((resolve) => stream.on("close", resolve));
    console.log(2);

    // await new Promise((resolve) => setTimeout(resolve, 6)); // uncomment this 6 ms delay to make the download succeed

    const stream2 = data.openDownloadStream(name);
    const result = await new Promise((resolve, reject) => {
        const chunks = [];
        stream2.on("data", (chunk) => chunks.push(Buffer.from(chunk)));
        stream2.on("error", (err) => reject(err));
        stream2.on("end", () => resolve(JSON.parse(Buffer.concat(chunks).toString("utf8"))));
    });

    console.log(result);
})();
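
A possible factor that is not confirmed anywhere in this thread: the sample above waits for "close" on the source Readable, not for an event on the upload stream returned by openUploadStreamWithId. The stdlib-only sketch below illustrates how those two moments can differ. createSlowSink is a hypothetical stand-in for an upload stream (it is not the GridFS implementation), and the 20 ms timer is an arbitrary placeholder for the time a real sink might need to persist its data.

```javascript
const { Readable, Writable } = require("stream");

// Hypothetical stand-in for an upload stream: it accepts chunks immediately,
// but its final step (simulated with a timer) completes asynchronously.
function createSlowSink(events) {
    return new Writable({
        write(chunk, encoding, callback) {
            callback();
        },
        final(callback) {
            setTimeout(() => {
                events.push("sink stored");
                callback();
            }, 20);
        }
    });
}

// Pipes a small payload into the slow sink and records the event order.
async function run() {
    const events = [];
    const source = Readable.from([Buffer.from(JSON.stringify({ ok: 200 }), "utf8")]);
    const sink = createSlowSink(events);

    source.once("close", () => events.push("source close"));

    // Wait on the destination, not on the source.
    const finished = new Promise((resolve, reject) => {
        sink.once("finish", () => {
            events.push("sink finish");
            resolve();
        });
        sink.once("error", reject);
    });

    source.pipe(sink);
    await finished;
    return events;
}

run().then((events) => console.log(events));
```

In this sketch the source reports "close" before the sink reports "finish". If the same ordering applies to the GridFS upload stream, waiting on the stream returned by openUploadStreamWithId(...) instead of on the source might remove the need for the 6 ms delay. That is an assumption to verify against the driver version in use.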

Debug Output

Logs from running the tests: logs-full.txt

Logs from running the sample code: logs-test.txt

Do you know why it happens?

no

hasezoey commented 8 months ago

mongodb(the binary version): 6.0.9 logs-test.txt

why use a different mongodb version than in logs-full.txt?

I noticed though that GridFS streams seem to behave differently compared to a normal mongodb server.

does anything change if you use instance: { storageEngine: "wiredTiger" }? if yes, then it is an ephemeralForTest issue - that storage engine has been removed in mongodb 7.0.0

When I do that with mongodb-memory-server I cannot find the file unless I wait for 6ms.

at least in the case of a replset, maybe you need to use writeConcern: majority?
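
For reference, the two suggestions in this comment could be expressed as option objects like the ones below. This is only a sketch: the variable names are hypothetical, and the version and bucket name are copied from the sample in the issue. serverOptions is meant to be passed to MongoMemoryServer.create(...) and bucketOptions to new GridFSBucket(db, ...). Whether the driver version in use honours writeConcern on the bucket is an assumption to check, and the write concern suggestion was made with a replset in mind.

```javascript
// Hypothetical option objects; nothing here starts a server or opens a connection.

// Would be passed as MongoMemoryServer.create(serverOptions).
const serverOptions = {
    binary: { version: "4.4.18" },
    instance: { storageEngine: "wiredTiger" }
};

// Would be passed as new GridFSBucket(db, bucketOptions).
const bucketOptions = {
    bucketName: "data",
    writeConcern: { w: "majority" }
};

console.log(serverOptions, bucketOptions);
```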

When using GridFS, the server closes unexpectedly. Log: connection 1 to 127.0.0.1:40865 closed

i have no clue why that error gets thrown, maybe try to upgrade your driver? at least it is not the instance / MMS shutting down (it is a client error, guessing by the logs)

framp commented 8 months ago

Thanks for the response!

why use a different mongodb version than in logs-full.txt?

Oh my bad, bad copy-paste, it doesn't change the result though.

does anything change if you use instance: { storageEngine: "wiredTiger" }? if yes, then it is an ephemeralForTest issue - that storage engine has been removed in mongodb 7.0.0

Nope, it's still not finding the file right after the stream is processed.

Pretty strange error, this may be an issue upstream with mongo and it's just not visible for some weird timing reason. Maybe mongodb-memory-server is much faster at returning a response and by the time normal mongo ends the stream the data has already been saved in GridFS.

hasezoey commented 8 months ago

Maybe mongodb-memory-server is much faster at returning a response and by the time normal mongo ends the stream the data has already been saved in GridFS.

what exactly do you mean by "normal mongo", do you mean atlas or a manually run binary or something else?

framp commented 8 months ago

My bad, I was testing against a mongodb instance on Atlas.

I tried with a local instance and the behaviour is consistent with mongodb-memory-server so definitely an upstream problem!

Thanks a lot!