opencb / opencga

An Open Computational Genomics Analysis platform for big data genomics analysis. OpenCGA is maintained and developed by its parent company Zetta Genomics. Please contact support@zettagenomics.com for bug reports and feature requests.
Apache License 2.0

Payload document size is larger than maximum of 16793600 #880

Open nicholsn opened 6 years ago

nicholsn commented 6 years ago

When loading a large VCF file with 9325 samples, I get the error below, which indicates that I am hitting the 16 MB limit on document size in MongoDB.

Is there a known limit for the number of samples that can be in a single vcf file that is linked in the catalog?

org.bson.BsonMaximumSizeExceededException: Payload document size is larger than maximum of 16793600.
        at com.mongodb.connection.BsonWriterHelper.writePayload(BsonWriterHelper.java:66)
        at com.mongodb.connection.CommandMessage.encodeMessageBodyWithMetadata(CommandMessage.java:136)
        at com.mongodb.connection.RequestMessage.encode(RequestMessage.java:138)
        at com.mongodb.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:236)
        at com.mongodb.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:98)
        at com.mongodb.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:441)
        at com.mongodb.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:70)
        at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:192)
        at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:264)
        at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:126)
        at com.mongodb.operation.MixedBulkWriteOperation.executeCommand(MixedBulkWriteOperation.java:372)
        at com.mongodb.operation.MixedBulkWriteOperation.executeBulkWriteBatch(MixedBulkWriteOperation.java:254)
        at com.mongodb.operation.MixedBulkWriteOperation.access$700(MixedBulkWriteOperation.java:65)
        at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:198)
        at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:189)
        at com.mongodb.operation.OperationHelper.withReleasableConnection(OperationHelper.java:433)
        at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:189)
        at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:65)
        at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:151)
        at com.mongodb.client.internal.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:894)
        at com.mongodb.client.internal.MongoCollectionImpl.executeUpdate(MongoCollectionImpl.java:885)
        at com.mongodb.client.internal.MongoCollectionImpl.updateMany(MongoCollectionImpl.java:585)
        at org.opencb.commons.datastore.mongodb.MongoDBNativeQuery.update(MongoDBNativeQuery.java:206)
        at org.opencb.commons.datastore.mongodb.MongoDBCollection.update(MongoDBCollection.java:382)
        at org.opencb.opencga.catalog.db.mongodb.FileMongoDBAdaptor.update(FileMongoDBAdaptor.java:183)
        at org.opencb.opencga.catalog.managers.FileManager.update(FileManager.java:869)
        at org.opencb.opencga.catalog.managers.FileManager.update(FileManager.java:878)
        at org.opencb.opencga.catalog.utils.FileMetadataReader.setMetadataInformation(FileMetadataReader.java:234)
        at org.opencb.opencga.catalog.managers.FileManager.privateLink(FileManager.java:2523)
        at org.opencb.opencga.catalog.managers.FileManager.link(FileManager.java:1032)
        at org.opencb.opencga.app.cli.main.executors.catalog.FileCommandExecutor.link(FileCommandExecutor.java:359)
        at org.opencb.opencga.app.cli.main.executors.catalog.FileCommandExecutor.execute(FileCommandExecutor.java:104)
        at org.opencb.opencga.app.cli.main.OpencgaMain.main(OpencgaMain.java:143)
pfurio commented 6 years ago

Hi, sorry for the delay. I've just had a look at it. I was able to associate 100,000 samples with a single file and did not see that exception. In fact, the stored file document containing those 100,000 samples took 2.5 MB, still far from the 16 MB limit.

Maybe there was some kind of bug in an earlier release causing that? Can you tell me which version you were using? Does it happen with the latest v1.4.0-rc1?

nicholsn commented 6 years ago

@pfurio I'll need to update to v1.4.0-rc1 and can give it another shot - will let you know how it goes.

Currently, I am using:

{
    "Program": "OpenCGA (OpenCB)",
    "Git commit": "9fd61b2af122476e9f015c8743f54c2a2b486df1",
    "Description": "Big Data platform for processing and analysing NGS data",
    "Version": "1.4.0-beta",
    "Git branch": "v1.4.0-beta"
}
pfurio commented 6 years ago

I got exactly the same results using your commit version, so I'm unable to reproduce your issue. Please, let me know if you try again with the latest v1.4.0-rc1. Thanks.

iGaurav4 commented 4 years ago

I am getting this issue with version mongo:4.2.8-bionic.

org.bson.BsonSerializationException: Payload document size of is larger than maximum of 16793600.
        at com.mongodb.connection.BsonWriterHelper.writePayload(BsonWriterHelper.java:66)
        at com.mongodb.connection.CommandMessage.encodeMessageBodyWithMetadata(CommandMessage.java:133)
        at com.mongodb.connection.RequestMessage.encode(RequestMessage.java:147)
        at com.mongodb.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:245)
        at com.mongodb.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:98)
        at com.mongodb.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:441)
        at com.mongodb.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:80)
        at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:189)
        at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:264)
        at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:126)
        at com.mongodb.operation.MixedBulkWriteOperation.executeCommand(MixedBulkWriteOperation.java:372)
        at com.mongodb.operation.MixedBulkWriteOperation.executeBulkWriteBatch(MixedBulkWriteOperation.java:254)
        at com.mongodb.operation.MixedBulkWriteOperation.access$700(MixedBulkWriteOperation.java:65)
        at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:198)
        at com.mongodb.operation.MixedBulkWriteOperation$1.call(MixedBulkWriteOperation.java:189)
        at com.mongodb.operation.OperationHelper.withReleasableConnection(OperationHelper.java:433)
        at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:189)
        at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:65)
        at com.mongodb.Mongo$3.execute(Mongo.java:837)
        at com.mongodb.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:1025)
        at com.mongodb.MongoCollectionImpl.executeDelete(MongoCollectionImpl.java:1003)
        at com.mongodb.MongoCollectionImpl.deleteMany(MongoCollectionImpl.java:584)
        at org.springframework.data.mongodb.core.MongoTemplate$11.doInCollection(MongoTemplate.java:1630)
        at org.springframework.data.mongodb.core.MongoTemplate$11.doInCollection(MongoTemplate.java:1592)
        at org.springframework.data.mongodb.core.MongoTemplate.execute(MongoTemplate.java:524)
        at org.springframework.data.mongodb.core.MongoTemplate.doRemove(MongoTemplate.java:1592)
        at org.springframework.data.mongodb.core.MongoTemplate.remove(MongoTemplate.java:1580)
        at org.springframework.data.mongodb.core.MongoTemplate.doFindAndDelete(MongoTemplate.java:1923)
        at org.springframework.data.mongodb.core.MongoTemplate.findAllAndRemove(MongoTemplate.java:1905)
        at org.springframework.data.mongodb.core.MongoTemplate.findAllAndRemove(MongoTemplate.java:1897)
        at com.dynamediation.reporting.service.AdvancedMetricsProcessor.getAllMetricsPackets(AdvancedMetricsProcessor.java:299)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

iGaurav4 commented 3 years ago

I was using the findAllAndRemove function, which was extracting all the data (approx. 2.5 GB) in a single call. Adding a limit to the query solved the issue.
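
For anyone who lands here with the same trace: it fails inside the deleteMany issued by doFindAndDelete, so the fix amounts to keeping each find-and-remove call small. Below is a minimal sketch of that pattern using only the JDK, with an in-memory list standing in for the collection. The names BatchedDrain, drainInBatches and takeUpTo are hypothetical and exist only for illustration. With Spring Data MongoDB, the function passed in would presumably be something like `limit -> mongoTemplate.findAllAndRemove(new Query().limit(limit), MetricsPacket.class)`, where MetricsPacket is an assumed entity class, not one taken from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

public class BatchedDrain {

    /**
     * Repeatedly asks the store for at most batchSize items (which the store
     * removes as it returns them), passes each batch to the handler, and stops
     * at the first empty batch. Returns the total number of items handled.
     */
    public static <T> long drainInBatches(IntFunction<List<T>> findAndRemoveUpTo,
                                          int batchSize,
                                          Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long total = 0;
        while (true) {
            List<T> batch = findAndRemoveUpTo.apply(batchSize);
            if (batch.isEmpty()) {
                return total;
            }
            handler.accept(batch);
            total += batch.size();
        }
    }

    /**
     * Hypothetical in-memory stand-in for a limited find-and-remove:
     * removes and returns the first `limit` items of the list.
     */
    public static <T> List<T> takeUpTo(List<T> store, int limit) {
        int n = Math.min(limit, store.size());
        List<T> head = store.subList(0, n);
        List<T> batch = new ArrayList<>(head);
        head.clear(); // clearing the sub-list view removes the items from the store
        return batch;
    }

    public static void main(String[] args) {
        List<Integer> store = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            store.add(i);
        }
        List<Integer> batchSizes = new ArrayList<>();
        long total = BatchedDrain.<Integer>drainInBatches(
                limit -> takeUpTo(store, limit),
                4,
                batch -> batchSizes.add(batch.size()));
        System.out.println("batches=" + batchSizes + " total=" + total);
        // prints: batches=[4, 4, 2] total=10
    }
}
```

The batch size is a tuning knob, not a known-good value: it only needs to be small enough that the ids removed in one call fit comfortably inside a single command payload.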