alibaba / DataX

DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.

There seems to be a bug in DataX's extraction of MongoDB data: batchSize needs to be set for each find(), otherwise it can cause memory overflow. #2108

Open zhangconan opened 2 months ago

zhangconan commented 2 months ago

In MongodbReader, data is fetched through dbCursor = col.find(filter).iterator();, but the number of documents fetched per batch is not set, which can cause a memory overflow. It should be changed to: dbCursor = col.find(filter).batchSize(1000).iterator();
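To illustrate why an explicit batch size bounds how much the reader buffers at once, here is a minimal sketch that does not use the MongoDB driver. `PageSource`, `BatchedCursor`, and `rangeSource` are hypothetical names invented for this illustration; it is a simplified model of batched iteration under those assumptions, not how the driver or MongodbReader is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a server-side cursor:
// returns up to `limit` items starting at position `offset`.
interface PageSource<T> {
    List<T> fetch(int offset, int limit);
}

// Hypothetical iterator that buffers at most `batchSize` items at any time.
final class BatchedCursor<T> implements Iterator<T> {
    private final PageSource<T> source;
    private final int batchSize;
    private final ArrayDeque<T> buffer = new ArrayDeque<>();
    private int offset = 0;
    private boolean exhausted = false;
    private int maxBuffered = 0;

    BatchedCursor(PageSource<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.source = source;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // Refill only when the previous batch has been fully consumed.
        if (buffer.isEmpty() && !exhausted) {
            List<T> page = source.fetch(offset, batchSize);
            offset += page.size();
            buffer.addAll(page);
            maxBuffered = Math.max(maxBuffered, buffer.size());
            // A short (or empty) page means the source has no more data.
            if (page.size() < batchSize) {
                exhausted = true;
            }
        }
        return !buffer.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.poll();
    }

    // Largest number of items held in the buffer so far.
    int maxBuffered() {
        return maxBuffered;
    }

    // In-memory source producing the integers 0..total-1, for demonstration only.
    static PageSource<Integer> rangeSource(int total) {
        return (offset, limit) -> {
            List<Integer> page = new ArrayList<>();
            for (int i = offset; i < Math.min(total, offset + limit); i++) {
                page.add(i);
            }
            return page;
        };
    }
}
```

In this model, iterating 2500 items with a batch size of 1000 never holds more than 1000 items in the buffer. The change proposed above, col.find(filter).batchSize(1000).iterator(), asks the driver to apply the same kind of cap per fetch.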

FuYouJ commented 2 months ago

I have run into two Mongo issues recently; I will fix them before mid-May.

zhangconan commented 2 months ago

I have run into two Mongo issues recently; I will fix them before mid-May.

GOOD!!!

zhangconan commented 6 days ago

I have run into two Mongo issues recently; I will fix them before mid-May.

Hello, have these two issues been fixed?