Syncing MongoDB GridFS - GitHub Issues

chenweiyhui commented 4 years ago

Hello, I have a question about syncing the book collection.

1. Do I need to create the book index in Elasticsearch first? When I sync without creating it, I get:

Unhandled rejection Error: [index_not_found_exception] no such index, with { resource.type="index_or_alias" & resource.id="mymaterials" & index_uuid="_na_" & index="mymaterials" }
    at respond (F:\projecttools\node-mongodb-es-connector-master\node_modules\elasticsearch\src\lib\transport.js:349:1...
    at checkRespForFailure (F:\projecttools\node-mongodb-es-connector-master\node_modules\elasticsearch\src\lib\trans...
    at HttpConnector.<anonymous> (F:\projecttools\node-mongodb-es-connector-master\node_modules\elasticsearch\src\lib\connectors\http.js:173:...)
    at Unzip.wrapper (F:\projecttools\node-mongodb-es-connector-master\node_modules\lodash\lodash.js:4035:19)
    at emitNone (events.js:111:20)
    at Unzip.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1056:12)
    at _combinedTickCallback (internal/process/next_tick.js:188:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)

2. If I create the index first, the data still does not get inserted.

zhr85210078 commented 4 years ago

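You can create the index in advance, or not. If you don't, the index and its mapping are generated automatically from the MongoDB data you are syncing. If you do, you need to define its structure beforehand and then run the sync. First check that your config file is correct, that MongoDB actually contains data, and that Elasticsearch is running.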

chenweiyhui commented 4 years ago

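MongoDB and Elasticsearch are both running fine. With the user example you provided, syncing works. My config file is adapted from mybook.json. The only difference is that my storage structure is not quite the same as yours: my files collection does not have the metadata field that yours has, and the binary data is stored in pieces in the chunks collection.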

zhr85210078 commented 4 years ago

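One more thing: if you want to sync attachments, you need to define a pipeline in Elasticsearch beforehand, then reference it in the config file, and only then start the sync tool. See the README for details.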

zhr85210078 commented 4 years ago

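Please compare the two examples I provided carefully; they are different. They are only there to show that attachments can be synced through a pipeline. Below is the pipeline definition.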
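The definition itself is not reproduced in this thread. As a rough sketch of what an attachment pipeline generally looks like in Elasticsearch, the snippet below builds a request body that uses the `attachment` ingest processor. The function name `buildAttachmentPipeline` and the source field `data` are assumptions made for illustration; the field your documents actually use, and the pipeline id the connector expects, should be taken from the project's README. On older Elasticsearch versions the `attachment` processor requires the ingest-attachment plugin.

```javascript
// Sketch only: the field name passed in is an assumption, not a value
// taken from node-mongodb-es-connector.
function buildAttachmentPipeline(sourceField) {
  return {
    description: "Extract text from base64-encoded attachments",
    processors: [
      {
        attachment: {
          field: sourceField, // field holding the base64-encoded file content
          indexed_chars: -1,  // -1 removes the limit on extracted characters
        },
      },
    ],
  };
}

// Print the body that would be sent with: PUT _ingest/pipeline/<your-pipeline-id>
const pipelineBody = buildAttachmentPipeline("data");
console.log(JSON.stringify(pipelineBody, null, 2));
```

If the field named here does not match the field the sync tool fills in, the processor has nothing to extract, which is one way to end up with documents that contain no attachment text.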

chenweiyhui commented 4 years ago

Yes, I had already set up the pipeline beforehand and read through your example carefully. I created it exactly as in your example, but I should probably build the pipeline around my own MongoDB collection structure instead. The field names differ, so it is probably not picking up the right values.

info:2019-12-24 15:33:39 | info http://localhost:9200 | mymaterials insert-oplog Method:pipelineAndAttachmentsBul _no attachments found (undefined/undefined),_DocId:5e01a246032735160bdcc1d7,DocId is :5e01a246032735160bdcc1d7
info:2019-12-24 15:33:39 | info http://localhost:9200 | mymaterials insert-oplog Method:pipelineAndAttachmentsBul DocId is :5e01a246032735160bdcc1d7

zhr85210078 commented 4 years ago

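The problem now is that your MongoDB has no attachment information: fs.files and fs.chunks contain no data. https://github.com/zhr85210078/node-mongodb-es-connector/blob/master/test/test.js is the example of how to store a book in MongoDB; see the commented-out parts.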

chenweiyhui commented 4 years ago

There is data. My main collection is materials, and the corresponding materials.chunks and materials.files collections do contain data.

zhr85210078 commented 4 years ago

It has to be fs.files and fs.chunks...... https://blog.csdn.net/yangyatou1991/article/details/71330923

chenweiyhui commented 4 years ago

................ Sigh. Our attachments are stored with GridFS; materials, materials.chunks and materials.files are the corresponding collections, and downloading them through the Java API works.

zhr85210078 commented 4 years ago

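In theory, if you use MongoDB GridFS, both fs.files and fs.chunks should contain data, right? In principle it should not have much to do with your materials.chunks and materials.files. The metadata.MainCollectionName and metadata.MainID fields in fs.files are what the tool uses to find the relationship... The dependency this tool uses is https://www.npmjs.com/package/gridfs-stream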
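For context on the collection names in this exchange: GridFS stores each bucket in two collections named after the bucket, and the default bucket name is `fs`. A custom bucket such as `materials` therefore produces `materials.files` and `materials.chunks`, which would explain why `fs.files` and `fs.chunks` are empty here. The helper name `gridFsCollections` below is hypothetical and only illustrates the naming rule; it does not describe how the connector chooses its bucket.

```javascript
// Illustrative helper (hypothetical name): derives the two GridFS collection
// names from a bucket name. "fs" is the GridFS default bucket.
function gridFsCollections(bucketName = "fs") {
  return {
    files: `${bucketName}.files`,   // one document per stored file
    chunks: `${bucketName}.chunks`, // binary pieces of each file
  };
}

console.log(gridFsCollections());            // default bucket
console.log(gridFsCollections("materials")); // custom bucket, as in this thread
```

Judging from the replies in this thread, the tool reads the default bucket, so files written to a custom bucket would not be found unless they are also available under `fs`.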

chenweiyhui commented 4 years ago

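OK, I'll go through your demo a few more times. Thank you very much for your answers; I'll come back with questions after I've looked into it.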

chenweiyhui commented 4 years ago

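Hello, where is getGridFsArray in mongoPromise.js called from? I searched for a long time and could not find it. Where are its parameters passed in from? It only fetches mainId here, and I would like to get the id as well.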

zhr85210078 commented 4 years ago

mainId is the Id of the main document. Example:
books -> {_id:"bookid",name:"bookName"}
fs.files -> {_id:"files_Id",metadata:{mainId:"bookid",MainCollectionName:"books"}}
fs.chunks -> {_id:"chunksId",files_id:"files_Id"}

The getGridFsArray method is mainly used in main.js and oplogFactory.js.
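The relationship described above can be sketched with plain in-memory arrays standing in for the three collections. This is not the connector's getGridFsArray implementation; `findAttachments` is a hypothetical helper that only shows how a main document is matched to its files through `metadata.mainId` and `metadata.MainCollectionName`, and how chunks are matched to a file through the standard GridFS `files_id` field.

```javascript
// Hypothetical in-memory stand-ins for the collections in the example above.
const fsFiles = [
  { _id: "files_Id", metadata: { mainId: "bookid", MainCollectionName: "books" } },
  { _id: "noMeta" }, // a file stored without metadata, as in the asker's data
];
const fsChunks = [{ _id: "chunksId", files_id: "files_Id" }];

// Find the files (and their chunks) that belong to one main document.
function findAttachments(mainCollectionName, mainId, files, chunks) {
  return files
    .filter(
      (f) =>
        f.metadata !== undefined &&
        f.metadata.MainCollectionName === mainCollectionName &&
        f.metadata.mainId === mainId
    )
    .map((f) => ({
      file: f,
      chunks: chunks.filter((c) => c.files_id === f._id),
    }));
}

const found = findAttachments("books", "bookid", fsFiles, fsChunks);
console.log(found.length, found[0].chunks.length);
```

A file without a `metadata` sub-document never matches in this sketch, which would be consistent with the "no attachments found (undefined/undefined)" log line earlier in the thread.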

chenweiyhui commented 4 years ago

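Thanks for your support; I have nearly got it working. One more question: does your tool not support querying a specified time range? I see there are two parameters for it, but they do not seem to take effect.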

zhr85210078 commented 4 years ago

In theory it should work: a MongoDB ObjectId is generated with an embedded timestamp, so I simply convert the time to an ObjectId and then query by comparing ObjectIds.
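A minimal sketch of that idea, assuming only the documented ObjectId layout: the first 4 bytes are the creation time in seconds since the Unix epoch, so padding that value with zeros gives the smallest ObjectId for a given moment. The function names and the filter shape below are illustrative and are not the connector's actual code or parameter names.

```javascript
// Build the hex form of the smallest ObjectId whose timestamp equals `date`.
// 8 hex digits of seconds since the epoch, followed by 16 zero digits.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

// Illustrative range filter on _id. In a real query the hex strings must be
// wrapped in the driver's ObjectId type; plain strings will not match ObjectId values.
function idRangeFilter(start, end) {
  return {
    _id: {
      $gte: objectIdHexFromDate(start),
      $lt: objectIdHexFromDate(end),
    },
  };
}

const filter = idRangeFilter(
  new Date("2019-12-24T00:00:00Z"),
  new Date("2019-12-25T00:00:00Z")
);
console.log(filter);
```

This only works when `_id` values are ObjectIds generated at insert time; documents with custom `_id` values, or ObjectIds created elsewhere with a different clock, will not fall into the expected range.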

zhr85210078 / node-mongodb-es-connector

Syncing MongoDB GridFS #29