360Learning / mongo-bulk-data-migration

Mongo NodeJs data migration software open source project - 1 line migration, resumable, fast (bulk), automatic rollback
MIT License
3 stars, 0 forks

Support advanced aggregate pipeline to update outside the aggregate source #12

Open elisap360 opened 2 weeks ago

elisap360 commented 2 weeks ago

Context

I needed to update a field on the `companies` collection, selecting the documents to update based on data in the `groups` collection. With the current DataMigration architecture, the collection that is queried must also be the one that is updated. In this case that proved counter-productive: the `$lookup` join from `companies` to `groups` was too big and the script crashed.

Sample script

```ts
import { MongoBulkDataMigration } from "@360-l/mongo-bulk-data-migration";
import { DataMigrationProcess } from "@backend/utils/env/load/script/datamigration";
import { $nex } from "@backend/utils/mongo/utils";
import type { Company } from "@backend/utils/mongo/definitions";

const MIGRATION_ID = "migration_20240527T110000Z-activateMagicLinkForNonCustomizedUrlCompanies";

const processHandler = new DataMigrationProcess({ name: MIGRATION_ID }, buildMigration);
void processHandler.handleMigration();

export function buildMigration() {
  return new MongoBulkDataMigration({
    db: global._mongoClient.db(),
    id: MIGRATION_ID,
    collectionName: "companies",
    projection: { loginWithMagicLink: 1 },
    query: [
      {
        "$lookup": {
          "from": "groups",
          "localField": "_id",
          "foreignField": "company",
          "as": "groups"
        }
      },
      {
        "$project": {
          "loginWithMagicLink": 1,
          "rootGroup": {
            "$filter": {
              "input": "$groups",
              "as": "rootGroup",
              "cond": { "$eq": [ "others", "$$rootGroup.sys" ] }
            }
          }
        }
      },
      { "$unwind": { "path": "$rootGroup" } },
      {
        "$project": {
          "loginWithMagicLink": 1,
          "subdomain": "$rootGroup.subdomain",
          "url": "$rootGroup.url"
        }
      },
      {
        "$match": {
          $or: [ { loginWithMagicLink: $nex }, { loginWithMagicLink: false } ],
          $and: [
            { $or: [ { subdomain: $nex }, { subdomain: "" } ] },
            { $or: [ { url: $nex }, { url: "" } ] }
          ]
        }
      }
    ],
    update: { $set: { loginWithMagicLink: true } }
  });
}
```
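For readers following the pipeline above, the selection it performs can be restated as a plain in-memory predicate. This is only a sketch: `CompanyDoc`, `GroupDoc`, and `shouldActivateMagicLink` are hypothetical names introduced for illustration, and it assumes `$nex` means "field does not exist" (modeled here as `undefined`).

```ts
// Hypothetical minimal shapes; the real Company and group documents have more fields.
interface CompanyDoc {
  _id: string;
  loginWithMagicLink?: boolean;
}

interface GroupDoc {
  company: string; // id of the owning company
  sys: string;
  subdomain?: string;
  url?: string;
}

// Assumption: $nex means "field is absent", represented here as undefined.
const isMissingOrEmpty = (value?: string): boolean =>
  value === undefined || value === "";

function shouldActivateMagicLink(company: CompanyDoc, groups: GroupDoc[]): boolean {
  // $lookup + $filter + $unwind: keep this company's groups whose sys is "others".
  const rootGroups = groups.filter(
    (group) => group.company === company._id && group.sys === "others"
  );

  // First $or of the $match: magic link is absent or explicitly false.
  const notEnabled =
    company.loginWithMagicLink === undefined || company.loginWithMagicLink === false;

  // $and of the $match: the root group has neither a subdomain nor a url.
  return (
    notEnabled &&
    rootGroups.some(
      (group) => isMissingOrEmpty(group.subdomain) && isMissingOrEmpty(group.url)
    )
  );
}
```

Written this way, it is easier to see that every condition except `loginWithMagicLink` depends only on `groups`, which is why querying `groups` first avoids the large join.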

TO DO

Sample script

```ts
return new MongoBulkDataMigration({
  db: global._mongoClient.db(),
  id: MIGRATION_ID,
  collection: {
    query: "groups",
    update: "companies",
    relatedIdKey: "company", // groups.company -> companies._id
  },
  projection: { company: 1, sys: 1, subdomain: 1, url: 1 },
  query: { sys: "others", $and: [...] },
  update: async (group) => {
    const company = await companyRepository.fetchById(group.company, ["_id", "loginWithMagicLink"]);
    if (!company.loginWithMagicLink) {
      companyIdsToSave.push(company._id);
      await _db.companies.update(company._id, { $set: { loginWithMagicLink: true } });
    }
  },
});
```

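One way the proposed `relatedIdKey` option could behave is sketched below, in memory and without any MongoDB calls: read the related id from each matched query-side document, deduplicate, and emit batched `$in` filters for the collection to update. `collectRelatedIds`, `toBatches`, and `buildUpdateFilters` are hypothetical helpers, not part of the library, and the batching strategy is an assumption.

```ts
// Hypothetical helper: distinct values of `relatedIdKey` across the matched documents.
// Values are deduplicated by their string form but returned unchanged.
function collectRelatedIds<T, K extends keyof T>(
  docs: T[],
  relatedIdKey: K
): Array<NonNullable<T[K]>> {
  const byStringKey = new Map<string, NonNullable<T[K]>>();
  for (const doc of docs) {
    const id = doc[relatedIdKey];
    if (id !== undefined && id !== null) {
      byStringKey.set(String(id), id as NonNullable<T[K]>);
    }
  }
  return Array.from(byStringKey.values());
}

// Hypothetical helper: split a list into consecutive batches of at most `batchSize` items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (batchSize < 1) {
    throw new RangeError("batchSize must be at least 1");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// One filter per batch, each targeting the update-side collection by _id.
function buildUpdateFilters<T, K extends keyof T>(
  docs: T[],
  relatedIdKey: K,
  batchSize: number
): Array<{ _id: { $in: Array<NonNullable<T[K]>> } }> {
  return toBatches(collectRelatedIds(docs, relatedIdKey), batchSize).map((ids) => ({
    _id: { $in: ids },
  }));
}
```

With `relatedIdKey: "company"`, each matched group contributes its `company` value, so two groups of the same company produce a single update target.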
elisap360 commented 2 weeks ago

@pp0rtal I finally created a ticket! I don't have a better idea than what you proposed, and it would already be very helpful =)