forcedotcom / SFDX-Data-Move-Utility

SFDMU is a cutting-edge Salesforce data migration tool for seamless org population from other orgs or CSV files. It handles all CRUD operations on multiple related objects in one go.
BSD 3-Clause "New" or "Revised" License

[QUESTION]-SFDMU Hanging Up On Large Batches of Records #697

Closed khouldsworth-cgi closed 5 months ago

khouldsworth-cgi commented 6 months ago

Describe the bug Our team is currently migrating data between two SF full copy sandboxes. We have multiple objects that exceed 1 million records. When SFDMU begins to query those records (via Bulk API 2.0), the program appears to fail. We have received a few errors, but nothing is written to the SFDMU log.

One of these was a Node.js heap issue, which we resolved by following the SFDMU knowledge base. We are now getting a different error:

Error: Cannot create a string longer than 0x1fffffe8 characters
    at Buffer.toString (node:buffer:846:17)
    at Request.<anonymous> (C:\Users\kevin.houldsworth\AppData\Local\sf\node_modules\request\request.js:1128:39)
    at Request.emit (node:events:530:35)
    at IncomingMessage.<anonymous> (C:\Users\kevin.houldsworth\AppData\Local\sf\node_modules\request\request.js:1076:12)
    at Object.onceWrapper (node:events:632:28)
    at IncomingMessage.emit (node:events:530:35)
    at endReadableNT (node:internal/streams/readable:1696:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  code: 'ERR_STRING_TOO_LONG'
}
Error: read ECONNRESET
    at TLSWrap.onStreamRead (node:internal/stream_base_commons:217:20) {
  errno: -4077,
  code: 'ECONNRESET',
  syscall: 'read'
}
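For context on the first error: Node.js caps any single string at 0x1fffffe8 (roughly 536 million) characters, so a client that reads an entire Bulk API result set into one string via Buffer.toString() throws ERR_STRING_TOO_LONG once the payload crosses that limit. A minimal sketch of the streaming alternative that keeps memory flat; the streamToFile helper is hypothetical and not part of SFDMU or jsforce:

```typescript
// Hypothetical sketch, not SFDMU code: stream a large HTTP response to a
// file instead of buffering it into one string, which is what Buffer.toString()
// attempts in the trace above and what hits Node's string length cap.
import { createWriteStream } from "node:fs";
import { pipeline } from "node:stream/promises";
import https from "node:https";

function streamToFile(url: string, accessToken: string, outPath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    https
      .get(url, { headers: { Authorization: `Bearer ${accessToken}` } }, (res) => {
        // pipeline() moves the response chunk by chunk; memory use stays
        // flat no matter how many records the org returns.
        pipeline(res, createWriteStream(outPath)).then(resolve, reject);
      })
      .on("error", reject);
  });
}

// Usage (illustrative): streamToFile(resultUrl, token, "accounts.csv");
```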

To Reproduce Attempt to migrate an object with 1 million or more records between two full copy sandboxes, using an Upsert operation that includes a majority of the object's fields and data.

Expected behavior We expect SFDMU to successfully query these records via the Bulk API.

SanatizedExportJSON.txt SanatizedLog.txt

I have attached a log file and an export.json file, but both have been heavily sanitized. Please reach out so we can discuss these problems.

Thanks, Kevin H

jxbambrick commented 6 months ago

I'm getting the same error. As a test, I have configured just the Account object. There are about 1.5 million records in my instance.

Node: 20.11.1
SF version: @salesforce/cli/2.30.8 win32-x64 node-v20.11.1
SF DMU: 4.3.1

$env:NODE_OPTIONS="--max-old-space-size=24576"
echo $env:NODE_OPTIONS

--max-old-space-size=24576
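As a side check (this snippet is illustrative, not part of SFDMU or the thread's tooling), Node's built-in v8 module reports the effective heap ceiling, which should reflect the --max-old-space-size value set above:

```typescript
// Illustrative check that NODE_OPTIONS took effect: heap_size_limit is
// reported in bytes and should be close to the --max-old-space-size value
// (~24576 MB in this case).
import { getHeapStatistics } from "node:v8";

const limitMb = getHeapStatistics().heap_size_limit / (1024 * 1024);
console.log(`V8 heap limit: ${limitMb.toFixed(0)} MB`);
```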

The command ran for 16 minutes and then the error occurred.

sf sfdmu run --sourceusername placeholder --targetusername csvfile --path "C:\Users\josh.bambrick\Documents\SFDMU-GUI-APP-DATA\workspaces\client-Migration\client-Migration-v1" --filelog 1 --loglevel TRACE --apiversion 60.0

2024-03-15__21_41_20.log

Here is the error message:

Error: Cannot create a string longer than 0x1fffffe8 characters
    at Buffer.toString (node:buffer:846:17)
    at Request.<anonymous> (C:\Users\josh.bambrick\AppData\Local\sf\node_modules\request\request.js:1128:39)
    at Request.emit (node:events:530:35)
    at IncomingMessage.<anonymous> (C:\Users\josh.bambrick\AppData\Local\sf\node_modules\request\request.js:1076:12)
    at Object.onceWrapper (node:events:632:28)
    at IncomingMessage.emit (node:events:530:35)
    at endReadableNT (node:internal/streams/readable:1696:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  code: 'ERR_STRING_TOO_LONG'
}
Error: read ECONNRESET
    at TLSWrap.onStreamRead (node:internal/stream_base_commons:217:20) {
  errno: -4077,
  code: 'ECONNRESET',
  syscall: 'read'
}

Command in progress... done

hknokh2 commented 6 months ago

Hello, Thank you for reaching out to me. I will take a look at your case and let you know if there are any updates. Cheers.

github-actions[bot] commented 6 months ago

This case has been marked as 'to-be-closed', since it has had no activity for 3 days.
It will be automatically closed after another 3 days of inactivity.

jxbambrick commented 6 months ago

Following up so this isn't closed out. Any updates on this issue?

hknokh commented 6 months ago

Hello

  1. First, @khouldsworth-cgi, I would like to point out a couple of misconfigurations in your export.json (a corrected sketch follows this list):

    • parallelBulkJobs and parallelRestJobs are global (script-level) properties, not object-level properties, so you need to move them to the JSON root.
    • Note that SELECT Id FROM LIMIT 999999 is not a valid query (the object name after FROM is missing). Even if this object is set as excluded, you still need to correct the query string to ensure the script parses correctly.
    • I would suggest not setting bulkApiV1BatchSize higher than 9500 (the SFDMU default).
  2. Regarding the tool's limitations: unfortunately, I have no way to test SFDMU with such a large number of records. Since SFDMU was designed primarily to help developers populate their sandboxes with small-to-medium datasets, it relies on a purely in-memory data processing model and does not use advanced techniques like JSON streaming. Additionally, the tool uses the jsforce library, which has known limitations when working with the Salesforce API. Please consider other tools for large datasets, such as the standard SF Data Loader, which can export/import millions of records.

  3. Regarding the specific error message you received, it appears to be related to the jsforce library, which fails when requesting large record payloads from the Salesforce org. Try reducing the number of fields in the query, which decreases the amount of data requested per record (see the sketch below).
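To make items 1 and 3 concrete, here is a minimal export.json sketch. The object name, field list, externalId, and the parallel-job values are illustrative placeholders (JSON cannot carry comments, so note that none of these values come from the sanitized attachment): the parallel-jobs properties sit at the script root, the query names its object after FROM, and the SELECT list is trimmed to only the fields that must move:

```json
{
  "parallelBulkJobs": 1,
  "parallelRestJobs": 1,
  "objects": [
    {
      "query": "SELECT Id, Name, AccountNumber FROM Account",
      "operation": "Upsert",
      "externalId": "Name"
    }
  ]
}
```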

In summary, this behavior is expected given the tool's known limitations in processing large datasets. Unfortunately, I cannot resolve it.

Best regards.

github-actions[bot] commented 5 months ago

This case has been marked as 'to-be-closed', since it has had no activity for 3 days.
It will be automatically closed after another 3 days of inactivity.

github-actions[bot] commented 5 months ago

This case has been closed, since it has had no activity for the last 6 days. Feel free to reopen it if you need more help.