forcedotcom / dataloader

Salesforce Data Loader
BSD 3-Clause "New" or "Revised" License

Dataloader freezes at random moments on Mac #507

Closed StasD closed 2 years ago

StasD commented 2 years ago

Hi, I have an issue with the Dataloader constantly freezing on Mac, after which I have to use "Force Quit" to close it. This can happen at any stage: after I press a button, make a selection, or between processing batches. I recently switched to "Bulk API Mode", which seems to have made freezes less likely during actual processing, but a freeze can still happen at the end, during the "verification" stage.

The macOS version is 12.4 (Monterey). The Dataloader version is 55.0.1. (I also used 54.0.0 for some time before that, and it froze as well.) As for Java, I have the latest versions of JDK 8, 11, and 17 from Oracle installed on my machine, and as I understand it the Dataloader was using version 17 (the "java --version" command in Terminal reported version 17). Today I experimented with various Java versions: I disabled 17 so that 11 was used, then installed version 11 from Zulu (and disabled 11 from Oracle). Nothing helped; the freezes still happen.

Can you please tell me at least how I can debug this to find the underlying issue? Is there an error log somewhere?

ashitsalesforce commented 2 years ago

Hi @StasD,

Based on what you are describing and your experiments, the JRE/JDK vendor or version is unlikely to be causing this issue.

What is the CPU architecture of your Mac? Intel x86 or ARM M1/M2?

Look at your system memory to make sure it is not at or near capacity. For example, here is a system whose memory use is near-capacity (14.08GB out of 16GB used). When data loader executes a large job on such a system, the OS starts paging, thereby slowing down responsiveness of data loader to the point where it appears "frozen".

Screen Shot 2022-07-08 at 9 00 46 AM

What operation are you performing? If it is not an Export or Export All operation, what is the size of your CSV file? If it is an Export operation, your query result may be large enough to cause paging, given near-capacity memory use across all running apps plus the additional heap space data loader needs to process a large extraction result.

If these checks do not help, you can use jconsole to look at data loader's heap space use and check whether it is going through full garbage collections (GC). A full GC typically happens either because a Java process's maximum allocated heap space is insufficient or because the process has a memory leak, and a tool such as jconsole can help detect it. If you are running data loader in interactive (GUI) mode, run the command ps -ef | grep swt.nativelib.inpath in a terminal and note the process id (pid). Next, run jconsole <pid of data loader> to launch JConsole. Here is a screenshot of JConsole attached to a running data loader process:

Screen Shot 2022-07-08 at 9 26 10 AM

Here is a screenshot of JConsole with the "Memory" tab selected:

Screen Shot 2022-07-08 at 9 26 24 AM

You can see from the fluctuations in memory use that garbage collections (GC) are occurring. However, full GCs are not occurring: a full GC usually shows up as memory use climbing to the max heap (4.3 GB in the example above) and then dropping.
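The same check can be approximated from the terminal. The following is a minimal sketch, not an official Data Loader procedure. It assumes the JDK's jstat tool is on the PATH and that the Data Loader command line contains swt.nativelib.inpath, as described above. The function names and the --run switch are hypothetical. jstat -gcutil prints heap occupancy percentages per sample, and its FGC column counts full GC events.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (second column) of every process whose command line contains the marker.
# Lines containing "grep" are skipped so the search itself is not reported.
dataloader_pids() {
  awk '/swt\.nativelib\.inpath/ && $0 !~ /grep/ { print $2 }'
}

# Hypothetical helper: read `jstat -gcutil` output on stdin and print the
# FGC (full GC count) value from the last sample. The column is located by
# its header name, so extra columns in newer JDKs do not matter.
last_full_gc_count() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "FGC") col = i; next }
       col     { last = $col }
       END     { if (last != "") print last }'
}

# Only touch a real process when explicitly asked to and jstat is available.
if [ "${1:-}" = "--run" ] && command -v jstat >/dev/null 2>&1; then
  pid=$(ps -ef | dataloader_pids | head -n 1)
  if [ -n "$pid" ]; then
    # Sample GC statistics once per second, ten times.
    stats=$(jstat -gcutil "$pid" 1000 10)
    printf '%s\n' "$stats"
    echo "Full GC count in last sample: $(printf '%s\n' "$stats" | last_full_gc_count)"
  else
    echo "No Data Loader process found" >&2
  fi
fi
```

Saved under a name of your choice and started with the --run argument while a job is in progress, a full GC count that keeps climbing between samples would point toward heap pressure. A count that stays flat suggests looking elsewhere.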

StasD commented 2 years ago

Hi @ashitsalesforce,

Thanks for the detailed answer.

I compiled version 55.0.2 several days ago using the guide on the front page and it seemed to work fine for a while, but today it froze again.

So I used your instructions and ran jconsole on it (after it froze).

I suspect (and previously suspected) that the problem is not with memory, but that the program encounters some network error (such as timeout), and then cannot recover from it for whatever reason. (Instead, it should probably show an error in a message box or something like that).

Jconsole seems to confirm this:

This is a screenshot

Also, for another thread there is a text like this:

Name: RMI TCP Connection(4)-127.0.0.1
State: TIMED_WAITING on com.sun.jmx.remote.internal.ArrayNotificationBuffer@505d8692
Total blocked: 3 Total waited: 10

Stack trace:
java.base@17.0.3.1/java.lang.Object.wait(Native Method)
java.management@17.0.3.1/com.sun.jmx.remote.internal.ArrayNotificationBuffer.fetchNotifications(ArrayNotificationBuffer.java:449)
java.management@17.0.3.1/com.sun.jmx.remote.internal.ArrayNotificationBuffer$ShareBuffer.fetchNotifications(ArrayNotificationBuffer.java:227)
java.management@17.0.3.1/com.sun.jmx.remote.internal.ServerNotifForwarder.fetchNotifs(ServerNotifForwarder.java:275)
java.management.rmi@17.0.3.1/javax.management.remote.rmi.RMIConnectionImpl$4.run(RMIConnectionImpl.java:1271)
java.management.rmi@17.0.3.1/javax.management.remote.rmi.RMIConnectionImpl$4.run(RMIConnectionImpl.java:1269)
java.management.rmi@17.0.3.1/javax.management.remote.rmi.RMIConnectionImpl.fetchNotifications(RMIConnectionImpl.java:1275)
java.base@17.0.3.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
java.base@17.0.3.1/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
java.base@17.0.3.1/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.base@17.0.3.1/java.lang.reflect.Method.invoke(Method.java:568)
java.rmi@17.0.3.1/sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:360)
java.rmi@17.0.3.1/sun.rmi.transport.Transport$1.run(Transport.java:200)
java.rmi@17.0.3.1/sun.rmi.transport.Transport$1.run(Transport.java:197)
java.base@17.0.3.1/java.security.AccessController.executePrivileged(AccessController.java:807)
java.base@17.0.3.1/java.security.AccessController.doPrivileged(AccessController.java:712)
java.rmi@17.0.3.1/sun.rmi.transport.Transport.serviceCall(Transport.java:196)
java.rmi@17.0.3.1/sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:587)
java.rmi@17.0.3.1/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:828)
java.rmi@17.0.3.1/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:705)
java.rmi@17.0.3.1/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$262/0x0000000800f76100.run(Unknown Source)
java.base@17.0.3.1/java.security.AccessController.executePrivileged(AccessController.java:776)
java.base@17.0.3.1/java.security.AccessController.doPrivileged(AccessController.java:399)
java.rmi@17.0.3.1/sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:704)
java.base@17.0.3.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base@17.0.3.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base@17.0.3.1/java.lang.Thread.run(Thread.java:833)

There seem to be some thread deadlocks there.

I don't know how to extract all the information from Jconsole so that I can send it to you (in case it helps in debugging the issue).

My machine is a 15" MacBook Pro (2016) with the latest macOS installed. It has 16 GB of memory.
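One way to get the thread information out as text is the JDK's jstack tool, which writes every thread's stack to standard output and adds a "Java-level deadlock" section when it detects one. The following is a sketch under assumptions, not a procedure confirmed in this thread. It assumes jstack is on the PATH and that the pid was noted from the ps output. The function names, the --run switch, and the output file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a jstack dump read from stdin contains
# the "Java-level deadlock" section that jstack prints when it finds one.
dump_reports_deadlock() {
  grep -q 'Java-level deadlock'
}

# Hypothetical helper: count threads in a given state (for example BLOCKED)
# in a jstack dump read from stdin.
count_threads_in_state() {
  grep -c "java.lang.Thread.State: $1"
}

# Only attach to a real process when explicitly asked to: --run <pid>
if [ "${1:-}" = "--run" ] && command -v jstack >/dev/null 2>&1; then
  pid=${2:?usage: --run <pid>}
  out="dataloader-threads.txt"   # assumed output file name
  # -l adds lock information to the dump.
  jstack -l "$pid" > "$out"
  if dump_reports_deadlock < "$out"; then
    echo "Deadlock reported, see $out"
  else
    echo "No Java-level deadlock reported; BLOCKED threads: $(count_threads_in_state BLOCKED < "$out")"
  fi
fi
```

The resulting text file can be attached to the issue. Note that a thread in the TIMED_WAITING state is waiting with a timeout, which by itself does not indicate a deadlock.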

StasD commented 2 years ago

Wow... While I was playing with Jconsole and writing the previous comment, the program actually finished...

screenshot

It took 9 minutes to insert a few records.

ashitsalesforce commented 2 years ago

Hi @StasD,

We can eliminate lack of memory or a deadlock as a likely root-cause for the issue. Assuming that you have a good network speed, let's go through a few questions such as:

Thanks.

ashitsalesforce commented 2 years ago

Closing the issue, assuming it is resolved.