cdarlint / winutils

winutils.exe hadoop.dll and hdfs.dll binaries for hadoop windows

Hadoop/Spark slower on Windows VS Linux #42

Open khaledmahmoudintel opened 4 months ago

khaledmahmoudintel commented 4 months ago

Hi,

I have two machines with identical hardware: the CPU, RAM, and BIOS configurations are exactly the same. I am running Spark 3.3.1 with Hadoop 3.3.1, and the benchmark is also exactly the same. I am not using HDFS at all.

Problem: Spark on Windows runs slower than on Linux.

Any idea why the Windows implementation is slower? What exactly is inside hadoop.dll and winutils.exe?

luanchao19990 commented 4 months ago

Thank you for your email. We’ll handle it and reply as soon as possible. Best wishes.

pengfei99 commented 1 month ago

I have the same experience. On my Linux server, the hdfs dfs -ls / command returns the result immediately. On the Windows server, it takes about 1 minute. The Windows server has better hardware than the Linux server.
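
To put numbers on a comparison like this, one option is to time the same command several times on each machine. The following is only a rough sketch, not something taken from this thread: the helper name `time_command` and the default of three runs are hypothetical, and it assumes the command is on the PATH and exits with status 0.

```python
import subprocess
import time


def time_command(args, runs=3):
    """Return the wall-clock duration, in seconds, of each run of a command.

    `args` is the command as a list, for example ["hdfs", "dfs", "-ls", "/"].
    Assumption: on Windows the launcher name may need adjusting (for example
    "hdfs.cmd"), because no shell is used to resolve it.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero.
        # Output is discarded so that console printing does not skew the timing.
        subprocess.run(
            args,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations
```

Calling `time_command(["hdfs", "dfs", "-ls", "/"])` on both servers and comparing the lists would show whether the gap is consistent across runs or mostly a first-run cost. Each run includes JVM startup, so this measures the whole command rather than the file system call alone.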