Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

Short-circuit read can't be turned on #17573

Open morpheusyu opened 1 year ago

morpheusyu commented 1 year ago

My Presto and Alluxio are each deployed in their own Docker container. On the host machine I created the path /mnt/ramdisk and ran 'sudo chmod a+w /mnt/ramdisk'. I mounted this path into the Alluxio container as /opt/ramdisk, and did the same for the Presto container. Alluxio starts fine, and when Presto queries a Hive table I get the query result. I tried to turn on short-circuit read, but it does not seem to work: the web UI metric BytesReadLocalThroughput is always 0 no matter how I query. I don't know what I am doing wrong; could you give me some advice? Thanks. My Alluxio worker's alluxio-site.properties sets user.hostname and worker.hostname, and they are the same. Below is my Alluxio master's alluxio-site.properties (screenshot attached).

morpheusyu commented 1 year ago

After the query finishes, I can see some files appear in /mnt/ramdisk on the host machine, and the same files also appear under /opt/ramdisk in both containers. From this I infer that the queried data is cached in memory, but Presto still reads from the UFS and does not use short-circuit read.

jiacheliu3 commented 1 year ago

There are two likely reasons why short circuit is not working:

  1. Your client process (in your case, the Alluxio client running inside the Presto process in your Presto container) cannot see or access the worker's cache files, so the client simply cannot short-circuit read from or write to the worker.
  2. Your client does not find a local (co-located) worker process. When you use short circuit, the client must find a worker with the same hostname, so that it knows the worker is local and can short circuit.
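
The two conditions above can be sanity-checked from inside the client (Presto) container. The following is only a minimal sketch, not part of Alluxio: the helper names are hypothetical, `/opt/ramdisk` is the mount path mentioned in this thread, and `alluxio-worker` is a placeholder for whatever hostname the worker actually reports. Note that Docker containers get distinct hostnames by default, so the second check may well fail unless the hostnames were set explicitly.

```python
import os
import socket


def can_read_cache_dir(path):
    """Return True if this process can list and read files under `path`."""
    return os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)


def hostnames_match(client_hostname, worker_hostname):
    """Compare hostnames case-insensitively, ignoring surrounding whitespace."""
    return client_hostname.strip().lower() == worker_hostname.strip().lower()


def diagnose(cache_dir, client_hostname, worker_hostname):
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    if not can_read_cache_dir(cache_dir):
        problems.append(f"client cannot read worker cache dir: {cache_dir}")
    if not hostnames_match(client_hostname, worker_hostname):
        problems.append(
            f"hostname mismatch: client={client_hostname!r} worker={worker_hostname!r}"
        )
    return problems


if __name__ == "__main__":
    # "/opt/ramdisk" comes from this thread; "alluxio-worker" is a hypothetical
    # placeholder for the worker's configured hostname.
    for problem in diagnose("/opt/ramdisk", socket.gethostname(), "alluxio-worker"):
        print(problem)
```

Run it inside the Presto container; any line it prints points at one of the two reasons listed above.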

I'm 99% confident the root cause is in the two reasons above.

This talk may help you understand how short circuit and domain socket work in K8s. Although it talks about Spark, the theory is the same for Presto.
https://www.alluxio.io/resources/videos/community-office-hour-improving-data-locality-for-spark-jobs-on-kubernetes-using-alluxio/
https://www.slidestalk.com/Alluxio/Spark_Alluxio_K8s

But note that there's a difference between short circuit (alluxio.user.short.circuit.preferred=true) and domain socket (alluxio.user.short.circuit.preferred=false). Short circuit means the client directly reads the worker's cache files. Domain socket means the client reads through a domain socket shared with the worker, so it does not need to see or access the worker's cache files.
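
To make the distinction concrete, here is a small sketch that reads properties-style text and reports which local read path the setting implies, following only the description in this comment. The function names are hypothetical, treating a missing property as `false` is an assumption, and the real client takes further conditions into account (for example, whether a local worker was found at all).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def local_read_mode(props):
    """Map the 'preferred' flag to the read path described in the comment."""
    # Assumption: an unset property behaves like "false".
    raw = props.get("alluxio.user.short.circuit.preferred", "false")
    return "short circuit" if raw.lower() == "true" else "domain socket"


if __name__ == "__main__":
    sample = "alluxio.user.short.circuit.preferred=true"
    print(local_read_mode(parse_properties(sample)))
```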

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.