Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

After Spark reads/writes Alluxio, directory and file permissions all become 777 and they are pinned #18610

Open haoranchuixue opened 4 months ago

haoranchuixue commented 4 months ago

Alluxio Version: 2.8.1

Describe the bug
Environment: Alluxio 2.8.1, Spark 3.5.1, Iceberg 1.4.3. Alluxio mounts OSS, and Spark is integrated with Iceberg. Spark reads and writes the virtual lake data (Alluxio directories) through Alluxio.

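Problem description: After a default installation, newly created directories are 775 and files are 644. As soon as a Spark job starts and reads or writes Alluxio, both directories and files immediately become 777 and show Pinned = YES. (Screenshot: alluxio-pin01)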

Alluxio configuration file alluxio-site.properties
alluxio_master_hostname=bigdata-102.whale.com
alluxio.master.mount.table.root.ufs=hdfs://beluga/data/alluxio
alluxio.worker.tieredstore.levels=3
alluxio.user.block.write.location.policy.class=alluxio.client.block.policy.MostAvailableFirstPolicy
alluxio.master.embedded.journal.addresses=bigdata-101.whale.com:19200,bigdata-102.whale.com:19200,bigdata-103.whale.com:19200
alluxio.tmp.dirs=/data/alluxio/tmp
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards=2
alluxio.user.metadata.cache.enabled=false
alluxio.user.file.create.ttl.action=FREE
alluxio.user.file.replication.max=1
alluxio.worker.tieredstore.levels.content=
alluxio.worker.tieredstore.level0.alias=MEM
alluxio.worker.tieredstore.level0.dirs.path=/mnt/ramdisk
alluxio.worker.tieredstore.level0.dirs.quota=20G
alluxio.worker.tieredstore.level0.watermark.high.ratio=0.9
alluxio.worker.tieredstore.level0.watermark.low.ratio=0.7
alluxio.worker.tieredstore.level1.alias=SSD
alluxio.worker.tieredstore.level1.dirs.path=/data/alluxio
alluxio.worker.tieredstore.level1.dirs.quota=100GB
alluxio.worker.tieredstore.level1.watermark.high.ratio=0.9
alluxio.worker.tieredstore.level1.watermark.low.ratio=0.7
alluxio.worker.tieredstore.level2.alias=HDD
alluxio.worker.tieredstore.level2.dirs.path=/data01/alluxio,/data02/alluxio,/data03/alluxio,/data04/alluxio,/data05/alluxio,/data06/alluxio,/data07/alluxio,/data08/alluxio
alluxio.worker.tieredstore.level2.dirs.quota=1TB,1TB,1TB,1TB,1TB,1TB,1TB,1TB
alluxio.worker.tieredstore.level2.watermark.high.ratio=0.9
alluxio.worker.tieredstore.level2.watermark.low.ratio=0.7
alluxio.worker.allocator.class=alluxio.worker.block.allocator.MaxFreeAllocator
alluxio.master.security.content=
alluxio.master.security.impersonation.hdfs.users=
alluxio.master.security.impersonation.hdfs.groups=
alluxio.master.security.impersonation.yarn.users=
alluxio.master.security.impersonation.yarn.groups=
alluxio.master.security.impersonation.hive.users=
alluxio.master.security.impersonation.hive.groups=
alluxio.master.security.impersonation.kyuubi.users=
alluxio.master.security.impersonation.kyuubi.groups=
alluxio.job.worker.threadpool.size=60
alluxio.master.web.port=19999
alluxio.underfs.hdfs.configuration=/etc/hadoop/conf/core-site.xml:/etc/hadoop/conf/hdfs-site.xml
alluxio.security.authentication.type=SIMPLE
alluxio.user.ufs.block.read.location.policy=alluxio.client.block.policy.DeterministicHashPolicy
alluxio.user.file.readtype.default=CACHE
alluxio.user.file.writetype.default=THROUGH
alluxio.user.file.passive.cache.enabled=false
alluxio.user.file.metadata.sync.interval=20000
alluxio.user.file.create.ttl=86400
alluxio.user.file.replication.min=1

Spark configuration file spark-defaults.conf
spark.master yarn
spark.driver.maxResultSize 4g
spark.driver.memory 4g
spark.driver.extraClassPath /opt/whale/spark-3.5.1-bin-hadoop3/jars/iceberg-spark-runtime-3.5_2.12-1.4.3.jar,/opt/whale/spark-3.5.1-bin-hadoop3/jars/alluxio-2.8.1-client.jar,/opt/whale/spark-3.5.1-bin-hadoop3/jars/msw-spark-listener-1.0-SNAPSHOT-jar-with-dependencies.jar
spark.executor.extraClassPath /opt/whale/spark-3.5.1-bin-hadoop3/jars/iceberg-spark-runtime-3.5_2.12-1.4.3.jar,/opt/whale/spark-3.5.1-bin-hadoop3/jars/alluxio-2.8.1-client.jar,/opt/whale/spark-3.5.1-bin-hadoop3/jars/msw-spark-listener-1.0-SNAPSHOT-jar-with-dependencies.jar
spark.driver.extraJavaOptions -Dalluxio.user.file.writetype.default=ASYNC_THROUGH
spark.executor.extraJavaOptions -Dalluxio.user.file.writetype.default=ASYNC_THROUGH
spark.yarn.jars hdfs://beluga/user/spark3.5/jars/.jar
spark.sql.hive.convertMetastoreOrc true
spark.sql.hive.metastore.jars /usr/bigtop/current/hive-client/lib/
spark.sql.hive.metastore.version 3.1.3

spark.sql.extensions org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.apache.paimon.spark.extensions.PaimonSparkSessionExtensions,org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension

spark.sql.catalog.spark_catalog org.apache.iceberg.spark.SparkSessionCatalog
spark.sql.catalog.spark_catalog.type hive
spark.sql.catalog.landing org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.landing.type hadoop
spark.sql.catalog.landing.warehouse alluxio://ebj@beluga/oss/landing
spark.sql.catalog.assembly org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.assembly.type hadoop
spark.sql.catalog.assembly.warehouse alluxio://ebj@beluga/oss/assembly
spark.sql.catalog.trusted org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.trusted.type hadoop
spark.sql.catalog.trusted.warehouse alluxio://ebj@beluga/oss/trusted
spark.sql.catalog.exchange org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.exchange.type hadoop
spark.sql.catalog.exchange.warehouse alluxio://ebj@beluga/oss/exchange
spark.sql.catalog.paimon org.apache.paimon.spark.SparkCatalog
spark.sql.catalog.paimon.warehouse hdfs://beluga/data/lakehouse
spark.sql.catalog.landing_paimon org.apache.paimon.spark.SparkCatalog
spark.sql.catalog.landing_paimon.warehouse alluxio://ebj@beluga/oss/landing

spark.dynamicAllocation.enabled true

# false if prefer shuffle tracking than ESS

spark.shuffle.service.enabled true
spark.dynamicAllocation.initialExecutors 2
spark.dynamicAllocation.minExecutors 2
spark.dynamicAllocation.maxExecutors 20
spark.dynamicAllocation.executorAllocationRatio 0.5
spark.dynamicAllocation.executorIdleTimeout 60s
spark.dynamicAllocation.cachedExecutorIdleTimeout 30min
spark.dynamicAllocation.shuffleTracking.enabled false
spark.dynamicAllocation.shuffleTracking.timeout 30min
spark.dynamicAllocation.schedulerBacklogTimeout 1s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 1s
spark.cleaner.periodicGC.interval 5min

spark.sql.adaptive.enabled true
spark.sql.adaptive.forceApply false
spark.sql.adaptive.logLevel info
spark.sql.adaptive.advisoryPartitionSizeInBytes 256m
spark.sql.adaptive.coalescePartitions.enabled true
spark.sql.adaptive.coalescePartitions.minPartitionSize 1MB
spark.sql.adaptive.coalescePartitions.initialPartitionNum 8192
spark.sql.adaptive.fetchShuffleBlocksInBatch true
spark.sql.adaptive.localShuffleReader.enabled true
spark.sql.adaptive.skewJoin.enabled true
spark.sql.adaptive.skewJoin.skewedPartitionFactor 5
spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes 256m
spark.sql.adaptive.nonEmptyPartitionRatioForBroadcastJoin 0.2
spark.sql.autoBroadcastJoinThreshold -1

spark.history.provider org.apache.spark.deploy.history.FsHistoryProvider
spark.history.ui.port 18080
spark.history.fs.logDirectory hdfs://beluga/spark-history3.5
Spark.history.fs.update.interval 10s
spark.history.retainedApplications 200
spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.interval 7d
spark.history.fs.cleaner.maxAge 30d
spark.eventLog.rolling.enabled true
spark.eventLog.rolling.maxFileSize 256m
spark.eventLog.enabled true
spark.eventLog.dir hdfs://beluga/spark-history3.5
spark.yarn.historyServer.address bigdata-102.whale.com:18080

spark.sql.queryExecutionListeners org.msw.listener.SparkSqlLineageListener
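
Note on the two configuration files: alluxio-site.properties sets alluxio.user.file.writetype.default=THROUGH, while spark-defaults.conf passes -Dalluxio.user.file.writetype.default=ASYNC_THROUGH to both the driver and the executors. The Python sketch below is one way to list such differences mechanically. The helper names (parse_properties, java_d_options) are made up for illustration, and only a two-line excerpt of the posted properties is embedded. Whether this difference has anything to do with the reported 777/pinned behavior is not established in this thread.

```python
import shlex


def parse_properties(text: str) -> dict:
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def java_d_options(opts: str) -> dict:
    """Extract -Dkey=value system properties from a JVM options string."""
    found = {}
    for token in shlex.split(opts):
        if token.startswith("-D") and "=" in token:
            key, _, value = token[2:].partition("=")
            found[key] = value
    return found


# Excerpt of the alluxio-site.properties values posted in this issue.
site = parse_properties("""
alluxio.user.file.readtype.default=CACHE
alluxio.user.file.writetype.default=THROUGH
""")

# Value of spark.driver.extraJavaOptions / spark.executor.extraJavaOptions.
spark_opts = java_d_options("-Dalluxio.user.file.writetype.default=ASYNC_THROUGH")

# Keys that Spark sets to a different value than the site properties.
overrides = {
    key: (site[key], value)
    for key, value in spark_opts.items()
    if key in site and site[key] != value
}

for key, (site_value, spark_value) in overrides.items():
    print(f"{key}: site={site_value} spark={spark_value}")
```

With the excerpt above, the only key reported is alluxio.user.file.writetype.default.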

To Reproduce

Expected behavior: Permissions stay normal: directories 775, files 644.


YichuanSun commented 4 months ago

alluxio.worker.data.folder.permissions="rw-r-xr--"

Try adding this to your alluxio-site.properties. Do you then see all the modes as "654" instead of "777"?
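
For reference, the symbolic string "rw-r-xr--" corresponds to octal 654 because each group of three characters is summed with r=4, w=2, x=1. A small Python sketch of that conversion follows; the function name symbolic_to_octal is made up for illustration and is not part of Alluxio.

```python
def symbolic_to_octal(perms: str) -> str:
    """Convert a 9-character rwx string such as 'rw-r-xr--' to octal digits."""
    if len(perms) != 9:
        raise ValueError("expected 9 characters, e.g. 'rwxr-xr-x'")
    weights = {"r": 4, "w": 2, "x": 1}
    digits = []
    # Process owner, group, and other triplets in turn.
    for start in range(0, 9, 3):
        triplet = perms[start:start + 3]
        value = 0
        for pos, expected in enumerate("rwx"):
            ch = triplet[pos]
            if ch == expected:
                value += weights[expected]
            elif ch != "-":
                raise ValueError(f"unexpected character {ch!r}")
        digits.append(str(value))
    return "".join(digits)


print(symbolic_to_octal("rw-r-xr--"))  # 654
print(symbolic_to_octal("rwxrwxrwx"))  # 777
```

The same helper gives 775 for "rwxrwxr-x" and 644 for "rw-r--r--", the values the reporter expects for directories and files.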

haoranchuixue commented 4 months ago

Thanks!! After setting alluxio.worker.data.folder.permissions="rw-r-xr--", permissions still become 777 as soon as a Spark job runs.

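However, I noticed one thing (by default the directories are all owned by alluxio, with permission 777 and Pinned = YES): after setting alluxio.worker.data.folder.permissions="rw-r-xr--" and using chown and chmod to change the owner and permissions of the old directories, Pinned becomes NO. Newly created directories and files (tables created in the Iceberg catalog through Spark SQL) are still 777.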

YichuanSun commented 4 months ago

Based on your test results, I don't think Alluxio is causing the issue. It may be due to Spark or Iceberg. I'm not sure, but I will ask other engineers and get back to you soon.