Open coff33Overflow opened 1 year ago
@coff33Overflow What are the dependencies of your Java API?
@LuQQiu
I don't follow. Could you please check the pom.xml of the project I shared in the description? It lists all the dependencies required for this project.
@LuQQiu What more details do you need?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
Alluxio Version: 2.8.1
Describe the bug
The DistributedLoadCommand Java API gives a "bad operand type" error due to incompatible gRPC channels.
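One way to narrow down a gRPC channel incompatibility is to check which JAR each relevant class is actually loaded from. The sketch below is a generic classpath diagnostic using only the JDK. It assumes, without confirmation from this thread, that the error comes from two different copies of gRPC classes on the classpath. The class `ClassOrigin` and its method `locate` are hypothetical names, and the class names in `main` are only examples of what might be present.

```java
import java.security.CodeSource;

// Diagnostic sketch (hypothetical helper): report where a class was loaded from.
final class ClassOrigin {

    // Returns the code-source location of the named class, or a short
    // explanation when the class is missing or has no recorded location.
    static String locate(String className) {
        try {
            // Load without initializing so no static initializers run.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "(no code source; likely a JDK bootstrap class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumed class names; adjust to whatever the project actually uses.
        String[] names = {
            "io.grpc.ManagedChannel",
            "alluxio.client.file.FileSystem"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Running this with the same classpath as the failing program shows whether the gRPC classes and the Alluxio client classes come from the JARs the pom.xml is expected to pull in.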
To Reproduce
Integrate HDFS with Alluxio in a local cluster. I am sharing the alluxio-site.properties file and the Java code, which tries to load the mounted UFS (HDFS) data into Alluxio memory using DistributedLoadCommand and throws an error due to a gRPC channel compatibility issue.
Attachments: FSOperations.zip, alluxio-site.properties
Expected behavior
The mounted HDFS data should be loaded into Alluxio memory via this Java API, since we are using the global configuration that is picked up from alluxio-site.properties.
The CLI command distributedLoad works fine, whereas the Java API does not. Sharing the screenshot for your reference.
Urgency
This bug is a major blocker for loading large amounts of data from any external UFS into Alluxio. It can be done through the CLI, but we are building something that requires connecting to a remote Alluxio cluster. Instead of using SSH to log in to the remote Alluxio master node and running the CLI command, we found the Alluxio Java API to be a friendlier way to perform operations on remote Alluxio file systems.
Are you planning to fix it
Please indicate if you are already working on a PR.
Additional context
We also tried integrating Alluxio with Spark, which is able to load the data from the external UFS into Alluxio, but Spark has the limitation of not keeping file names the same when dealing with Parquet files.
For more info read this slack thread: https://alluxio-community.slack.com/archives/C03RDNW962C/p1675157956837039