@alberttwong did you get it to work with onetable?
Still the same issue, @alberttwong. This is what I'm running:
export AWS_ACCESS_KEY_ID=admin
export AWS_SECRET_ACCESS_KEY=password
export S3_ENDPOINT=http://localhost:9000
export AWS_ENDPOINT_URL_S3=http://localhost:9000
export AWS_ENDPOINT=http://localhost:9000
java \
-jar /Users/soumilshah/IdeaProjects/SparkProject/MyGIt/StarRocks-Hudi-Minio/jar/utilities-0.1.0-beta1-bundled.jar \
--datasetConfig ./config.yml
@sagarlakshmipathy is there something we're missing? I feel like the Java app doesn't pick up the environment variables.
[root@spark-hudi auxjars]# cat ~/.aws/config
[default]
region = us-west-2
output = json
[services testing-s3]
s3 =
  endpoint_url = http://minio:9000
[root@spark-hudi auxjars]# cat ~/.aws/credentials
[default]
aws_access_key_id = admin
aws_secret_access_key = password
[root@spark-hudi auxjars]# aws s3 ls --endpoint-url http://minio:9000
2024-02-22 21:23:53 huditest
2024-02-22 21:18:19 warehouse
[root@spark-hudi auxjars]# env|grep AWS
AWS_IGNORE_CONFIGURED_ENDPOINT_URLS=true
AWS_REGION=us-east-1
AWS_ENDPOINT_URL_S3=http://minio:9000
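A side note on the config above, based on AWS documentation rather than anything in this thread: a [services ...] section only takes effect for a profile that references it via a services key, and AWS_IGNORE_CONFIGURED_ENDPOINT_URLS=true tells the CLI and SDKs to ignore any configured endpoint_url. A profile wired to the services section would look like:

[default]
region = us-west-2
output = json
services = testing-s3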
Okay, I figured it out. You need to modify utilities/src/main/resources/onetable-hadoop-defaults.xml to include additional configs: https://github.com/onetable-io/onetable/pull/337. We need to clean this up so that onetable scans for conf files.
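For reference, the additional configs are the standard Hadoop S3A properties for a custom endpoint; a minimal sketch, using the demo's MinIO endpoint and credentials rather than the exact contents of that PR:

<property>
  <name>fs.s3a.endpoint</name>
  <value>http://minio:9000</value> <!-- your MinIO endpoint -->
</property>
<property>
  <name>fs.s3a.access.key</name>
  <value>admin</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>password</value>
</property>
<property>
  <name>fs.s3a.path.style.access</name>
  <value>true</value> <!-- MinIO generally needs path-style addressing -->
</property>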
We also need to modify the Trino schema create command.
See https://github.com/StarRocks/demo/issues/54 for all the instructions.
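For illustration, a Trino schema backed by the MinIO bucket is typically created with an explicit location; a sketch, with the catalog and schema names assumed rather than taken from that issue:

CREATE SCHEMA hive.hudi_db
WITH (location = 's3a://huditest/');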
I'm lost, lol. Which variables do we need to set?
I tried:
export AWS_ACCESS_KEY_ID=admin
export AWS_SECRET_ACCESS_KEY=password
export S3_ENDPOINT=http://localhost:9000
export AWS_ENDPOINT_URL_S3=http://localhost:9000
export AWS_ENDPOINT=http://localhost:9000
Are these all the variables I need to set?
That's the issue... none of them worked, so I was just listing all the variants I tried. What worked was only my own demo, where I modified my Hadoop settings. I couldn't get it working with onetable's Docker demo.
Understood. Let me post the ticket in the Hudi channel; maybe someone can help there.
@the-other-tim-brown is this something you can help with here?
@soumilshah1995 @alberttwong I'm on vacation until Thursday; I can help replicate this issue once I'm back. @the-other-tim-brown feel free to chime in when you get a chance.
@sagarlakshmipathy Thanks a lot. Please enjoy your vacation; this is not urgent or blocking. It's mostly for a POC, no hurry at all.
I have not used MinIO before. I will have to spend some time coming up to speed on it.
@soumilshah1995 have you tried updating the hadoop config like Albert suggested? He put up the configs he used here: https://github.com/apache/incubator-xtable/pull/337/files
I haven't opted for using a container for my Hadoop setup. Could you kindly suggest the steps to set it up on a Mac?
Soumil... I think if you slightly change your Java run command to

java -jar utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig onetable.yaml -p ../conf/core-site.xml

it should work. The raw XML can be found at https://github.com/StarRocks/demo/blob/master/documentation-samples/datalakehouse/conf/core-site.xml
I think I will take a different route instead: using Deltastreamer and XTable with MinIO and StarRocks. I need to try that; I will try it and keep you all posted here.
I have decided to go this route instead.
Here is my architecture:
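For what it's worth, a minimal sketch of that route: a Hudi Deltastreamer run pointed at MinIO through S3A. The bundle jar path, table name, source class, and ordering field below are all assumptions for illustration, not a tested command from this thread:

# hadoop-aws must be on the Spark classpath for the s3a:// scheme
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  --conf spark.hadoop.fs.s3a.endpoint=http://minio:9000 \
  --conf spark.hadoop.fs.s3a.access.key=admin \
  --conf spark.hadoop.fs.s3a.secret.key=password \
  --conf spark.hadoop.fs.s3a.path.style.access=true \
  /path/to/hudi-utilities-bundle.jar \
  --table-type COPY_ON_WRITE \
  --target-base-path s3a://huditest/customers \
  --target-table customers \
  --source-class org.apache.hudi.utilities.sources.JsonDFSSource \
  --source-ordering-field ts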
Hello there, I have a MinIO bucket locally named huditest.
I am trying to use onetable with MinIO.
docker-compose file
hudi_job.py
config.yml
Error
How do I configure onetable to work with MinIO buckets? I tried both s3 and s3a; how would I set this up to work with MinIO buckets?
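For illustration, an onetable/XTable dataset config pointed at a MinIO bucket generally looks like the sketch below; the bucket path, table name, and target formats are assumptions, and whether s3:// or s3a:// resolves correctly depends on the Hadoop filesystem configs discussed above:

sourceFormat: HUDI
targetFormats:
  - DELTA
  - ICEBERG
datasets:
  -
    tableBasePath: s3a://huditest/customers   # assumed table path
    tableName: customers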