MichaelMt66 / open-source-lakehouse

load_sources.sh does not exist in spark worker container #9

Open alberttwong opened 5 months ago

alberttwong commented 5 months ago
atwong@Albert-CelerData data % docker exec -ti spark-worker-a /bin/bash
root@804b5ae4316e:/spark-app# bash /opt/spark-apps/load_sources.sh
bash: /opt/spark-apps/load_sources.sh: No such file or directory
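
One way to rule out a simple packaging gap is to copy the script in from the host checkout; a minimal sketch, assuming the repo is cloned locally and the script lives under lakehouse/ (as in the link further down):

# ensure the target directory exists, then copy the script in from the host clone and run it
docker exec spark-worker-a mkdir -p /opt/spark-apps
docker cp lakehouse/load_sources.sh spark-worker-a:/opt/spark-apps/load_sources.sh
docker exec -ti spark-worker-a bash /opt/spark-apps/load_sources.sh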
alberttwong commented 5 months ago
root@9c75b95829cf:/# find . -name *.sh
./etc/rc0.d/K01hwclock.sh
./etc/init.d/hwclock.sh
./etc/rcS.d/S01hwclock.sh
./etc/rc6.d/K01hwclock.sh
./opt/spark/bin/docker-image-tool.sh
./opt/spark/bin/load-spark-env.sh
./opt/spark/sbin/spark-daemon.sh
./opt/spark/sbin/start-worker.sh
./opt/spark/sbin/stop-worker.sh
./opt/spark/sbin/workers.sh
./opt/spark/sbin/stop-thriftserver.sh
./opt/spark/sbin/stop-slave.sh
./opt/spark/sbin/decommission-slave.sh
./opt/spark/sbin/slaves.sh
./opt/spark/sbin/decommission-worker.sh
./opt/spark/sbin/start-all.sh
./opt/spark/sbin/stop-workers.sh
./opt/spark/sbin/start-history-server.sh
./opt/spark/sbin/spark-config.sh
./opt/spark/sbin/start-slave.sh
./opt/spark/sbin/stop-all.sh
./opt/spark/sbin/start-mesos-dispatcher.sh
./opt/spark/sbin/stop-history-server.sh
./opt/spark/sbin/start-slaves.sh
./opt/spark/sbin/start-workers.sh
./opt/spark/sbin/stop-master.sh
./opt/spark/sbin/start-mesos-shuffle-service.sh
./opt/spark/sbin/start-thriftserver.sh
./opt/spark/sbin/start-master.sh
./opt/spark/sbin/stop-mesos-shuffle-service.sh
./opt/spark/sbin/spark-daemons.sh
./opt/spark/sbin/stop-slaves.sh
./opt/spark/sbin/stop-mesos-dispatcher.sh
./opt/spark/examples/src/main/scripts/getGpusResources.sh
./opt/spark/kubernetes/dockerfiles/spark/entrypoint.sh
./opt/spark/kubernetes/dockerfiles/spark/decom.sh
./lib/init/vars.sh
./usr/share/debconf/confmodule.sh
./usr/share/vim/vim81/macros/less.sh
./usr/share/PackageKit/helpers/test_spawn/search-name.sh
./spark-app/start-spark.sh
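
As an aside, an unquoted *.sh glob gets expanded by the shell before find ever runs if the current directory contains matching files (it only behaves as intended above because / has none); quoting the pattern makes the command robust:

# quote the pattern so find, not the shell, expands it
find . -name '*.sh'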
alberttwong commented 5 months ago

I had to create the file (https://github.com/MichaelMt66/open-source-lakehouse/blob/main/lakehouse/load_sources.sh) in the spark worker container and then run it.
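Rather than typing the file in by hand, one option is to pull the committed script straight from the repo inside the container; a sketch assuming curl is present in the image, using the raw URL derived from the blob link above:

# inside the spark-worker-a container: fetch the committed script and run it
curl -fsSL https://raw.githubusercontent.com/MichaelMt66/open-source-lakehouse/main/lakehouse/load_sources.sh -o /opt/spark-apps/load_sources.sh
bash /opt/spark-apps/load_sources.sh

Mounting the repo's lakehouse/ directory into the container at /opt/spark-apps via a compose volume would avoid having to redo this whenever the container is recreated.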

alberttwong commented 5 months ago

Also, it takes almost 1 hour to process and create all the Hudi files.