torvalds-dev / samza

Mirror of Apache Samza
Apache License 2.0
0 stars 0 forks source link

How to install smaza? #13

Open torvalds-dev opened 1 year ago

torvalds-dev commented 1 year ago

torvalds.dev is analyzing the ticket

torvalds-dev commented 1 year ago

Based on the information provided in the files, here's a plan to install Samza:

  1. Get the Code: Clone the hello-samza project from the Apache repository using the command git clone https://gitbox.apache.org/repos/asf/samza-hello-samza.git hello-samza and navigate to the project directory with cd hello-samza.

  2. Start a Grid: Samza requires three systems: YARN, Kafka, and ZooKeeper. You can set up these systems using the "grid" script provided in the hello-samza project. Run the command bin/grid bootstrap. This command will download, install, and start ZooKeeper, Kafka, and YARN. It will also check out the latest version of Samza and build it. All package files will be put in a sub-directory called "deploy" inside hello-samza's root folder.

  3. Build a Samza Job Package: Before you can run a Samza job, you need to build a package for it. This package is what YARN uses to deploy your jobs on the grid. Run the following commands to build the package:

    mvn clean package
    mkdir -p deploy/samza
    tar -xvf ./target/hello-samza-1.9.0-SNAPSHOT-dist.tar.gz -C deploy/samza
  4. Run a Samza Job: After you've built your Samza package, you can start a job on the grid using the run-app.sh script. Run the command deploy/samza/bin/run-app.sh --config-path=$PWD/deploy/samza/config/wikipedia-feed.properties.

Relevant files for this process are docs/startup/hello-samza/versioned/index.md and docs/startup/download/index.md. The first file provides a step-by-step guide on how to install and run Samza, while the second file provides information on how to download Samza and its tools.