USCDataScience / sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
http://irds.usc.edu/sparkler/
Apache License 2.0
412 stars, 143 forks

[Docker] Update run script with relative paths and docker-compose file #223

Closed: nhandyal closed this 3 years ago

nhandyal commented 3 years ago

What changes were proposed in this pull request?

  1. Update sparkler.sh to use relative and absolute paths when resolving the sparkler jar directory
  2. Update the elasticsearch docker-compose.yml to specify the bind volume using a relative path
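
For readers unfamiliar with the symlink problem behind item 1, here is a minimal sketch of one common way a shell script can find its own real directory when it is invoked through a symlink (for example from /usr/bin). It is not the code from this PR: the helper name resolve_script_dir is hypothetical, and it assumes a readlink command is available on the system.

```shell
#!/bin/sh
# Hypothetical helper (not taken from sparkler.sh): print the physical
# directory that contains the file at $1, following any chain of symlinks.
# The function body runs in a subshell, so the caller's working directory
# and variables are left untouched.
resolve_script_dir() (
    path="$1"
    # Follow symlinks one hop at a time until we reach a non-link.
    while [ -L "$path" ]; do
        dir="$(cd -P -- "$(dirname -- "$path")" && pwd)"
        target="$(readlink "$path")"
        case "$target" in
            /*) path="$target" ;;        # absolute link target
            *)  path="$dir/$target" ;;   # relative to the link's own directory
        esac
    done
    cd -P -- "$(dirname -- "$path")" && pwd
)

# Usage: locate this script's real directory, then build other paths
# (such as a jar directory) relative to it.
SCRIPT_DIR="$(resolve_script_dir "$0")"
echo "script directory: $SCRIPT_DIR"
```

With the script directory known, a jar location can be expressed relative to it, so the script keeps working when it is symlinked elsewhere. It does not help when the script and the jar are moved independently of each other.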

Closes #219.

How was this patch tested?

For changes to sparkler.sh

  1. Ran sparkler-core/sparkler-deployment/docker/elasticsearch/dockler.py --up, logged into the sparkler-elastic container, and ran sparkler to verify that it starts when symlinked to /usr/bin. @buggtb, please confirm whether this fix resolves the issue where sparkler.sh and the jar are moved to different locations on disk.

For changes to docker-compose.yml

  1. Ran docker inspect sparkler-elastic and verified that the bind volume source path corresponds to the sparkler-core directory:

    "Mounts": [
        {
            "Type": "bind",
            "Source": "/Users/handyal/source/sparkler/sparkler-core",
            "Destination": "/data/sparkler-core",
            "Mode": "",
            "RW": true,
            "Propagation": "rprivate"
        }
    ],

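
Docker Compose resolves a relative host path in a bind volume against the directory that holds the compose file, which is why the "Source" above is an absolute path on the host. The sketch below approximates that resolution in shell so the expected "Source" can be predicted before starting the container. The helper name resolve_bind_source is hypothetical, and the ../../.. path in the usage comment is an assumption about the compose file's location, not a value taken from this PR.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute host directory that a relative
# bind-volume path refers to, given the path of the compose file.
# Usage: resolve_bind_source COMPOSE_FILE RELATIVE_PATH
# Note: cd -P also resolves symlinks, which Compose itself may not do.
resolve_bind_source() (
    compose_dir="$(dirname -- "$1")"
    cd -P -- "$compose_dir/$2" && pwd
)

# Example call (assumed layout, shown as a comment so that nothing runs
# against directories that may not exist):
#   resolve_bind_source \
#       sparkler-core/sparkler-deployment/docker/elasticsearch/docker-compose.yml \
#       ../../..
```

Comparing the printed directory with the "Source" field reported by docker inspect is a quick check that the relative path points where intended.
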
lewismc commented 3 years ago

@buggtb can you please check. Gracias :)

lewismc commented 3 years ago

@buggtb please check. I'll merge by CoB today unless there are objections.

buggtb commented 3 years ago

Don't worry about this one. It didn't resolve the issue, and we have a fix on the mvn2sbt branch that should sort it all out once that branch is ready for merging.