aehrc / pathling

Tools that make it easier to use FHIR® and clinical terminology within data analytics, built on Apache Spark.
https://pathling.csiro.au
Apache License 2.0

Consider adding the Spark Kubernetes dependency #1672

Closed · chgl closed this issue 11 months ago

chgl commented 1 year ago

I'd like to be able to launch Spark jobs from the Pathling server using its native Kubernetes support: https://spark.apache.org/docs/latest/running-on-kubernetes.html

Setting spark.master to k8s://kubernetes:443 almost works, but currently fails due to a missing dependency: https://mvnrepository.com/artifact/org.apache.spark/spark-kubernetes_2.13/3.4.1.

The not-so-helpful Spark error is Could not parse Master URL: 'k8s://kubernetes:443', but it is actually caused by the missing dependency: https://stackoverflow.com/a/59532027.
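For reference, this is roughly the configuration I'm trying to get to work. It's only a sketch: it assumes the server passes spark.* properties straight through to the SparkSession, and the values below are illustrative.

```yaml
# Rough sketch of the intended configuration (assumes spark.* properties are
# passed through to the SparkSession; values are illustrative).
spark:
  master: "k8s://kubernetes:443"             # the in-cluster Kubernetes API
  kubernetes:
    namespace: "pathling"                    # namespace to launch executor pods in
    container:
      image: "docker.io/apache/spark:3.4.1"  # image used for the executor pods
# None of this can work until org.apache.spark:spark-kubernetes (matching the
# server's Spark and Scala versions) is actually on the classpath.
```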

I'd be happy to create a PR!

johngrimes commented 1 year ago

@chgl This is something that we have been wanting to add for a while, and we would be happy to collaborate on a PR!

I assume that you are looking to add clustering capability to Pathling Server?

We have done a bit of exploration of this previously - the main challenge lies in modifying the Docker image so that it can also serve as the image used to create the worker pods.

johngrimes commented 1 year ago

There is an existing issue: #369.

I've now closed it as a duplicate of this.

chgl commented 1 year ago

Ah, I didn't see the other issue - sorry about the duplicate! But yes, I think there are two parts to clustering: one is using Kubernetes as an orchestrator for running the Spark jobs, and the other is running multiple instances of the Pathling server itself. The latter might already work, although ingesting/importing FHIR resources from multiple instances into the same shared warehouse could be an issue.

Regarding the former: I've just pushed an early draft of a Helm chart (based on https://github.com/aehrc/pathling/pull/1293, with MinIO as a storage backend plus a few additional practices such as more restrictive security contexts, an auto-generated README, the ability to configure RBAC for the service account, etc.): https://github.com/chgl/charts/tree/master/charts/pathling-server.
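On the RBAC side, what Spark's Kubernetes backend needs essentially boils down to letting the server's service account manage executor pods. A minimal sketch (names are placeholders, and the rules may need tightening or widening depending on what the chart actually grants):

```yaml
# Sketch of the RBAC the server's service account would need so that the Spark
# driver can manage executor pods. Names are placeholders; the resource list
# follows the Spark-on-Kubernetes documentation.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pathling-spark-driver
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps", "persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "create", "delete", "deletecollection"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pathling-spark-driver
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pathling-spark-driver
subjects:
  - kind: ServiceAccount
    name: pathling-server        # placeholder: the server's service account
```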

I've gotten as far as the server launching Spark executor pods using the default Spark image docker.io/apache/spark:v3.3.2 for now (https://github.com/chgl/charts/blob/master/charts/pathling-server/values-test.yaml#L38), although that obviously doesn't fully work yet. Even before hitting any missing-dependency issues, I ran into an "UnknownHostException" from the executor pods trying to connect to the hostname of the Pathling server pod.
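My guess is that this is the usual client-mode issue of executors not being able to resolve the driver's address, which would need something along these lines (all names and ports here are placeholders, not anything the chart currently creates):

```yaml
# Hypothetical sketch: a headless Service that resolves to the Pathling server
# pod, so executors can connect back to the Spark driver running inside it.
apiVersion: v1
kind: Service
metadata:
  name: pathling-server-driver                 # placeholder name
spec:
  clusterIP: None                              # headless: DNS resolves to the pod IP
  selector:
    app.kubernetes.io/name: pathling-server    # must match the server pod's labels
  ports:
    - name: driver
      port: 7077
    - name: block-manager
      port: 7078
# The driver side would then need matching Spark properties, e.g.:
#   spark.driver.host: pathling-server-driver.<namespace>.svc
#   spark.driver.port: "7077"
#   spark.blockManager.port: "7078"
```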

> We have done a bit of exploration of this previously - the main challenge lies in modifying the Docker image so that it can also serve as the image used to create the worker pods.

To me, Spark involves quite a bit of trial and error, especially when it comes to dependencies. What could work while sticking with Jib is building a new image that uses one of the official Spark images as a base, setting the containerizingMode to packaged so that we get the plain jars that should work with Spark (https://github.com/GoogleContainerTools/jib/tree/master/jib-maven-plugin#extended-usage), and perhaps setting some environment variables so that Spark is aware of the path. I'll have to do a bit of exploring as well, but adding the spark-kubernetes dependency is at least a good first step.
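To make that a bit more concrete, the kind of glue I have in mind would look roughly like this; the image name is hypothetical and the /app paths are guesses at Jib's packaged layout, so all of it needs verifying:

```yaml
# Very rough sketch, assuming a Pathling image rebuilt on top of an official
# Spark base image with Jib's containerizingMode=packaged. The image name is
# hypothetical and the /app/... paths are assumptions about Jib's layout.
spark:
  kubernetes:
    container:
      image: "ghcr.io/example/pathling-spark:latest"   # hypothetical combined image
  driver:
    extraClassPath: "/app/classpath/*:/app/libs/*"     # make the packaged jars visible to Spark
  executor:
    extraClassPath: "/app/classpath/*:/app/libs/*"
```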

johngrimes commented 1 year ago

We have a Helm chart that we have been meaning to open source, but haven't quite got around to it. Perhaps you could open a pull request containing just the Helm chart - this could be a valuable addition in its own right, regardless of K8S clustering.

The reason that we don't use the apache/spark Docker image as the base image is that it is based on Debian and is quite out of date and slow to respond to vulnerabilities. For example, I just scanned it for critical vulnerabilities and got this:

apache/spark:v3.3.2 (debian 11.4)

Total: 6 (CRITICAL: 6)

┌──────────────┬────────────────┬──────────┬─────────────────────────┬─────────────────────────┬──────────────────────────────────────────────────────────┐
│   Library    │ Vulnerability  │ Severity │    Installed Version    │      Fixed Version      │                          Title                           │
├──────────────┼────────────────┼──────────┼─────────────────────────┼─────────────────────────┼──────────────────────────────────────────────────────────┤
│ libdb5.3     │ CVE-2019-8457  │ CRITICAL │ 5.3.28+dfsg1-0.8        │                         │ sqlite: heap out-of-bound read in function rtreenode()   │
│              │                │          │                         │                         │ https://avd.aquasec.com/nvd/cve-2019-8457                │
├──────────────┼────────────────┤          ├─────────────────────────┼─────────────────────────┼──────────────────────────────────────────────────────────┤
│ libjson-c5   │ CVE-2021-32292 │          │ 0.15-2                  │ 0.15-2+deb11u1          │ stack-buffer-overflow in parseit() in json_parse.c       │
│              │                │          │                         │                         │ https://avd.aquasec.com/nvd/cve-2021-32292               │
├──────────────┼────────────────┤          ├─────────────────────────┼─────────────────────────┼──────────────────────────────────────────────────────────┤
│ libpcre2-8-0 │ CVE-2022-1586  │          │ 10.36-2                 │ 10.36-2+deb11u1         │ Out-of-bounds read in compile_xclass_matchingpath in     │
│              │                │          │                         │                         │ pcre2_jit_compile.c                                      │
│              │                │          │                         │                         │ https://avd.aquasec.com/nvd/cve-2022-1586                │
│              ├────────────────┤          │                         │                         ├──────────────────────────────────────────────────────────┤
│              │ CVE-2022-1587  │          │                         │                         │ Out-of-bounds read in get_recurse_data_length in         │
│              │                │          │                         │                         │ pcre2_jit_compile.c                                      │
│              │                │          │                         │                         │ https://avd.aquasec.com/nvd/cve-2022-1587                │
├──────────────┼────────────────┤          ├─────────────────────────┼─────────────────────────┼──────────────────────────────────────────────────────────┤
│ libtasn1-6   │ CVE-2021-46848 │          │ 4.16.0-2                │ 4.16.0-2+deb11u1        │ libtasn1: Out-of-bound access in ETYPE_OK                │
│              │                │          │                         │                         │ https://avd.aquasec.com/nvd/cve-2021-46848               │
├──────────────┼────────────────┤          ├─────────────────────────┼─────────────────────────┼──────────────────────────────────────────────────────────┤
│ zlib1g       │ CVE-2022-37434 │          │ 1:1.2.11.dfsg-2+deb11u1 │ 1:1.2.11.dfsg-2+deb11u2 │ heap-based buffer over-read and overflow in inflate() in │
│              │                │          │                         │                         │ inflate.c via a large...                                 │
│              │                │          │                         │                         │ https://avd.aquasec.com/nvd/cve-2022-37434               │
└──────────────┴────────────────┴──────────┴─────────────────────────┴─────────────────────────┴──────────────────────────────────────────────────────────┘
2023-09-04T10:03:49.137+1000    INFO    Table result includes only package filenames. Use '--format json' option to get the full path to the package file.

Java (jar)

Total: 5 (CRITICAL: 5)

┌──────────────────────────────────────────────────────────────┬────────────────┬──────────┬───────────────────┬──────────────────────┬───────────────────────────────────────────────────────────┐
│                           Library                            │ Vulnerability  │ Severity │ Installed Version │    Fixed Version     │                           Title                           │
├──────────────────────────────────────────────────────────────┼────────────────┼──────────┼───────────────────┼──────────────────────┼───────────────────────────────────────────────────────────┤
│ org.apache.hadoop:hadoop-common                              │ CVE-2022-25168 │ CRITICAL │ 3.3.2             │ 2.10.2, 3.2.4, 3.3.3 │ Command injection in                                      │
│ (hadoop-client-api-3.3.2.jar)                                │                │          │                   │                      │ org.apache.hadoop.fs.FileUtil.unTarUsingTar               │
│                                                              │                │          │                   │                      │ https://avd.aquasec.com/nvd/cve-2022-25168                │
│                                                              ├────────────────┤          │                   ├──────────────────────┼───────────────────────────────────────────────────────────┤
│                                                              │ CVE-2022-26612 │          │                   │ 3.2.3, 3.3.3         │ Arbitrary file write in FileUtil#unpackEntries on Windows │
│                                                              │                │          │                   │                      │ https://avd.aquasec.com/nvd/cve-2022-26612                │
├──────────────────────────────────────────────────────────────┼────────────────┤          │                   ├──────────────────────┼───────────────────────────────────────────────────────────┤
│ org.apache.spark:spark-core_2.12 (spark-core_2.12-3.3.2.jar) │ CVE-2023-22946 │          │                   │ 3.4.0                │ Apache Spark vulnerable to Improper Privilege Management  │
│                                                              │                │          │                   │                      │ https://avd.aquasec.com/nvd/cve-2023-22946                │
├──────────────────────────────────────────────────────────────┼────────────────┤          ├───────────────────┼──────────────────────┼───────────────────────────────────────────────────────────┤
│ org.codehaus.jackson:jackson-mapper-asl                      │ CVE-2019-10202 │          │ 1.9.13            │                      │ codehaus: incomplete fix for unsafe deserialization in    │
│ (jackson-mapper-asl-1.9.13.jar)                              │                │          │                   │                      │ jackson-databind vulnerabilities                          │
│                                                              │                │          │                   │                      │ https://avd.aquasec.com/nvd/cve-2019-10202                │
├──────────────────────────────────────────────────────────────┼────────────────┤          ├───────────────────┼──────────────────────┼───────────────────────────────────────────────────────────┤
│ org.yaml:snakeyaml (snakeyaml-1.31.jar)                      │ CVE-2022-1471  │          │ 1.31              │ 2.0                  │ Constructor Deserialization Remote Code Execution         │
│                                                              │                │          │                   │                      │ https://avd.aquasec.com/nvd/cve-2022-1471                 │
└──────────────────────────────────────────────────────────────┴────────────────┴──────────┴───────────────────┴──────────────────────┴───────────────────────────────────────────────────────────┘

So one approach is to switch over to apache/spark as the base image, but another might be to change our current image to meet the new requirements of being a K8S worker. I'd like to at least investigate the latter.

chgl commented 1 year ago

> Perhaps you could open a pull request containing just the Helm chart - this could be a valuable addition in its own right, regardless of K8S clustering.

Sure! There might be a bit to discuss around where to host the chart, versioning, releasing, linting, etc., but the PR is probably a better place for that.

> The reason that we don't use the apache/spark Docker image as the base image is that it is based on Debian and is quite out of date and slow to respond to vulnerabilities. For example, I just scanned it for critical vulnerabilities and got this:

That's fair. The most recent apache/spark:3.4.1 at least "only" has 2 critical CVEs. The bitnami/spark:3.4.1 images aren't bad either, but given the size of both, I'm wondering whether it makes more sense to keep the server image as-is (apart from adding the Spark Kubernetes dependencies) and create an additional image to be used as the Spark executor (and possibly driver) image.

Besides submitting jobs directly to the Kubernetes API, there's also https://github.com/bitnami/charts/tree/main/bitnami/spark, which is helpful because it supports convenience features such as ingress configuration for the Spark dashboard.
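As a sketch (the service and release names are illustrative and follow the bitnami chart's usual conventions, so they should be double-checked), Pathling could then point at the standalone master deployed by that chart instead of submitting to the Kubernetes API directly:

```yaml
# Illustrative only: using the bitnami/spark chart as a standalone cluster and
# pointing Pathling at its master service. The service name follows the chart's
# usual <release>-spark-master-svc pattern; verify against the actual release.
spark:
  master: "spark://my-release-spark-master-svc:7077"
# On the bitnami chart side, something like this would expose the Spark UI:
#   ingress:
#     enabled: true
#     hostname: spark.example.com
```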

johngrimes commented 11 months ago

Resolved in 6.4.0.

johngrimes commented 11 months ago

We've also added a Helm chart - @chgl, check it out and let us know what you think.

johngrimes commented 11 months ago

I've added a page to the documentation with an example of how to use the Kubernetes cluster manager with Pathling server.

chgl commented 11 months ago

@johngrimes Again, awesome, and it works great! I've integrated everything into my chart variant as well: https://github.com/chgl/charts/pull/425. It adds some extra features such as ingress support, a restricted security context, and ServiceMonitor integration. We could attempt to merge the two somehow, if that would make sense on your end.