eosantigen opened this issue 5 years ago
thanks for opening this @eosantigen
i can see that the link is good, are you able to grab the logs for the init container? i'm curious if there is anything suspicious in there. downloadData and mavenDependencies shouldn't interfere with each other.
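for reference, something like this should pull those logs out (the pod name is a placeholder, and i'm assuming the container is called downloader and everything lives in a sparkop namespace, so adjust to whatever your cluster shows):

```sh
# list the pods to spot the one that is crash looping (namespace is an assumption)
kubectl get pods -n sparkop

# dump the logs of the downloader init container; <worker-pod-name> is a placeholder
kubectl logs <worker-pod-name> -c downloader -n sparkop

# if the container has already restarted, --previous shows the last failed attempt
kubectl logs <worker-pod-name> -c downloader -n sparkop --previous
```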
one more thing, would it be possible to share the manifest you used to spawn the spark cluster? i could try to repeat your process to see if i can also get this bug.
Sure, here is the manifest.
apiVersion: radanalytics.io/v1
kind: SparkCluster
metadata:
  name: spark-cluster
  namespace: sparkop
spec:
  worker:
    instances: '2'
    resources:
      limits:
        memory: 4Gi
      requests:
        memory: 400Mi
  master:
    instances: '1'
    resources:
      limits:
        memory: 4Gi
      requests:
        memory: 400Mi
  sparkWebUI: "true"
  mavenDependencies:
    - org.apache.hadoop:hadoop-azure:2.7.1
    - org.apache.hadoop:hadoop-common:2.7.1
    - org.apache.hadoop:hadoop-client:2.7.1
    - org.apache.hadoop:hadoop-auth:2.7.1
    - org.apache.hadoop:hadoop-hdfs:2.7.1
  env:
    - name: HADOOP_CLASSPATH
      value: /tmp/jars/*
    - name: HADOOP_OPTIONAL_TOOLS
      value: hadoop-azure
    - name: HADOOP_CONF_DIR
      value: /tmp/
  downloadData:
    - url: https://bitbucket.org/metiscybertech/configuration-templates/raw/dac5298e5a83cfaa9ac07dc50f56cd255130faa5/core-site.xml
      to: /tmp/core-site.xml
The logs from the init container named "downloader" are:
wget: note: TLS certificate validation not implemented
wget: TLS error from peer (alert code 40): handshake failure
wget: error getting response: Connection reset by peer
For now, I have found a workaround: passing the directives included in core-site.xml another way.
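(For anyone hitting the same problem: I won't claim this is exactly what I ran, but one common way to pass core-site.xml style directives without shipping the file is to hand them to Spark as spark.hadoop.* properties, which Spark copies into the Hadoop Configuration it builds. A rough sketch with placeholder values, assuming the usual hadoop-azure account-key setting; the actual directives depend on what your core-site.xml contains.)

```sh
# sketch only: the property below is the typical hadoop-azure credential key;
# <storage-account> and <access-key> are placeholders, not real values.
spark-shell \
  --packages org.apache.hadoop:hadoop-azure:2.7.1 \
  --conf spark.hadoop.fs.azure.account.key.<storage-account>.blob.core.windows.net=<access-key>
```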
thanks for sharing this, i'll see if i can replicate the issue.
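a quick way to check whether it's the busybox wget itself, outside the operator (the busybox image here is an assumption based on the wget output you pasted, not necessarily the image our init container uses):

```sh
# fetch the same url from a throwaway busybox pod; if this also dies with a
# tls handshake error, the problem is the wget/tls support in the image
# rather than anything the operator does with downloadData.
kubectl run wget-test -n sparkop --rm -it --restart=Never --image=busybox -- \
  wget -O /dev/null https://bitbucket.org/metiscybertech/configuration-templates/raw/dac5298e5a83cfaa9ac07dc50f56cd255130faa5/core-site.xml
```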
glad to hear you have a workaround =)
we should be adding --no-check-certificate for the wget, and/or not fail the whole cluster deployment if the wget fails (or at least make it configurable)
@jkremser that's what i was thinking after seeing the tls error, just hadn't confirmed yet.
i think adding a flag for insecure download is probably the best solution; there is something similar in the s2i tooling and i like the idea of making it explicit.
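roughly what i have in mind for the downloader step, just as a sketch of the proposed behaviour, nothing here exists in the operator today and the INSECURE_DOWNLOAD / DOWNLOAD_URL names are made up for illustration (note that some busybox wget builds don't support --no-check-certificate at all):

```sh
# hypothetical downloader logic: only skip certificate checks when the user
# explicitly opted in, and never let a failed download take down the cluster.
EXTRA_ARGS=""
if [ "${INSECURE_DOWNLOAD}" = "true" ]; then
  EXTRA_ARGS="--no-check-certificate"
fi

# log a warning and continue instead of failing the whole deployment
wget ${EXTRA_ARGS} -O /tmp/core-site.xml "${DOWNLOAD_URL}" \
  || echo "warning: download of ${DOWNLOAD_URL} failed, continuing without it"
```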
Description:
I went on to use the downloadData feature in my SparkCluster template. A Spark cluster deploys perfectly without it; however, when I use the downloadData block shown in the manifest earlier in this thread, the init containers all fail into a CrashLoopBackOff.
I followed the examples in the repository, and the block is added at the final lines, starting in the same column as worker, indented two spaces from spec:.
It may be worth mentioning that I also have mavenDependencies in the template.
Thanks.