Hydrospheredata / mist

Serverless proxy for Spark cluster
http://hydrosphere.io/mist/
Apache License 2.0

Docs - Multiple Classes in Jar, Custom Encoder, Package Class, Resubmit Conf, Debug and Absolute name of artifact & function. #468

Open gowravshekar opened 6 years ago

gowravshekar commented 6 years ago
dos65 commented 6 years ago

Thanks for the questions; they will help us improve our documentation. For a start, I'll try to answer here.

dos65 commented 6 years ago

Also, we have a Gitter room for questions.

gowravshekar commented 6 years ago

@dos65, Thank you for the explanation. Really appreciate your time and consideration.

Package class works. The artifact wasn't refreshed when I added package to class. On restarting mist-master it worked.

I was not able to get artifact updating to work. I'm using mist-1.0.0-RC13.

If I run mist-cli apply -f conf --validate true -u '', I get the error: Artifact key xxx.jar has to be unique.

If I run mist-cli apply -f conf/correlation-matrix.conf --validate true -u '', I get the error: Error: 400 Client Error: Bad Request for url: http://localhost:2004/v2/api/functions?force=False: class java.lang.IllegalStateException: Endpoint correlation-matrix already exists

With respect to debugging, I'm looking for a way to put a breakpoint in the code and debug. Similar to this.

blvp commented 6 years ago

The last error, with the function update, was fixed in a new version of mist-cli. Try updating it with the following command: pip install mist-cli --upgrade

gowravshekar commented 6 years ago

After upgrading,

mist-cli apply -f conf/correlation-matrix.conf --validate true -u '' - Works.

mist-cli apply -f conf --validate true -u '' - Getting same error message. Artifact key xxx.jar has to be unique

dos65 commented 6 years ago

@gowravshekar About debugging: unfortunately, there is a bug in how the spark-submit command is constructed (#472), so it is currently impossible to pass driver-java-options correctly. If you really need it, you can implement a manual runner and add the following argument to spark-submit: --driver-java-options '-Xdebug -Xrunjdwp:transport=dt_socket,address=15000,server=y,suspend=y'
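
As a rough illustration of the workaround above, the option string could be built by a small shell helper inside the manual runner script. This is only a sketch: jdwp_driver_opts is a hypothetical name, not something mist provides, and the rest of the spark-submit command line is left out because it depends on your runner.

```shell
# Sketch only: jdwp_driver_opts is a hypothetical helper, not part of mist.
# It builds the value for --driver-java-options from the comment above.
# suspend=y makes the driver JVM wait until a debugger attaches.
jdwp_driver_opts() {
  # $1: debug port, defaulting to the 15000 used above
  printf '%s' "-Xdebug -Xrunjdwp:transport=dt_socket,address=${1:-15000},server=y,suspend=y"
}

# Intended use inside a manual runner script (remaining arguments elided):
#   spark-submit --driver-java-options "$(jdwp_driver_opts 15000)" ...
jdwp_driver_opts
echo
```

With these options the driver listens on the chosen port, so the IDE should be configured as a remote JVM debugger attaching to that port on the host where the driver runs.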

blvp commented 6 years ago

mist-cli apply -f conf --validate true -u '' - Getting same error message. Artifact key xxx.jar has to be unique

This is expected behavior, because replacing a jar in place can break every function that uses it. If you want to update a jar with validation enabled, change the version value in the artifact config and then update the reference to it in the function config. The reasoning is that the apply method is used for both development and release, and this limitation reflects our vision of the release process.

Some additional notes.

You can use environment variables to manage the artifact version. For example:

artifact.conf

model = Artifact
name = test-artifact
version = ${ARTIFACT_VERSION}
data.file-path = "./path/to/artifact.jar"

function.conf

model = 
data {
    ...
    path = test-artifact_${ARTIFACT_VERSION}.jar
    ...
}

and then ARTIFACT_VERSION=0.0.1 mist-cli apply -f conf/
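
To see why bumping the variable avoids the uniqueness error, here is a minimal shell sketch of how the key changes from one version to the next. It assumes the key is formed as name_version.jar, as the path in function.conf above suggests; artifact_key is a hypothetical helper used only for illustration.

```shell
# Sketch only: artifact_key is a hypothetical helper; the artifact name
# is taken from the artifact.conf example above.
ARTIFACT_NAME="test-artifact"

# Mirror the path used in function.conf: <name>_<version>.jar
artifact_key() {
  printf '%s_%s.jar\n' "$ARTIFACT_NAME" "$1"
}

artifact_key 0.0.1
artifact_key 0.0.2

# Releasing a new jar then means changing the variable for one invocation:
#   ARTIFACT_VERSION=0.0.2 mist-cli apply -f conf/
```

Each version produces a distinct key, so a new upload does not collide with a jar that is already stored.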

dos65 commented 6 years ago

Oh, my mistake: use --validate false instead of --validate true for an unsafe update.

apoorv22 commented 5 years ago

@gowravshekar About debugging: unfortunately, there is a bug in how the spark-submit command is constructed (#472), so it is currently impossible to pass driver-java-options correctly. If you really need it, you can implement a manual runner and add the following argument to spark-submit: --driver-java-options '-Xdebug -Xrunjdwp:transport=dt_socket,address=15000,server=y,suspend=y'

Does this bug still exist? Is there a way now to debug the spark job?

dos65 commented 5 years ago

@apoorv22 This one is fixed; you can use these options to debug a Spark job. Also, you need to be aware of the following things:

gowravshekar commented 5 years ago

@blvp, is there a way to use an environment variable or config value in data.file-path in artifact.conf?

Something similar to the following: data.file-path = "./path/to/artifact_${ARTIFACT_VERSION}.jar"

blvp commented 5 years ago

Yes, you can use an environment variable here in a similar manner: data.file-path="simple-name"${VERSION}".jar"