Booz Allen's lean manufacturing approach for holistically designing, developing and fielding AI solutions across the engineering lifecycle from data processing to model building, tuning, and training to secure operational deployment
BUG: The spark operator doesn't have write access for certain API groups #273
The sparkoperator rbac.yaml chart was updated for downstream projects. For the affected API group, the rule only grants `get` on resources. The pipeline-invocation-service, which uses the sparkoperator, requires `create` access to create a pipeline. As a result, attempting to create a pipeline via the pipeline-invocation-service fails with the following error: `is forbidden: User "system:serviceaccount:default:sparkoperator" cannot create resource "customresourcedefinitions" in API group "" at the cluster scope`
Definition of Done
Update the rbac.yaml chart to give the service account, spark-operator, the rules get, create, update, and delete for the api group
The spark-operator is dependent on aissemble-spark-operator-chart --> This is the rbac.yaml that needs to be updated
Use the previous custom chart that was being used --> Currently we are using the community's spark operator chart, which doesn't have the correct access
Successfully build aissemble
Run integration test
Run archetype test
Create a new downstream project
Build the downstream project and complete the manual actions
Look inside the downstream project's /src/main/resources/apps/spark-operator/charts and check that the rbac.yaml file was updated inside the chart archive
Test the pipeline invocation service
Testing Steps
Clone aissemble or navigate to your aissemble directory
git clone
Build aissemble
mvn clean install
Create a downstream project using the `mvn archetype:generate` command shown further below.
- Run `mvn clean install` and resolve all manual steps
## Verify the rbac.yaml contents in the zip file
- Check the **rbac.yaml** file under <project-name>**/src/main/resources/apps/spark-operator/charts**
- **Note**: This is a .tgz file, and some IDEs can open it directly. If yours doesn't, use the following command to extract it:
- `tar -xvzf aissemble-spark-operator-chart-1.9.0-SNAPSHOT.tgz`
- In the extracted directory, **aissemble-spark-operator-chart/charts/spark-operator/templates**, look for the **rbac.yaml** file. Ensure that you see **get**, **create**, **delete**, and **update** listed under the affected API group
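The manual check above can also be scripted. The following is a minimal sketch, not part of aiSSEMBLE: the function names (`read_rbac_templates`, `verb_lists`, `missing_verbs`) are hypothetical, and it assumes the chart archive contains a `templates/rbac.yaml` whose rules list their verbs either inline or as one item per line. It scans the text line by line rather than parsing YAML, because Helm templates usually contain `{{ ... }}` directives that a YAML parser would reject. It only reports the verbs of each rule, so you still need to confirm by eye that the complete rule belongs to the right API group.

```python
import re
import sys
import tarfile

REQUIRED_VERBS = ("get", "create", "update", "delete")


def read_rbac_templates(chart_tgz_path):
    """Return {archive member name: text} for each templates/rbac.yaml in the chart."""
    templates = {}
    with tarfile.open(chart_tgz_path, mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.endswith("templates/rbac.yaml"):
                raw = archive.extractfile(member).read()
                templates[member.name] = raw.decode("utf-8")
    return templates


def verb_lists(rbac_text):
    """Collect the verbs of every `verbs:` entry, in order of appearance."""
    lines = rbac_text.splitlines()
    rules = []
    for index, line in enumerate(lines):
        key = line.strip()
        if key.startswith("- "):  # `verbs:` may be the first key of a rule
            key = key[2:].strip()
        if not key.startswith("verbs:"):
            continue
        inline = key[len("verbs:"):].strip()
        if inline:
            # Flow style, e.g. verbs: ["get", "create"]
            verbs = re.findall(r"[A-Za-z*]+", inline)
        else:
            # Block style: one "- verb" item per following line
            verbs = []
            for item in lines[index + 1:]:
                item = item.strip()
                if not item.startswith("- ") or ":" in item:
                    break
                verbs.append(item[2:].strip().strip("\"'"))
        rules.append(verbs)
    return rules


def missing_verbs(verbs, required=REQUIRED_VERBS):
    """Return the required verbs a rule lacks (a "*" wildcard grants all of them)."""
    if "*" in verbs:
        return []
    return [verb for verb in required if verb not in verbs]


# Usage: python check_rbac.py aissemble-spark-operator-chart-1.9.0-SNAPSHOT.tgz
if __name__ == "__main__" and len(sys.argv) > 1:
    for name, text in read_rbac_templates(sys.argv[1]).items():
        for position, verbs in enumerate(verb_lists(text), start=1):
            print(name, f"rule {position}:", verbs, "missing:", missing_verbs(verbs))
```

A rule that prints an empty `missing` list carries all four verbs from the Definition of Done.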
## Test the pipeline-invocation service
- Run the tilt command and wait for the resources to be set up: `tilt up`
- Once the resources are complete, run a health check and trigger the pipeline-invocation-service:
- **Health Check**
- The service can be accessed through an HTTP GET request
- Using Postman:
- Set the URL to http://localhost:8085/invoke-pipeline/healthcheck
- Set the Action to **GET**
- You should see: **Service Available** in the bottom section of Postman
- **Trigger the pipeline-invocation-service**
- The service can be accessed through an HTTP POST request
- Using Postman:
- Set the URL to http://localhost:8085/invoke-pipeline/start-spark-operator-job
- Set the Action to **POST**
- Set the body type to JSON
- Set the body content to: `{"applicationName": "simple-data-delivery-example", "profile": "dev"}`
- To confirm that the `simple-data-delivery-example` was installed successfully, you should see the following log when you look at the **pipeline-invocation-service** pod logs:
NAME: simple-data-delivery-example
NAMESPACE: default
STATUS: deployed
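If Postman is not available, the two requests above can be sent from a short script. This is a minimal sketch using only the Python standard library. The URLs and the JSON body are taken from the steps above. The helper names (`healthcheck_request`, `start_job_request`, `send`, `parse_status_block`) are hypothetical, and the exact response format of the service is an assumption, so the script simply prints whatever comes back. `parse_status_block` is a small convenience for checking pod log text that you have copied out yourself.

```python
import json
import sys
import urllib.request

BASE_URL = "http://localhost:8085/invoke-pipeline"


def healthcheck_request(base_url=BASE_URL):
    """Build the GET request for the health check endpoint."""
    return urllib.request.Request(f"{base_url}/healthcheck", method="GET")


def start_job_request(application_name, profile, base_url=BASE_URL):
    """Build the POST request that triggers a spark operator job."""
    body = json.dumps({"applicationName": application_name, "profile": profile})
    return urllib.request.Request(
        f"{base_url}/start-spark-operator-job",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a request and return (HTTP status, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


def parse_status_block(log_text):
    """Turn upper-case 'KEY: value' log lines (NAME, NAMESPACE, STATUS) into a dict."""
    fields = {}
    for line in log_text.splitlines():
        key, separator, value = line.partition(":")
        if separator and key.strip().isupper():
            fields[key.strip()] = value.strip()
    return fields


# Usage (with `tilt up` running): python invoke_pipeline.py --send
if __name__ == "__main__" and "--send" in sys.argv:
    print(send(healthcheck_request()))  # the issue expects "Service Available"
    print(send(start_job_request("simple-data-delivery-example", "dev")))
```

To check the result, pass the relevant pod log text to `parse_status_block` and confirm that the `STATUS` field is `deployed`.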
mvn archetype:generate \
  -DarchetypeGroupId=com.boozallen.aissemble \
  -DarchetypeArtifactId=foundation-archetype \
  -DarchetypeVersion=1.9.0-SNAPSHOT \
  -DgroupId=com.issue273 \
  -DartifactId=issue273 \
  -DprojectGitUrl=url \
  -DprojectName=issue273
{ "name":"SimpleDataDeliveryExample", "package":"com.boozallen.aissemble.documentation", "type":{ "name":"data-flow", "implementation":"data-delivery-spark" }, "steps":[ { "name":"IngestData", "type":"synchronous", "dataProfiling":{ "enabled":false } } ] }