boozallen / aissemble

Booz Allen's lean manufacturing approach for holistically designing, developing and fielding AI solutions across the engineering lifecycle from data processing to model building, tuning, and training to secure operational deployment

TASK: Cleanup the legacy namespace for lineage event #449

Closed: csun-cpointe closed this issue 10 hours ago

csun-cpointe commented 1 day ago

Description

In v1.7.0 we released the OpenLineage Namespace Conventions to better follow OpenLineage's guidelines. Moving forward, namespaces should be defined in the data-lineage.properties file. We are cleaning up the legacy data.lineage.namespace property in a project's data-lineage.properties file, which was supported as a fallback but will no longer be supported in release 1.10.
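
For illustration, the shape of the change in data-lineage.properties is sketched below. The legacy fallback key comes from the description above; the replacement entry shown is only a hypothetical example of a more specific namespace property, not the exact key name defined by the convention:

    # Legacy fallback property; supported until 1.10 and being removed by this task:
    # data.lineage.namespace=my-project

    # Define namespaces explicitly per the OpenLineage Namespace Conventions instead.
    # The key below is a hypothetical illustration of a pipeline-specific entry:
    data.lineage.PythonPipeline.namespace=my-project-namespace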

DOD

Acceptance criteria required to complete the work

Test Strategy/Script

How will this item be verified?

  1. Create a new aissemble-based project using the latest archetype snapshot:

    mvn archetype:generate '-DarchetypeGroupId=com.boozallen.aissemble' \
                           '-DarchetypeArtifactId=foundation-archetype' \
                           '-DarchetypeVersion=1.10.0-SNAPSHOT' \
                           '-DgroupId=org.test' \
                           '-Dpackage=org.test' \
                           '-DprojectGitUrl=test.org/test.git' \
                           '-DprojectName=Test pyspark lineage' \
                           '-DartifactId=test-449' \
    && cd test-449
  2. Set your Java version to 17 if it is not already (a quick way to check and switch is sketched after this list)

  3. Under -model/src/main/resources/pipelines, add the pipeline models below: SparkPipeline.json, PythonPipeline.json, and ClassificationTraining.json

  4. Fully generate the project by running mvn clean install and following the manual actions

  5. Build the project without the cache and follow the last manual action.

    mvn clean install -Dmaven.build.cache.skipCache
  6. Deploy the project and wait for all services to be ready

    tilt up; tilt down
  7. Manually trigger the python-pipeline pod and verify there are no errors in the log (see the kubectl log-check sketch after this list)

  8. Manually trigger the spark-pipeline pod and verify there are no errors in the log

  9. Use Postman or any REST client to trigger the training pipeline and verify that a successful training pipeline id is returned (an example curl request is sketched after this list)
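
For step 2, one quick way to confirm and switch the JDK (the install path is illustrative and will differ per machine):

    # Confirm the active JDK; it should report version 17
    java -version

    # If not, point JAVA_HOME at a local JDK 17 install (path is illustrative)
    export JAVA_HOME=/path/to/jdk-17
    export PATH="$JAVA_HOME/bin:$PATH"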
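
For steps 7 and 8, the pipeline logs can be inspected with kubectl once the pods have been triggered; the pod names below are placeholders to be replaced with the names Tilt actually deploys:

    # List the running pods and note the pipeline pod names
    kubectl get pods

    # Inspect each pipeline's log and confirm it is error free
    kubectl logs <python-pipeline-pod-name>
    kubectl logs <spark-pipeline-pod-name>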
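
For step 9, any REST client works; an example curl request is sketched below. The host, port, and path are placeholders rather than the actual training endpoint, so substitute the values exposed by the generated training service:

    # Placeholder endpoint; substitute the training service's real host, port, and path
    curl -X POST http://localhost:5001/training-jobs

    # A successful response should include the id of the launched training pipeline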

References/Additional Context

As needed

carter-cundiff commented 10 hours ago

Testing passed (verification screenshots attached)