data-dot-all / dataall

A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
https://data-dot-all.github.io/dataall/
Apache License 2.0

Deprecation of CodeCommit prevents creation of CDK pipelines #1448

Open jpdev42 opened 3 months ago

jpdev42 commented 3 months ago

Describe the bug

Since CodeCommit was deprecated (see here), accounts/organizations that do not have existing CodeCommit repositories can't create new ones. However, data.all expects a CodeCommit repository to be created for CDK pipelines. As a result, the Source stage of a newly created pipeline fails with a message like this:

The action failed because no AWS CodeCommit repository named <pipeline_id> was found. Make sure you are using the correct repository name, and then try again. Error: <pipeline_id> for <account_id>

How to Reproduce

Create a new pipeline in an account or organization that has no existing CodeCommit repositories.

Expected behavior

There should be an option to use another repository provider, e.g. GitHub or GitLab.
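To illustrate what such an option might look like, here is a minimal sketch, not data.all's actual implementation. The `PipelineSource` class, its `provider` values and the `source_action` helper are hypothetical names invented for this example. The action fields themselves (`CodeCommit` with `RepositoryName`/`BranchName`, and `CodeStarSourceConnection` with `ConnectionArn`/`FullRepositoryId`/`BranchName`) follow the CodePipeline action structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PipelineSource:
    """Hypothetical description of where a pipeline's code lives."""
    provider: str                      # "codecommit" or "connection" (values invented for this sketch)
    repository: str                    # repository name, or "owner/repo" for a connection
    branch: str = "main"
    connection_arn: Optional[str] = None


def source_action(src: PipelineSource) -> dict:
    """Build the Source action of a CodePipeline stage as a plain dict."""
    if src.provider == "codecommit":
        provider = "CodeCommit"
        config = {
            "RepositoryName": src.repository,
            "BranchName": src.branch,
        }
    elif src.provider == "connection":
        # GitHub, GitLab, Bitbucket etc. are reached through a connection ARN.
        if not src.connection_arn:
            raise ValueError("connection_arn is required for a connection source")
        provider = "CodeStarSourceConnection"
        config = {
            "ConnectionArn": src.connection_arn,
            "FullRepositoryId": src.repository,
            "BranchName": src.branch,
        }
    else:
        raise ValueError(f"unsupported provider: {src.provider!r}")

    return {
        "Name": "Source",
        "ActionTypeId": {
            "Category": "Source",
            "Owner": "AWS",
            "Provider": provider,
            "Version": "1",
        },
        "Configuration": config,
        "OutputArtifacts": [{"Name": "SourceOutput"}],
    }
```

With something along these lines, the "Create a pipeline" form would only need to collect a provider, a repository identifier and, for non-CodeCommit sources, a connection ARN.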

Your project

No response

Screenshots

No response

OS

Mac

Python version

3.12.2

AWS data.all version

2.6.0

Additional context

No response

mourya-33 commented 3 months ago

@jpdev42, GitHub can be linked as described in the deployment documentation, under "Using CodeStar Connection to GitHub, GitHub Enterprise, GitLab or Bitbucket:"

Reference: https://data-dot-all.github.io/dataall/deploy-aws/ https://docs.aws.amazon.com/dtconsole/latest/userguide/connections-create.html

Are you referring to pipelines other than the main deployment pipeline?

cc: @noah-paige

jpdev42 commented 3 months ago

I am referring to the Pipelines functionality that is described in section "5.4 Pipelines" of the Data.all User Guide. Creating a new pipeline fails because CodeCommit repositories can no longer be created in accounts or Organizations that don't already have CodeCommit repositories.

When you use the Data.all frontend's "Create a pipeline" form, there is no option to utilize a CodeStar Connection.

noah-paige commented 3 months ago

Thanks for raising this issue @jpdev42. You are correct: given the latest status of CodeCommit, existing data.all Pipelines should continue to operate normally, but new Pipelines created in the data.all UI will fail (due to a required dependency on creating a new CodeCommit repository).

We plan to discuss with our team how to proceed with development and support for data.all pipelines. What is your current use case for data.all pipelines? We would appreciate any feedback, especially if there are strong thoughts on the expected behavior of data.all pipelines moving forward.

jpdev42 commented 3 months ago

Hi @noah-paige,

I would like to be able to provision environment/member accounts with templated (but customizable) CDK/DDK pipeline applications which deploy ETL pipelines (e.g. S3 > EventBridge > StepFunctions/Glue Visual ETL Jobs/Glue Notebook Jobs) across SDLC tiers.
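As a rough sketch of the templating I have in mind for the S3 > EventBridge step, the snippet below generates one EventBridge event pattern per SDLC tier. The function names, tier names and bucket names are placeholders made up for this example, and it assumes EventBridge notifications are enabled on each bucket. The pattern fields (`aws.s3`, `Object Created`, prefix matching on the object key) are standard EventBridge syntax.

```python
import json
from typing import Dict


def s3_object_created_pattern(bucket: str, key_prefix: str = "") -> dict:
    """EventBridge event pattern matching new objects in one bucket."""
    detail: dict = {"bucket": {"name": [bucket]}}
    if key_prefix:
        # Only match keys under the given prefix, e.g. "raw/".
        detail["object"] = {"key": [{"prefix": key_prefix}]}
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": detail,
    }


def tier_patterns(buckets_by_tier: Dict[str, str], key_prefix: str = "raw/") -> Dict[str, str]:
    """Render the same pattern template once per tier, as JSON strings."""
    return {
        tier: json.dumps(s3_object_created_pattern(bucket, key_prefix), sort_keys=True)
        for tier, bucket in buckets_by_tier.items()
    }


if __name__ == "__main__":
    # Placeholder tier and bucket names.
    for tier, pattern in tier_patterns({"dev": "etl-dev-landing", "prod": "etl-prod-landing"}).items():
        print(tier, pattern)
```

Each rendered pattern would then be attached to a rule whose target is the Step Functions state machine or Glue job for that tier.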