aws / aws-cdk

The AWS Cloud Development Kit is a framework for defining cloud infrastructure in code
https://aws.amazon.com/cdk
Apache License 2.0

(pipelines): using the same repo more than once as a connection in a pipeline causes duplicate id error #23916

Closed andreprawira closed 11 months ago

andreprawira commented 1 year ago

Describe the bug

I'm trying to check out two different branches of the same repo, but I'm getting the "Node with duplicate id" error message. The code I'm using is under Reproduction Steps below.

Expected Behavior

I expect CDK to check out the code from the same repo twice, once per branch, and synth as usual.

Current Behavior

jsii.errors.JavaScriptError: 
  @jsii/kernel.RuntimeError: Error: Node with duplicate id: questek/icmd-platform-test
      at Kernel._ensureSync (C:\Users\andre\AppData\Local\Temp\tmpt9ddaen9\lib\program.js:8428:27)
      at Kernel.invoke (C:\Users\andre\AppData\Local\Temp\tmpt9ddaen9\lib\program.js:7840:34)
      at KernelHost.processRequest (C:\Users\andre\AppData\Local\Temp\tmpt9ddaen9\lib\program.js:11017:36)
      at KernelHost.run (C:\Users\andre\AppData\Local\Temp\tmpt9ddaen9\lib\program.js:10977:22)
      at Immediate._onImmediate (C:\Users\andre\AppData\Local\Temp\tmpt9ddaen9\lib\program.js:10978:46)
      at processImmediate (node:internal/timers:466:21)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Work\AllCloud\QuesTek\questek-saas-cdk\app.py", line 26, in <module>
    app.synth()
  File "D:\Work\AllCloud\QuesTek\questek-saas-cdk\.venv\lib\site-packages\aws_cdk\__init__.py", line 20667, in synth
    return typing.cast(_CloudAssembly_c693643e, jsii.invoke(self, "synth", [options]))
  File "D:\Work\AllCloud\QuesTek\questek-saas-cdk\.venv\lib\site-packages\jsii\_kernel\__init__.py", line 148, in wrapped
    return _recursize_dereference(kernel, fn(kernel, *args, **kwargs))
  File "D:\Work\AllCloud\QuesTek\questek-saas-cdk\.venv\lib\site-packages\jsii\_kernel\__init__.py", line 386, in invoke
    response = self.provider.invoke(
  File "D:\Work\AllCloud\QuesTek\questek-saas-cdk\.venv\lib\site-packages\jsii\_kernel\providers\process.py", line 365, in invoke
    return self._process.send(request, InvokeResponse)
  File "D:\Work\AllCloud\QuesTek\questek-saas-cdk\.venv\lib\site-packages\jsii\_kernel\providers\process.py", line 331, in send
    raise RuntimeError(resp.error) from JavaScriptError(resp.stack)
RuntimeError: Error: Node with duplicate id: my-organization/my-branch

Reproduction Steps

additional_inputs={
    "source-1": pipelines.CodePipelineSource.connection(
        repo_string="my-organization/my-repo1",
        branch=f"pipeline/{version}/frontend",
        connection_arn=infra.code_star_connection,
        code_build_clone_output=True,
        trigger_on_push=True,
    ),
    "source-2": pipelines.CodePipelineSource.connection(
        # This is supposed to be organization-name/repo-name, but CDK reads it
        # as the Node ID, so it has to differ from the one above.
        repo_string="my-organization/my-repo1",
        branch=f"pipeline/{version}/backend",
        connection_arn=infra.code_star_connection,
        code_build_clone_output=True,
        trigger_on_push=True,
    ),
    "source-3": pipelines.CodePipelineSource.connection(
        repo_string="my-organization/my-repo3",
        branch="main",
        connection_arn=infra.code_star_connection,
        code_build_clone_output=True,
        trigger_on_push=False,
    ),
},

Then run cdk deploy and you will see the error.

Possible Solution

No response

Additional Information/Context

If you change the source-2 repo name from my-repo1 to, for example, my-repo2, it works. I think there is a bug here where CDK reads that value as the Node ID instead of the repo name.

CDK CLI Version

2.51.0 (build a87259f)

Framework Version

No response

Node.js Version

16.18.0

OS

Windows

Language

Python

Language Version

No response

Other information

No response

jbcursol commented 1 year ago

I believe I'm hitting the same issue while trying to do exactly what OP is doing, except that mine occurs when running cdk synth with a .ts pipeline definition.

synth: new ShellStep('Synth', {
    input: CodePipelineSource.codeCommit(repos, 'main'),
    commands: [
        'npm ci',
        'npm run build',
        'npx cdk synth main-TestFunctions-PipelineStack'
    ],
    additionalInputs: {
        "../release": CodePipelineSource.codeCommit(repos, 'release')
    }
}),

peterwoodworth commented 1 year ago

I've looked into this a bit, and it makes sense that this is happening. Thanks for reporting this!

CodePipelineSource.connection() creates a new CodeStarConnectionSource from the data you've passed in, which then calls the superclass Step with the repo string.

https://github.com/aws/aws-cdk/blob/1d7aff583f2ef9e060204c635d3054d868084f65/packages/%40aws-cdk/pipelines/lib/codepipeline/codepipeline-source.ts#L412-L413

Step uses whatever was passed in as the identifier, so the id will be identical if the same repository is used twice, even when the branch differs.

https://github.com/aws/aws-cdk/blob/1d7aff583f2ef9e060204c635d3054d868084f65/packages/%40aws-cdk/pipelines/lib/blueprint/step.ts#L46-L48
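
The collision can be reproduced with a toy model of the construct tree. This is only a sketch, not CDK code: Scope, Step, connection_source and try_add_sources are hypothetical stand-ins that assume nothing beyond what is described above (the step id is the repo string, and a scope rejects a second child with the same id).

```python
class Scope:
    """Toy stand-in for a construct scope: child ids must be unique."""

    def __init__(self):
        self._children = {}

    def add(self, step):
        if step.id in self._children:
            raise RuntimeError(f"Node with duplicate id: {step.id}")
        self._children[step.id] = step


class Step:
    """Toy stand-in for the pipelines Step: the id is whatever is passed in."""

    def __init__(self, id):
        self.id = id


def connection_source(repo_string, branch):
    # Assumption: mirrors the behaviour described above, where only the
    # repo string reaches Step and the branch plays no part in the id.
    step = Step(repo_string)
    step.branch = branch
    return step


def try_add_sources(sources):
    """Add each source to a fresh scope; return the error message, if any."""
    scope = Scope()
    try:
        for source in sources:
            scope.add(source)
    except RuntimeError as err:
        return str(err)
    return None
```

In this model, two sources with the same repo string and different branches fail on the second add, while sources from different repos are accepted, which matches the behaviour reported in this issue.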

We might want to take the branch name into account here (might be a breaking change), or, we might want to find some way to identify this scenario and adjust the id if so.
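
To make the two options concrete, here is a rough sketch of what each id scheme could look like. The function names and the "@" separator are my own assumptions for illustration; neither exists in CDK.

```python
from collections import Counter


def id_with_branch(repo_string, branch):
    """Option 1: always fold the branch into the id.

    Assumption: "@" is an arbitrary separator chosen for this sketch.
    """
    return f"{repo_string}@{branch}"


def dedupe_ids(sources):
    """Option 2: keep the repo string as the id unless it is reused.

    `sources` is a list of (repo_string, branch) pairs. Only repos that
    appear more than once get the branch appended.
    """
    counts = Counter(repo for repo, _ in sources)
    return [
        repo if counts[repo] == 1 else id_with_branch(repo, branch)
        for repo, branch in sources
    ]
```

Option 1 changes the id of every existing source, which is why it might be a breaking change. Option 2 leaves ids untouched for repos that are used once and only adjusts the colliding ones.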

andreprawira commented 1 year ago

@peterwoodworth thanks for replying, I appreciate it. I'd suggest not using the branch name as the logical name, because I think that is what's happening here (please correct me if I'm wrong, I only started with CDK recently). Here is my suggestion:

"source-1": pipelines.CodePipelineSource.connection(self, "logical-name-for-source-1",
    repo_string="my-organization/my-repo1",
    branch=f"pipeline/{version}/frontend",
    connection_arn=infra.code_star_connection,
    code_build_clone_output=True,
    trigger_on_push=True,
),
"source-2": pipelines.CodePipelineSource.connection(self, "logical-name-for-source-2",
    repo_string="my-organization/my-repo1",
    branch=f"pipeline/{version}/backend",
    connection_arn=infra.code_star_connection,
    code_build_clone_output=True,
    trigger_on_push=True,
),

That would allow checking out the same repo with multiple branches. Also, out of curiosity, are you working for AWS?

npvisual commented 1 year ago

I believe there's an additional parameter, action_name, which has the following description:

The action name used for this source in the CodePipeline. Default: - The repository string

It should probably be used as the id when it is provided in the constructor. Unfortunately, only the repo string is passed.
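
A minimal sketch of that idea, assuming a hypothetical helper (source_step_id is not a CDK function) that picks the id handed to Step:

```python
from typing import Optional


def source_step_id(repo_string: str, action_name: Optional[str] = None) -> str:
    """Pick the step id for a source.

    Assumption: prefer the caller-supplied action name and fall back to
    the repo string, matching the documented default for action_name.
    """
    return action_name if action_name is not None else repo_string


# Two branches of the same repo, told apart by their action names.
frontend_id = source_step_id("my-organization/my-repo1", "Frontend")
backend_id = source_step_id("my-organization/my-repo1", "Backend")
default_id = source_step_id("my-organization/my-repo3")
```

With a fallback like this, existing pipelines that never set action_name would keep their current ids, and only callers who opt in would get distinct ones.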

There was a similar comment provided here : https://github.com/aws/aws-cdk/pull/24767#pullrequestreview-1361843464

This is problematic, as several folks have mentioned, because you can't create a CodePipeline source for different branches of the same repo.

clafollett commented 1 year ago

I just hit this issue after a major refactor to upgrade all our packages and finally get our deployments back up to speed. We were using GitHub Version 1 and upgraded to CodeStarConnections, and we can no longer deploy our environment pipelines, which rely on a strict branch naming convention. Any idea when a fix will be available?
