Open jacobs-xero opened 1 month ago
Thanks for the great bug report @jacobs-xero
I'll setup a kotlin VCS project and try to reproduce it. Can I just clarify a couple of things:
/webhooks/index.html?projectId=blah
or from editing the configuration on the project?Thanks in advance.
If a webhook is not defined in code (eg for subprojects, because we have no way of defining that in code), then the webhook just becomes uneditable
Can I please get a bit more info on what you see when it becomes "uneditable"? I'm hoping you might get some errors in the browser console (developer tools).
Can you also confirm my assumption here?
Hi @jacobs-xero. I have not been able to reproduce this issue. Are you able to answer the questions I had posted above?
@netwolfuk sorry for the delay in getting back. I'll find the answers for the questions above and get back to you asap. Thanks
Thanks @jacobs-xero There is a similar conversation happening on issue #248 Can you answer these questions too please?
Allow editing project settings via UI
is turned on.plugin-settings.xml
file, do your webhooks have id
values set? eg, <webhook id="id_552375210" url=...
A project has a webhook configured, and has sub-projects enabled. Yes, correct.
A sub-project has VCS kotlin enabled and inherits webhooks from the parent project. These inherited webhooks are uneditable. It's the parent project that has VCS kotlin enabled (versioned settings) and sub projects inherit the webhooks from parent project. The parent project webhooks are uneditable.
Do you have TeamCity setup with two nodes? Or just a single node? We only have a single node
Are you allowing edits in TeamCity to be committed back to VCS? eg, Allow editing project settings via UI is turned on.
Allow editing project settings via UI
is turned on.
In the plugin-settings.xml
file, do your webhooks have id
values set? eg, <webhook id="id_552375210" url=...
The webhooks do have an id
within the plugin-settings.xml
file.
Thank you for the quick feedback. I'll add that to the array of facts I have been collecting. I am about to post an update to the other ticket (#248 ) as to what I have been able to reproduce.
Thanks @netwolfuk , is the advice to use id
in the webhook definition? I believe our webhooks contain this already, does it matter if it follows the id_552375210
convention or would something like you mentioned in the other ticket FooBarPRDevOps_Webhook_1
fair better? Thanks
HI @jacobs-xero. Having an ID is required. It does not matter which format you choose. The most important thing is that they don't clash across the whole of the teamcity instance.
If I understand correctly, your VCS is being written to by TeamCity. This means that when the webhook is loaded it is...
For ticket #248 the same is happening, but because they don't let TeamCity update back into VCS, the issue occurs on every update.
So, for them the benefit is that using an id like FooBarPRDevOps_Webhook_1
allows them to manage uniqueness better, since the only place that can be managed is in their XML file before they commit it.
At this stage, these two tickets look very similar, but I think are actually different bugs. I am hoping to get some time on the weekend to do a deep dive on the issue you are seeing.
I suspect in both cases it is exacerbated by the nested projects that inherit from a VCS managed project, but I have yet to prove that.
Thinking out loud, if...
Then the cache will stay in sync with the config.
The IDs will be constantly changing which does have two at least two downsides...
I think I will also look into creating a task that checks that webhooks have IDs in the file. I can put up one of those notification bars along the top of the window to indicate there is an issue and that one should fix the IDs in the config file.
Also, I've started investigating whether I can support webhook config inside the settings.kts file. If that's not too hard, then I can apply logic at the time that file is loaded and mandate that they need IDs and that they don't clash.
I've managed to make some really good progress on this. I have found the following:
Updates to webhooks that are driven by the plugin work well The cache is maintained correctly because the changes are applied to the settings, and then the cache is updated.
Updates driven by VCS changes without IDs cause ghost webhooks This is because the existing webhooks are not updated because they are not found due to the ID being regenerated.
Also, I was not comparing the result of the existing webhooks against the updated webhooks. The only predictable way to compare them is by ID, but newly loaded webhooks will be assigned an ID if they don't have one defined. This will not match the existing one, so the old webhook is not updated. A new one is created.
I have added logic to compare the IDs after the cache is refreshed, and remove any webhooks from the current project if they don't match one of the new IDs. A line like the following is printed in the log.
[2024-08-05 09:16:35,341] INFO - jetbrains.buildServer.SERVER - WebHookSettingsManagerImpl :: Found the following webhooks in the cache that no-longer map to webhooks in project 'VcsProjectTest'. WebHooks with ids '[id_377052579, id_283758663]' will be removed from the cache.
This does have the downside that IDs will churn. This causes two annoyances.
Ideally, webhooks should have unique IDs. This will work around this issue.
TeamCity's buildTypeUnregistered
event is poorly named
In 2.0.1, I added a listener for the buildTypeUnregistered
event. This allows me to cleanup webhooks that might be interested in a build that has been deleted.
However, I have found that this event is fired when the VCS configuration is reloaded. TeamCity appears to re-read the VCS settings, delete all builds on that project, and then recreate them with the same internal and external IDs. At which point the buildTypeUnregistered
fires for each build.
When I see that event, I diligently clean up any webhooks that explicitly reference that build, and remove that build (btxxxx) id from their configuration.
This presents as webhooks having 0 builds registered. Sometimes webhooks still have builds showing. This is because a project edit (either via VCS or from the UI) fires up to 10 events in quick succession. I have a queue backed delay and de-bounce algorithm to only present one of each type of event.
I therefore only see one buildTypeUnregistered
and one projectRestored
, but they will fire at almost the same time. At which point a race condition exists as to whether the cache is refreshed while the build is marked as deleted or has been "undeleted".
I have removed the buildTypeUnregistered
event listener, and builds now reliably show for all webhooks.
I will do some further testing to make sure that not listening for this event has no problematic side-effects.
I'll do more testing on various configurations. I hope to have a very early beta release of 2.0.2 available for testing in 24 hours.
Builds on this branch are available to download on the jetbrains TeamCity instance. You don't need an account to download. Just choose guest mode.
Would it be possible to install this on your server and test it? I am pretty confident this will fix the issues you are seeing. If you do see further issues, I would appreciate any information you could share.
Thanks again for all your help thus far.
Hi @netwolfuk , thanks for the updated plugin to test. It looks like this 2.0.2
version has fixed the issue of ghost webhooks being created on VCS load/sync.
I did notice this behaviour. When the VCS would load, 1 webhook (always appeared to be the last on the list, but not confirmed if this is a pattern yet) would be missing Trigger on Events.
However, when I would click edit, the WebHook Config was as expected.
When I would hit refresh on the Webhooks & Templates page, the Build Events would appear correct and as expected.
I'll keep having a play around to see if I find any other strange behaviours.
Thanks
Ooh. That's an interesting bug. I don't think I've done anything that would have changed that.
Thanks for reporting it and I'll do some testing over the weekend.
Thanks again for all your great replies. I very much appreciate it.
Hi @jacobs-xero On the weekend I have built up two new servers with the intention of running them as a primary/secondary node setup.
Do you happen to be running a two server TeamCity setup? This is what we're seeing on #248 and it seems to be interfering.
Also, I noticed that editing project that is backed by VCS is unreliable. I tried to fix some build steps and my changes appeared to be saved, but were never persisted, and were never written back to git. Have you seen this behaviour, or do you mainly edit the project from the kotlin files and sync from VCS to TeamCity rather than the other way around?
Hi @netwolfuk , we only have the single server TeamCity setup.
We mainly edit the project via Kotlin files and sync from VCS to TeamCity. That's when we started seeing similar inconsistency with what was in the code vs UI (ghost webhooks and unchecked projects etc)
Hi @jacobs-xero Did you happen to notice if the webhooks fired on the events that were missing.
What I mean is... If the above build ran, did the webhook trigger for build started
?
Looking at the code, these are resolved just at the page is rendered, and just as the webhook runs, so there is a chance that the webhook will have picked up the correct setting when it fired.
You could have a look in the audit log. That would tell you when VCS settings reloaded, and then have a look at the build overview page (not a specific build) on the WebHooks Tab. It should show the events it tried to trigger for (see the bottom of this screenshot).
The table at the bottom of the above screenshot is rendered from the last 10,000 webhook events, which are then filtered to show just the ones from that build configuration. So if there have been lots of webhooks fired on other builds, then they might not show up in that table.
Hi @netwolfuk , I checked in with the team, and they've said that when the Build Events are missing, the webhooks stop firing.
Expected Behavior
Configured webhooks should be configured correctly and editable.
Current Behavior
Configured webhooks produce an error when trying to edit
Configured webhook has lost all it's settings
Under Versioned Settings, Loading project settings from VCS can re-create the webhooks with the correct settings. However leaves these "ghost" webhooks to persist until a service restart.
Extent of the breaking varies If a webhook is defined in code and that definition is not altered, it stays in place and can be edited, but it loses some configuration and needs to be fixed If a webhook is not defined in code (eg for subprojects, because we have no way of defining that in code), then the webhook just becomes uneditable
Steps to Reproduce (for bugs)
Your Environment