jwikman closed this issue 3 years ago
Same here. Maybe five times instead of ten, but it really can be tough to identify the source of the problem when this happens.
Can you go back and see some of the other failed instances? Is it always failing in this location? If not, can you provide other examples? Thanks
No need - I know what is wrong here. Will fix this in the next generic image.
The problem here is in start.ps1 in the generic image. Originally this was designed to ignore failures when installing/running BC in order to keep the container running and allow you to get access to the eventlog, reconfigure and restart the service tier etc. This means that any error happening during install is ignored. When building an image, this of course should not happen. I will deploy new generic images with this fix within the next few days.
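A minimal sketch of the kind of guard described above, assuming a flag that distinguishes a docker build from a normal container start. The function name `Invoke-InstallStep` and the `-BuildingImage` parameter are illustrative, not the actual start.ps1 code:

```powershell
# Hypothetical sketch: surface install errors when building an image,
# but keep the container alive when only running one.
function Invoke-InstallStep {
    param(
        [scriptblock]$Step,        # one install/config step from start.ps1
        [bool]$BuildingImage       # assumed flag; the real script's variable may differ
    )
    try {
        & $Step
    }
    catch {
        if ($BuildingImage) {
            # During "docker build" the error must propagate so the
            # build fails visibly instead of producing a broken image.
            throw
        }
        # At container run time, log and continue so the container stays up
        # and the eventlog can be inspected / the service tier restarted.
        Write-Host "Install step failed (container kept alive): $_"
    }
}
```

With `-BuildingImage $true` a failing step aborts the build; with `$false` it is only logged, matching the original "keep the container running" behavior.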
BTW - it could still be relevant to check whether the common error here is AuthorizationManager failed. If that is the case, then that is probably a timing issue. AuthorizationManager failed is thrown when a file is blocked or when Windows somehow has a lock on the file.
I have seen before that when copying stuff into a container, it sometimes takes a little while until things are available in the container.
I will make the first import-module (after copying) a little more resilient (wait 10 seconds and retry). This fix will be included in the next generic.
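A sketch of what such a resilient import could look like. `Import-ModuleResilient` and its parameters are hypothetical names for illustration; only the "wait 10 seconds and retry" behavior is from the thread:

```powershell
# Hypothetical sketch: retry Import-Module once after a delay, for cases where
# a freshly copied file is still blocked/locked (the typical cause of the
# "AuthorizationManager check failed" error).
function Import-ModuleResilient {
    param(
        [string]$Path,
        [int]$DelaySeconds = 10,
        [int]$MaxRetries = 1
    )
    for ($attempt = 0; $attempt -le $MaxRetries; $attempt++) {
        try {
            Import-Module $Path -ErrorAction Stop -Force
            return
        }
        catch {
            if ($attempt -ge $MaxRetries) { throw }   # give up after the last retry
            Write-Host "Error: '$($_.Exception.Message)', Retrying in $DelaySeconds seconds..."
            Start-Sleep -Seconds $DelaySeconds
        }
    }
}
```

The log line in the catch block mirrors the message mentioned later in this thread, so a retry is visible in the pipeline output.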
If you set this configuration setting:
{
  "genericImageName": "mcr.microsoft.com/businesscentral:{0}-dev"
}
Then you will get the generic 1.0.1.3 preview, likely to be shipped within the next few days.
Thanks Freddy
Since there was no error in the image rebuild pipeline when hitting this issue, it will be quite hard to find the last occurrences of this... When we have hit strange issues creating a container from an image, we have just dropped the image and recreated it. But this time I realized why we had faulty images and reported it here instead...
Since we are creating images in parallel, it's likely a timing issue where the different pipelines interfere with each other.
I suggest that we wait until this starts to throw errors instead, and then we'll report specific issues when we get any. Ok?
Ok, great - and if you are saved by the delay, you will see: Error: 'AuthorizationManager check failed.', Retrying in 10 seconds...
I realized that it was not that hard to find those occurrences after all. The image rebuild jobs that have this issue take about 5 to 7 minutes, while the jobs that succeed take 10+ minutes.
So I just looked through the last month of runs of this pipeline, and it turned out that this issue was more common than I first thought; it just hit localizations that we hadn't been using while they were faulty. I saw this 10+ times in the last month...
But all of them were with the "AuthorizationManager check failed." error - so hopefully your fix will save us from this!
gr8, and if the retrying doesn't work, we will have a look at why later...
BTW - let me know when you have run a few pipelines on the new -dev image, thanks. I have run all my tests; they work, and the changes aren't that big.
Last night I deleted all images and ran the pipeline - it created 30+ images without issues. And it was using Generic Tag: 1.0.1.3
But I could not find any retries after "AuthorizationManager check failed", so we probably need to wait a week or so until that shows up in the logs...
The nextminor images have been used in a lot of pipelines in our scheduled builds this morning, and they seem fine.
I saw your code change; nice and small. It should be fine.
I will roll out 1.0.1.3 to public this week, thanks.
Describe the issue
We're running a scheduled pipeline every night that recreates images for all localizations if needed (when a new version is released). Now and then there are some "dancing errors" while an image is being created, but the image is still created. If I rerun the same pipeline to try again, it has always worked, hence the "dancing errors" term... ;-)
I would like the image build to fail if there are any issues, whatever they are, when the image is built.
As it is right now, the image is created (even if, for example, no BC service is installed), and then all our pipelines that use this image fail with strange errors. If there instead were no image for the localization a pipeline runs against, the image would be created on-the-fly (and would fail if the second image creation fails as well, but that is rare).
I got an example of this last night.
Scripts used to create container and cause the issue
Not that meaningful, since running the same script again works...
Full output of scripts
I don't think the full output is meaningful, since solving this particular error is not the purpose here; it's the general error handling that is missing. Let me know if you want the full output anyway... An example of the creation of an image that had an error is below.
As you can see above, something failed in step 5 (AuthorizationManager check failed), but it continued to create the image. Now we've got an image that has no BC installed...
Additional context