Closed lorengordon closed 1 year ago
I don't want to get into the weeds of how the sausage gets made, but to give some insight - at the time that we cut 1.247352.0 for release, the CloudWatch agent could not be built on anything past 1.16, but not putting the upper bound would have it attempt to build on 1.17 (thus blocking release). We've since upgraded our dependencies and moved to 1.18 to stay on a supported version of Go.
Strange. I am pretty sure I built it previously using 1.16 and it was fine. I'm building on/for RHEL, so I just installed golang from EPEL. It was 1.16 until recently. It is now 1.17.
Edit: No, checked the build logs. As recently as 5 July, it was golang 1.17 in EPEL and was still building v1.247350.0 fine.
Building on 1.16 would be fine. The problem is outdated dependencies that the agent uses under-the-hood would not build on Go 1.17, so we were stuck juggling releasing v352 with ripping everything out and making it all work on Go >= 1.17. Apologies for the inconvenience. Just wanted to shed some light on why the upper limit was imposed. IIRC we removed it after that for 1.247353.0 since we jumped to Go 1.18 by then.
Alright, I can wait for 1.247353.0 to show up in Amazon Linux 2 and retry. Any idea when the new version will get published as an rpm to the Amazon Linux 2 repos?
I think mid-August is when you should expect a new version to be out, though that RPM is actually for one version further, v1.247354.0, just so you don't get blindsided by it when that happens - no go mod changes in that one.
So I saw the tag hit for 1.247354 (and then 1.247355), but I'm not seeing a GitHub Release for either. Is that normal? I also noticed a new version of the Amazon Linux 2 docker image landed a few days ago, so figured I'd give the build another try, but it appears it is still pulling 1.247352.0?
I think we've just been slacking on cutting proper GitHub releases. I'm a little surprised about the v352 release being "new" for the Docker image though. Where did you pull it from? We definitely should have published a v354 image. Peeking behind the curtain a little, part of our release is updating the container insights repo, and then publishing it + running validation on it, so I'm fairly certain that it should be out.
We spin up the Amazon Linux 2 container, then use yumdownloader to grab the source rpm for each of the tools we want to rebuild. We are not pinning the version, so it should just be grabbing the latest. Here's the error from rpmbuild:
error: Failed build dependencies:
golang < 1.16.0 is needed by amazon-cloudwatch-agent-1.247352.0-1.el7.x86_64
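For anyone hitting the same error, one way to see which golang constraints a spec declares is to grep its BuildRequires lines. This is only a minimal sketch: the function name and the sample spec contents are made up for illustration and are not taken from the real amazon-cloudwatch-agent.spec.

```shell
#!/bin/sh
# Print every BuildRequires line in a spec file that mentions golang.
# (golang_build_requires is a hypothetical helper name.)
golang_build_requires() {
    grep -E '^BuildRequires:[[:space:]]*golang' "$1"
}

# Demo on a made-up spec fragment, not the real spec from the srpm.
spec="$(mktemp)"
cat > "$spec" <<'EOF'
Name: example-agent
BuildRequires: golang >= 1.15.0
BuildRequires: golang < 1.16.0
BuildRequires: make
EOF
golang_build_requires "$spec"
rm -f "$spec"
```

Running it against the spec unpacked from the source rpm (typically under ~/rpmbuild/SPECS) should show whether an upper bound is present before rpmbuild is attempted.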
docker pull amazon/cloudwatch-agent:1.247354.0b25198101
Error response from daemon: manifest for amazon/cloudwatch-agent:1.247354.0b25198101 not found: manifest unknown: manifest unknown
Huh. Well I guess that's worth looking into. I'm not sure how I didn't see an error during release.
Oh are you pulling from the yum repository? That's like a whole separate thing.
Yeah, we're rebuilding several packages. So that seemed like the easiest and most standard way to get everything. And we just trigger builds on when the Amazon Linux 2 image is updated. Instead of having custom build logic per package...
Yeah that makes sense. I think we're missing the last two releases in Amazon Linux 2. I'll follow up on that.
As for the Docker pull, I had extra chars at the end which explain that:
docker pull amazon/cloudwatch-agent:1.247354.0b251981
1.247354.0b251981: Pulling from amazon/cloudwatch-agent
d875800c7401: Pull complete
265f36118970: Pull complete
a91d9a823c97: Pull complete
Digest: sha256:33f0072c93d614b5dd32f044549f3d764d05a42f068e852e94bdd849098852c7
Status: Downloaded newer image for amazon/cloudwatch-agent:1.247354.0b251981
docker.io/amazon/cloudwatch-agent:1.247354.0b251981
So the newest image for v354 does exist in DockerHub / Public ECR.
Now, as for the v355 tag that you noticed, what happens is we tag it on GitHub, and then start the release process so you should probably expect a v355 to be released to S3/Docker in the next few weeks. But publishing to Amazon Linux's YUM repo is not as simple and not entirely controlled by us.
As a possible workaround, are you able to pull an older version of Golang so that the image build works for 352?
Yess-ish... Right now we're just using yum install golang to install from epel, and the yum repo does not provide older versions. So we'd have to change that install mechanism to something that lets us specify the version... What I've done before to make that super easy is just a multi-stage docker build and copy over /usr/local/go and /go...
But of course that might cause other problems, for other packages... Ick. Might end up needing per-package logic after all.
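The multi-stage approach mentioned here could look roughly like the sketch below, which writes a Dockerfile from a shell helper so the Go version is a parameter. The helper name, the stage names, and the choice of golang:1.19.1 and amazonlinux:2 as base images are assumptions for illustration; the only parts taken from the thread are copying /usr/local/go and /go and setting PATH and GOPATH.

```shell
#!/bin/sh
# Hypothetical helper: write a multi-stage Dockerfile that copies a pinned Go
# toolchain from the official golang image into an Amazon Linux 2 build image.
write_dockerfile() {
    go_version="$1"   # assumed to be a valid tag of the golang image, e.g. 1.19.1
    out="$2"          # path of the Dockerfile to write
    cat > "$out" <<EOF
FROM golang:${go_version} AS golang

FROM amazonlinux:2 AS builder
# The official golang image keeps the toolchain in /usr/local/go and uses /go as GOPATH.
COPY --from=golang /usr/local/go /usr/local/go
COPY --from=golang /go /go
ENV GOPATH=/go
ENV PATH=/go/bin:/usr/local/go/bin:\$PATH
RUN go version
EOF
}

# Demo: write the Dockerfile to a temporary file and print it.
dockerfile="$(mktemp)"
write_dockerfile 1.19.1 "$dockerfile"
cat "$dockerfile"
rm -f "$dockerfile"
```

Note that this only puts a go binary on the PATH. It does not register a golang rpm in the image, so any package whose spec declares BuildRequires on golang will still need separate handling.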
Sorry for the trouble this is causing you. v354 should be staged for the next release of Amazon Linux 2 (missed the last cutoff by like 2 days unfortunately), so I think early September is when this should be resolved for you
Thanks for all the follow-up and communication! Really appreciate it!
Just so you know that I haven't forgotten :) I just checked and it hasn't rolled out yet. sudo yum install amazon-cloudwatch-agent still installs v352 right now. I will check in on it again next week. Sorry for the delay. AL2 updates aren't controlled by our team.
Haha just yesterday I reran the build to see if it would pass yet (it didn't). Thanks for the confirmation!
I have been swamped so I forgot to check until now. It's definitely updated.
{
"status": "running",
"starttime": "2022-09-20T19:51:26+0000",
"configstatus": "configured",
"cwoc_status": "stopped",
"cwoc_starttime": "",
"cwoc_configstatus": "not configured",
"version": "1.247354.0b251981"
}
welp, yes, that part is good now. just instead of success, the error has changed :sob: :
#0 0.061 ~/rpmbuild/SOURCES ~/rpmbuild
#0 0.094 amazon-cloudwatch-agent.spec
#23 0.317 amazon-cloudwatch-agent.tar.gz
#23 0.317 62666 blocks
#23 0.319 ~/rpmbuild
#23 0.384 error: Failed build dependencies:
#23 0.384 golang >= 1.18.3 is needed by amazon-cloudwatch-agent-1.247354.0b251981-1.el7.x86_64
unfortunately epel7 has an older version of golang, 1.17.12:
#19 37.11 ---> Package golang.x86_64 0:1.17.12-1.el7 will be installed
so guess i really am going to have to update this process to track a specific golang version...
Sigh.. sorry about that. We jumped up to Go 1.18 to catch up with telegraf, so not much we can do on that front.
Ok, well now we're in new territory. I've gone ahead and installed a specific version of golang directly and made sure it is in the PATH:
#25 [builder 8/21] RUN go version
#0 0.078 go version go1.19.1 linux/amd64
Beautiful.
But still:
#29 0.510 error: Failed build dependencies:
#29 0.510 golang >= 1.18.3 is needed by amazon-cloudwatch-agent-1.247354.0b251981-1.el7.x86_64
So. I assume the spec file is still using BuildRequires, and that is actually looking for an installed rpm of the correct name and version, instead of just using the binary available in the PATH? (My install was just grabbing the docker image for golang, copying over the bits to my image, and updating PATH and GOPATH. Which should be enough imo.)
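Since rpmbuild resolves BuildRequires against installed packages rather than against whatever is on the PATH, a build script that bypasses that check may want its own sanity test that the go binary is new enough. A minimal sketch, assuming GNU sort with -V is available; both function names are hypothetical, and the sample string is the go version output shown earlier in this thread:

```shell
#!/bin/sh
# Return success when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort from GNU coreutils).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract "1.19.1" from a line such as "go version go1.19.1 linux/amd64".
go_version_from_output() {
    printf '%s\n' "$1" | sed -n 's/^go version go\([0-9][0-9.]*\).*/\1/p'
}

required="1.18.3"   # minimum named in the rpmbuild error
found="$(go_version_from_output "go version go1.19.1 linux/amd64")"
if version_ge "$found" "$required"; then
    echo "go $found satisfies >= $required"
else
    echo "go $found is older than $required"
fi
```

In a real build the sample string would be replaced with "$(go version)". Pre-release versions such as release candidates are not handled by this parser.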
Maybe it is hammer time. Yolo right?
sed -i "/BuildRequires: golang/d" "$SPEC"
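A slightly more cautious variant of the same hammer, as a sketch: anchor the pattern to the start of the line so only golang BuildRequires entries are dropped, and keep a copy of the original spec for comparison. The function name and the .orig suffix are assumptions, and the sample spec is made up.

```shell
#!/bin/sh
# Remove golang BuildRequires lines from a spec file, keeping a .orig backup.
# Writes via the backup instead of using `sed -i`, so it does not depend on GNU sed.
strip_golang_build_requires() {
    spec_file="$1"
    cp "$spec_file" "$spec_file.orig"
    sed '/^BuildRequires:[[:space:]]*golang/d' "$spec_file.orig" > "$spec_file"
}

# Demo on a made-up spec fragment.
demo_spec="$(mktemp)"
cat > "$demo_spec" <<'EOF'
Name: example-agent
BuildRequires: golang >= 1.18.3
BuildRequires: make
EOF
strip_golang_build_requires "$demo_spec"
cat "$demo_spec"
rm -f "$demo_spec" "$demo_spec.orig"
```

rpmbuild also has a --nodeps option that skips the build-dependency check entirely, but that disables the check for every dependency, not just golang.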
Ok I think I'm unblocked on this now. I'll go ahead and close it. Not sure if you want to take an action item for some kind of improvement to the spec file, to make it possible to build the rpm without an rpm-packaged golang install. Appreciate all your support and communication, thanks!
Describe the bug
I was building an rpm for the CloudWatch Agent, using the spec from the srpm in Amazon Linux 2, and found that the spec file is setting an upper limit of golang that requires < 1.16.0 to build the rpm. That seems wrong, since this repo is declaring go 1.18 in go.mod.
Steps to reproduce
What did you expect to see?
I expected to be able to build the rpm with a newer version of golang. When I built v1.247350.0 previously, it worked fine. And I checked the prior srpm and it does not specify the upper limit in BuildRequires. So this is new to the spec for v1.247352.0.