Closed peterzhuamazon closed 2 years ago
20220204: FPM is having some issues along the way:
%files
section instead of owning folders, so erase of the package will not do proper clean up.--rpm-summary
for RPM, and debian has a different command, which is not suitable for the usages.--rpm-os
is needed as another specific command that debian version does not own.Overall, it is a great tool for quick building an RPM without spec file. However, for our production usages I think using rpmbuild directly probably better here.
Still testing.
It can finally run now with rpmbuild
integrated:
PR add rpm block in assemble workflows: https://github.com/opensearch-project/opensearch-build/pull/1726
PR add rpm block in assemble workflows: https://github.com/opensearch-project/opensearch-build/pull/1726
List of issues:
20220324:
20220324:
OSD now support building from opensearch-build for rpm:
New issue 20220407:
In 2.0 core branch we have this change from main, and it is not backported to 1.3 branch: https://github.com/opensearch-project/OpenSearch/pull/875
In 1.3 branch, due to above change to check source
condition does not exist, we are forced to add these temp fix so that 1.3.x can install plugin zips correctly:
https://github.com/opensearch-project/opensearch-build/blob/main/src/assemble_workflow/bundle_rpm.py#L66-L76
https://github.com/opensearch-project/opensearch-build/blob/main/src/assemble_workflow/bundle_rpm.py#L91-L94
In 2.0 because of this change, the above fix in 1.3.x is actually causing 2.0.x to not able to install plugins. The if block needs to have actual code inside, and since our temp fix comment out source, it cause syntax error in if block:
2022-04-08 02:10:11 INFO Executing "/tmp/tmpmwqeotv_/opensearch-2.0.0-alpha1/bin/opensearch-plugin install --batch file:/tmp/tmpmwqeotv_/opensearch-job-scheduler-2.0.0.0-alpha1.zip" in /tmp/tmpmwqeotv_/opensearch-2.0.0-alpha1
/tmp/tmpmwqeotv_/opensearch-2.0.0-alpha1/bin/opensearch-env: line 98: syntax error near unexpected token `fi'
Traceback (most recent call last):
File "./src/run_assemble.py", line 58, in <module>
sys.exit(main())
File "./src/run_assemble.py", line 44, in main
bundle.install_components()
File "/var/jenkins/workspace/distribution-build-opensearch@4/src/assemble_workflow/bundle.py", line 77, in install_components
self.install_plugin(c)
File "/var/jenkins/workspace/distribution-build-opensearch@4/src/assemble_workflow/bundle_opensearch.py", line 22, in install_plugin
self._execute(f"{cli_path} install --batch file:{tmp_path}")
File "/var/jenkins/workspace/distribution-build-opensearch@4/src/assemble_workflow/bundle.py", line 106, in _execute
subprocess.check_call(command, cwd=self.min_dist.archive_path, shell=True)
File "/usr/local/lib/python3.7/subprocess.py", line 363, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '/tmp/tmpmwqeotv_/opensearch-2.0.0-alpha1/bin/opensearch-plugin install --batch file:/tmp/tmpmwqeotv_/opensearch-job-scheduler-2.0.0.0-alpha1.zip' returned non-zero exit status 2.
Proposed fix: Backport this change from 2.0 into 1.3 branch of OpenSearch core. When we release 1.3.x rpm we will be able to consume this change. And since this change is only for rpm not for tar in opensearch core, it wont affect the released tar/docker artifacts.
We can then remove the temp fix code in build repo for both 1.3.0 and 2.0.0 to success.
Thanks.
@dblock @CEHENKLE @saratvemulapalli Do you have any suggestions for the above issue?
@peterzhuamazon @bbarani I did take a look at the PR. From what I understand, it make the standard path configurable to run multiple OpenSearch instances on the same host.
I don't see any harm to have this backported to 1.3 line. As it was called out in the PR, lets make sure this is documented for 1.3 line (more specifically 1.3.2 where this change will be going into).
@bbarani @tianleh @saratvemulapalli Backport this commit to 1.3 branch, means 1.3.2 now has this change. We have three options right now:
If we still want to release 1.3.1 then we have to hardcode different options in src/assemble_workflow/bundle_rpm.py
for 1.3.1 and non-1.3.1 versions. If 1.3.1, comment out source; if not, skip backup/comment.
If we still want to release 1.3.1, but ok to create a temp branch such as 1.3.1-temp or 1.3.1-rpm
in core repo, we can cherry pick the 1.3.1 tag and only backport this commit. We will also create a temp 1.3.1-temp or 1.3.1-rpm
manifest file only for this branch. This is a one-time setup and wont happen later. We just need to remove the backup/comment code in src/assemble_workflow/bundle_rpm.py
.
If we are ok to release 2.0.0 first, then 1.3.2, completely ignore 1.3.1 rpm release then we can just backport the change to 1.3 core branch, and remove the backup/comment code in src/assemble_workflow/bundle_rpm.py
.
Thanks.
@tianleh @saratvemulapalli @dblock @CEHENKLE @anasalkouz Which option do you suggest to proceed further?
I vote for (3) because there are no hacks. Generally I am against trying to do an RPM release for something that has already shipped (e.g. 1.3.1). We should only be moving forward.
+1 to @dblock. I like 3, as it looks clean. In terms of the product, did we commit to release 1.3.1 RPM?
We have committed to releasing 1.3.x RPM but we didn't commit to a specific version. @CEHENKLE @elfisher @setiah Are you ok with delaying the RPM for 1.3.x release and publish it after 2.0 (RC1) release?
@bbarani can you comment on the forum thread point people here, asking for their input?
I vote for (3) because there are no hacks. Generally I am against trying to do an RPM release for something that has already shipped (e.g. 1.3.1). We should only be moving forward.
I'm leaning towards 1.3.1-temp branch. If we go this route, we should be able to release RPM in mid-April. If we wait for 1.3.2, then we can't release until 1.3.2 is released (which is ~30 days). This gives us a short lived branch which can be deleted after 1.3.2 goes out, and sits a lot better with me than hardcoding something in our build script.
(BTW, another option is to release 1.3.2 early, but depending on if they plugins have changes, I'd hate to add that to the work for 2.0)
I'm leaning with 1.3.1-temp. Since this is the first release of RPM on OpenSearch, I think it is okay with it to be out of band from the regular process and waiting for 1.3.2 will be too long. Also To @CEHENKLE's point, if there was a way to do 1.3.2 early, that would also help with getting this out sooner, but I don't know how feasible that will be.
I'm leaning with 1.3.1-temp. Since this is the first release of RPM on OpenSearch, I think it is okay with it to be out of band from the regular process and waiting for 1.3.2 will be too long. Also To @CEHENKLE's point, if there was a way to do 1.3.2 early, that would also help with getting this out sooner, but I don't know how feasible that will be.
The issue is that 1.3.whatever is not going to be 1.3.1 binaries + RPM distribution, it will be built from source. Effectively creating 2 different 1.3.1s which I think is worse.
Would it be possible to rename the WIP 1.3.2 to 1.3.3, and re-release 1.3.1 + RPM as 1.3.2?
I guess this is a first heard from me that there would even be a 1.3.X (where X is defined) release number of when an RPM would be released. I had seen in the 1.3.0 release notes that RPM was removed from 1.3.0 and no update was given as to the release of the feature. I also saw in the forums that there was no definitive release where RPMs would be available and no estimated date, but that when it was it determined, it would be updated. I did see some type of a commitment that there would be some 1.X release with RPM, and since there doesn't seem to be a 1.4 planned, I guess it makes some sense it would be 1.3.X. I can't find the release date for the 1.3.X series (other than the release 1.3.0) on https://opensearch.org/blog/partners/2022/02/roadmap-proposal/ or https://github.com/orgs/opensearch-project/projects/1 or in the announcements in the forums https://discuss.opendistrocommunity.dev/c/announcements/5, so I apologize if there is some post somewhere where I should be able to find this but can't. I surmise from the discussion that 1.3.1 is planned before 2.0.0RC1 (2022 April 26, per the github roadmap) and 1.3.2 some time after that, but no clear indication on when that might be. Per @elfisher, I further surmise it will be a while for 1.3.2.
While the software engineer in me yearns for clean and (3) appears to be the cleanest of the solutions, we live in an imperfect world. If we were still pre 1.0 GA and were trying to select which implementation to launch with, I think (3) fits the bill. However, while (3) gives the cleanest release code, it also perpetuates the problem of the missing RPMs for the entire duration we get to hold the moral high-ground of living without any less than perfect solutions. Of course, the irony is we do so by giving no solution to the end-users for the feature. I would urge an expedient as prudent release of the feature. Like most things in life, we learn by doing, and deferring this will likely lead to deferring the identification and resolution of other issues that will no doubt need to be addressed after this one. I +1 @elfisher's vote for (2), understanding it is less than ideal.
@justchris1 https://github.com/orgs/opensearch-project/projects/13 This is the panel we use to track distributions.
Thanks.
Please don’t make 1.3.1-temp branches. By the time you’ve finished (2) nobody will remember what happened and it will be impossible to reproduce. Why are we making our lives harder with hacks upon hacks? Just release 1.3.2 which is 1.3.1 + rpm distribution, a new feature. There are code changes, it’s not 1.3.1. What’s the big deal to make that 1.3.2 on the same schedule and make 1.3.3 what we planned for 1.3.2?
If we had a security vulnerability we’d just make a 1.3.2 release now in automated orderly fashion, we do this all the time.
@dblock Fast tracking the patch release schedule to add RPM is a hack to patch release cadence as well. We do schedule emergency patch releases for critical security vulnerabilities and high severity bug fixes. I understand that we are fixing bugs related to RPM generation here but does it warrant a new release? We are talking about releasing new version for both OpenSearch and OpenSearch Dashboards to support this change even though this code change is specific to OpenSearch only. Releases are expensive (From developer and also from consumer perspective) and I’m ok with fast tracking the patch release if the situation warrants it ( and there are no feasible workarounds)
@peterzhuamazon Thanks - I have been loosely tracking that. Do you happen to know if there is somewhere I can find information about planned release dates for 1.3.1, 1.3.2, or any 1.3.X releases? It seems like some people on the thread know of some of these dates (either notionally or with real plans), but I can't find them anywhere.
@dblock I understand that we are fixing bugs related to RPM generation here but does it warrant a new release?
Yes it does because it's not the same code.
What’s the big deal to make that 1.3.2 on the same schedule and make 1.3.3 what we planned for 1.3.2?
Are you suggesting that we have a simultaneous 1.3.1 and 1.3.2 release? I guess I could see that alleviating the need for some different paths like (1) or (2), but I was assuming that was not workable per some earlier comments. I wonder what the use of 1.3.1 would be if 1.3.2 was released simultaneously. It seemed that if we did (3), we would be further punting RPM support to an indeterminate date to the right again. (Not that RPM has any determinate release after being punted from 1.0, 1.1, 1.2, and 1.3 releases - not to mention any release plan per the release notes of 1.3).
I think the question illustrates the kind of confusion we have if we release RPM 1.3.1, 1.3.1 has already been released - https://opensearch.org/docs/latest/version-history/
I don't think we should punt anything, all I am saying is that we should release 1.3.2 TAR and RPM as soon as that's ready. That's the only logical path in my brain, I don't know how to grok "1.3.1 was released March 30th, then 1.3.1 RPM is released April 15th at the same time as taxes were due except when in 2020 there was an extra 15 days grace period" :)
I think the question illustrates the kind of confusion we have if we release RPM 1.3.1, 1.3.1 has already been released - https://opensearch.org/docs/latest/version-history/
Oh, I see. I can't use this project until there is real support from RPMs so I didn't even notice the 1.3.1 release.
I don't think we should punt anything, all I am saying is that we should release 1.3.2 TAR and RPM as soon as that's ready.
Yeah, I guess I can't argue with 'as soon as that's ready' except to note that the community desperately needs this feature, it is the largest missing element/feature that never made it over from Elasticsearch, and it is long promised but seems to be continuously deprioritized. As Kyle stated in the forums in November 2021 "RPM and DEB are super vital to serious use for a lot of folks", which I agree completely.
So just to frame the question here. There are three dimensions we can optimize against:
1) RPM is a critical project. The absence of RPM causes real pain. So we want to get it out as soon as possible. 2) We don't want to slam a hack into production that will cause us more pain down the line. 3) We firmly believe in progress, not perfection. If there's something that solves our needs, but could be better, let's get it out and improve it.
Reading the comments, @dblock it seems like you think the second option (a temp branch) violates number 2 above so much that it outweighs the benefits to 1 and 3. Other voices (mine included) seem to think the second option balances the three concerns the best. Do you have another option that we haven't considered that balances the three concerns better? Or do you still think waiting for 1.3.2 (which optimizes for 2, but is less effective on 1 and 3) is the best option?
(and just a side note: Making 1.3.1 available means making the current version of the 1.x line available in RPM, rather than "re-releasing". Will they always be available on the same day going forward when we release? yes. Should we wait to make that true starting now? I don't think so, but that's how my brain squares that circle, dB).
Just catching up here.
Making a 1.3.2 because it has RPM is violation of semver as you're treating it like a feature. If you don't rev the version, then it's not a feature, just like if we added a new build to support RISC-V or something. Patches can only fix bugs - this isn't a bug.
Hey folks. -- I've slept on it, and we'll make the first RPM 2.0-RC1, and then follow up with a release on 1.3.2. I hate the delay viscerally, but I can't justify the hackery/extra churn on the EE team for the time saved.
Thanks. /C
Hey folks. -- I've slept on it, and we'll make the first RPM 2.0-RC1, and then follow up with a release on 1.3.2. I hate the delay viscerally, but I can't justify the hackery/extra churn on the EE team for the time saved.
Thanks. /C
Don't worry. No one will be too surprised.
Just catching up here.
Making a 1.3.2 because it has RPM is violation of semver as you're treating it like a feature. If you don't rev the version, then it's not a feature, just like if we added a new build to support RISC-V or something. Patches can only fix bugs - this isn't a bug.
So should we do a 1.4.0 then? I don't mean to throw us in a loop, ... but.
Isn't adding RPM in 1.3.1 the same problem as in 1.3.2? So should it be 1.4.0?
Two backports:
- [1.x backport] Fix opensearch-env always sources the environment from hardcoded file in #875 OpenSearch#2908
- [1.3 backport] Fix opensearch-env always sources the environment from hardcoded file in #875 OpenSearch#2909
Remove Hacks:
Looks like we finally backported the PR that was the source of contention to 1.3. Does this imply the plan is to release 1.3.1 version rpm along with the 2.0.0 rc1 rpm?
Hi @setiah option 3 is what we are going with: https://github.com/opensearch-project/opensearch-build/issues/1545#issuecomment-1099632988
Thanks.
@dblock So, IIRC, the original intention of having the distributions tracked away from numbered versions in the roadmap is that they would not have an effect on the actual logic/features of OpenSearch and could be released out of phase with any particular version. That's fine. No problem with SemVer. Use build metadata that is outside the scope of major.minor.patch.
As you're suggesting revving the patch to accommodate this fix doesn't fit the word nor the spirit of SemVer (PATCH version when you make backwards compatible bug fixes ... not this situation at all). Revving the minor doesn't really fit either (MINOR version when you add functionality in a backwards compatible manner ... not really either). I'm suggesting that the project use the build metadata appropriately as per spec. For someone who doesn't care about RPM this version is meaningless. Organizations have policies they have to follow as far as adopting the latest patch, so this is creating waste heat for those users.
I'm uncomfortable with the idea of bringing builds back into the numbered versioning scheme yet tracking them in a different place - it's really violating the trust users have in our roadmap and our versioning. All the (rightful) hubbub about RPM being in 1.3.0 roadmap is also meaningless since it was made clear that the distributions shouldn't have been in the numbered version roadmap, and now the project is attaching RPM to a patch version?
Thoughts @elfisher @CEHENKLE ?
Oh, I see. I think 1.3.2 should also get some low hanging CVEs.
@dblock So, IIRC, the original intention of having the distributions tracked away from numbered versions in the roadmap is that they would not have an effect on the actual logic/features of OpenSearch and could be released out of phase with any particular version. That's fine. No problem with SemVer. Use build metadata that is outside the scope of major.minor.patch.
As you're suggesting revving the patch to accommodate this fix doesn't fit the word nor the spirit of SemVer (PATCH version when you make backwards compatible bug fixes ... not this situation at all). Revving the minor doesn't really fit either (MINOR version when you add functionality in a backwards compatible manner ... not really either). I'm suggesting that the project use the build metadata appropriately as per spec. For someone who doesn't care about RPM this version is meaningless. Organizations have policies they have to follow as far as adopting the latest patch, so this is creating waste heat for those users.
I'm uncomfortable with the idea of bringing builds back into the numbered versioning scheme yet tracking them in a different place - it's really violating the trust users have in our roadmap and our versioning. All the (rightful) hubbub about RPM being in 1.3.0 roadmap is also meaningless since it was made clear that the distributions shouldn't have been in the numbered version roadmap, and now the project is attaching RPM to a patch version?
Thoughts @elfisher @CEHENKLE ?
+1, I agree the distribution release process should remain independent of the version release process. This helps with not just the RPM but other distributions such as Windows, Debian as well in future. Else, we could be falling prey to such issues down the line again while releasing other distributions.
@setiah except it's not separate in practice, the rpm process is implemented in a tightly coupled way to the tar release process
Python code can now generate RPMs for both OS and OSD on every version pass 1.3.0.
20220429: Remove PA not properly stopped before install/uninstall
20220204: We might change to rpmbuild directly without using FPM. See comments.