Closed nmaludy closed 2 years ago
Raised https://github.com/StackStorm/st2-packages/pull/665 for failing to install previous stable version with one-line installer and -v (--version works fine though). I don't think it's a blocker though, but fix in PR.
missing changelog in st2chatops: https://github.com/StackStorm/st2chatops/issues/158 also st2web's changelog appears to be totally unused
@nmaludy Some of those st2web PRs aren't PRs merging into master, but merging into a feature branch, e.g. https://github.com/StackStorm/st2web/pull/807 - they are the merges into the workflow composer feature branch. Not all of them but quite a few...
https://github.com/StackStorm/st2-packages/pull/666 raised for the fact that stable tries to install 3.2 with 3.3 scripts.
Need to ensure that people migrating from MongoDB 3.4 (previously supported version) follow the upgrade path:
3.4 -> 3.6 -> 4.0
https://docs.mongodb.com/manual/release-notes/3.6-upgrade-standalone/ https://docs.mongodb.com/manual/release-notes/4.0-upgrade-standalone/
submitted issue: https://github.com/StackStorm/st2docs/issues/1026
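A minimal sketch of the upgrade path above, assuming the only starting points we care about are 3.4, 3.6 and 4.0. The function name `mongo_upgrade_path` is hypothetical and not part of any StackStorm or MongoDB tooling; it only prints the order of the hops. Per the MongoDB upgrade notes linked above, featureCompatibilityVersion has to be raised to 3.6 before the 4.0 binaries are installed, which is why the steps cannot be skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the MongoDB releases to install, in order,
# to get from the given version to 4.0.
mongo_upgrade_path() {
  local current="$1"
  case "$current" in
    3.4) echo "3.6 4.0" ;;
    3.6) echo "4.0" ;;
    4.0) echo "" ;;   # already on the target version, nothing to do
    *)   echo "unsupported starting version: $current" >&2; return 1 ;;
  esac
}

# One line per hop for a host that is currently on MongoDB 3.4.
for target in $(mongo_upgrade_path 3.4); do
  echo "upgrade to MongoDB ${target}, then set featureCompatibilityVersion to ${target}"
done
```

The actual package upgrade and the `setFeatureCompatibilityVersion` admin command are left out on purpose; follow the linked release notes for those.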
Testing so far on CentOS8 after a bash single line install good. No problems found with UI. Planning to do an upgrade from CentOS 7 3.2.0 bash single line install -> 3.3dev
I tested CentOS 7 from 3.2.0 -> 3.3.0 and found no issues there (puppet-st2 managed).
CentOS 8 ansible install all good.
CentOS 7 ansible looking good too.
Ubuntu 16.04 ansible install looked good. I found a problem on the web-ui with responding to inquiries and not selecting a field. But have also verified that it is a legacy problem and exists on my 3.2.0 install (https://github.com/StackStorm/st2web/issues/809)
Also verified the dig fix on CentOS 8. Reproduced failure on 3.2.0, and passed on 3.3.0dev (after installing bind-utils which wasn't installed as a dependency...). Not sure if we want to install that by default, but had to install manually.
Vagrant commands to test various installs (used on my RHEL box, so drop --provider=libvirt if you want to use VirtualBox):
BOX=centos/7 RELEASE=unstable VERSION=3.3dev vagrant up --provider=libvirt
BOX=centos/8 RELEASE=unstable VERSION=3.3dev vagrant up --provider=libvirt
BOX=generic/ubuntu1604 RELEASE=unstable VERSION=3.3dev vagrant up --provider=libvirt
BOX=generic/ubuntu1804 RELEASE=unstable VERSION=3.3dev vagrant up --provider=libvirt
Testing the boxes after they're up:
$ vagrant ssh
[vagrant@st2vagrant ~]$ sudo su -
[root@st2vagrant ~]# sudo ST2_AUTH_TOKEN=$(st2 auth st2admin -p 'Ch@ngeMe' -t) /opt/stackstorm/st2/bin/st2-self-check
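For anyone repeating this, the four vagrant invocations above can be generated instead of copy/pasted. This is only a sketch: the function name `vagrant_up_commands` is made up, the box list is copied from the commands above, and it prints the commands rather than running them, so nothing gets started by accident.

```shell
#!/usr/bin/env bash
# Box names copied from the commands above.
BOXES="centos/7 centos/8 generic/ubuntu1604 generic/ubuntu1804"

# Hypothetical helper: print one "vagrant up" command per box.
# Pass an empty provider to leave out --provider (e.g. for VirtualBox).
vagrant_up_commands() {
  local release="$1" version="$2" provider="$3" box
  for box in $BOXES; do
    if [ -n "$provider" ]; then
      echo "BOX=${box} RELEASE=${release} VERSION=${version} vagrant up --provider=${provider}"
    else
      echo "BOX=${box} RELEASE=${release} VERSION=${version} vagrant up"
    fi
  done
}

vagrant_up_commands unstable 3.3dev libvirt
```

Calling `vagrant_up_commands unstable 3.3dev ""` gives the same list without the libvirt flag.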
@nmaludy Did you test on Ubuntu 20.04? I thought we had problems with its newer Python 3? (Supported Ubuntus should be 16.04 and 18.04 I believe for 3.3.)
@amanda11 no i haven't yet, sorry copy/paste fail, i'll change that to 1604
Self-check passed on all the various boxes!
SELF CHECK SUCCEEDED!
st2-self-check succeeded.
#############################################################
################################################### #######
############################################### /~\ #####
############################################ _- `~~~', ####
########################################## _-~ ) ####
####################################### _-~ | ####
#################################### _-~ ; #####
########################## __---___-~ | #####
####################### _~ ,, ; `,, ##
##################### _-~ ;' | ,' ; ##
################### _~ ' `~' ; ###
############ __---; ,' ####
######## __~~ ___ ,' ######
##### _-~~ -~~ _ ,' ########
##### `-_ _ ; ##########
####### ~~----~~~ ; ; ###########
######### / ; ; ############
####### / ; ; #############
##### / ` ; ##############
### / ; ###############
# ################
Ansible install on Ubuntu 18.04 successful, and quick run-through on UI good.
Also did a quick chatops test with slack yesterday on one of the platforms (now forgot which O/S!)
Did a few manual tests and picked up a couple platforms to try. Here is the report:
EL6 should not install
Mistral & PostgreSQL should not install
EL7 install and script flow check
U18 install and script flow check
EL7 vs U18 - the message is very visible and helpful :+1:
3.2dev -> 3.3dev. While the upgrade went well, we'll still need to provide the st2docs migration scripts to remove both mistral and postgresql: https://docs.stackstorm.com/install/upgrades.html#version-specific-migration-scripts (see https://github.com/StackStorm/st2docs/issues/1027) as it'll be a question in community anyway after releasing v3.3.
@armab st2docs PR for the comments above is started here: https://github.com/StackStorm/st2docs/pull/1028/
Just deployed using st2-docker and was able to verify a few of the items already, like no St2-Api-Key in the header getting logged! Will definitely consider socializing this to other developers as their templated dev environment before the custom pack is released for the K8s deployment in prod. Am planning to get the latest images for a standalone K8s environment to see how this goes. Production cluster has been running on 3.3dev for at least 2 months now and so far so good! Appreciate all the efforts!
@punkrokk found the issue https://github.com/StackStorm/st2/issues/5057 and i have implemented PRs to fix this:
https://github.com/StackStorm/st2/pull/5058 (into master) https://github.com/StackStorm/st2/pull/5059 (cherry pick for v3.3)
I tested CentOS8. Looks good.
Found a bug in st2ci when trying to run the e2e upgrade test for el8. We need to pass -y in order to import the GPG key for the StackStorm/staging-stable repo. PR is here: https://github.com/StackStorm/st2ci/pull/191
Found an issue in st2-self-check where it was invoking actions and reporting back "OK" status, but in fact the action failed.
Specifically, the step that failed was reported as: Attempting Test tests.test_timer_rule...OK! (44s)
The action failed because on the libvirt vagrant boxes generic/ubuntu1604 and generic/ubuntu1804 the timezone is not set, causing the st2timersengine service to not start. Easy fix: set the timezone on the vagrant box and restart st2timersengine (this is not a problem on the VirtualBox Ubuntu image).
However, st2-self-check reported success (as seen above), while the WebUI shows that the action in fact failed.
TODO: write up an issue for this, investigate and fix
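The workaround described above (set a timezone, restart st2timersengine) could look roughly like this on the affected libvirt boxes. This is a sketch under assumptions: UTC is just an example timezone, the function name `fix_timersengine_timezone` and the `RUN` variable are hypothetical, and the boxes are assumed to use systemd (`timedatectl` and `systemctl`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the system timezone and restart st2timersengine.
# RUN defaults to "sudo"; set RUN=echo to print the commands instead of running them.
fix_timersengine_timezone() {
  local tz="${1:-UTC}"   # UTC is an assumed default, pick whatever fits the box
  ${RUN:-sudo} timedatectl set-timezone "$tz"
  ${RUN:-sudo} systemctl restart st2timersengine
}

# Dry run: only prints the two commands.
RUN=echo fix_timersengine_timezone UTC
```

Run it without `RUN=echo` inside the vagrant box to apply the change, then re-run st2-self-check and compare against the WebUI.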
We're ready to prepare the StackStorm v3.3 release and are starting pre-release testing.
Release Process Preparation
Per Release Management Schedule @nmaludy is the Release Manager and @blag Assisting for v3.3. They will freeze the master for the major repositories in StackStorm org, follow the StackStorm Release Process which is now available to public, accompanied by the Useful Info for Release managers. Communication is happening in #releasemgmt and #development Slack channels. The first step is pre-release manual user-acceptance testing for v3.3dev.
Why Manual testing?
StackStorm is very serious about testing and has a lot of it: unit tests, integration, deployment/integrity checks, smoke tests and eventually end-to-end tests, where automation spins up a new AWS instance for each OS/flavor we support, installs real st2 like a user would and runs a set of st2tests (for each st2 PR, nightly, periodically, during release).
That's a perfect way to verify what we already know and codify expectations about how StackStorm should function.
However it's not enough. There are always new unknowns to discover, edge cases to experience and tests to add. Hence, manual Exploratory Testing is an exercise where entire team gathers together and starts trying (or breaking) new features before the new release. Because we're all different, perceive software differently and try different things we might find new bugs, improper design, oversights, edge cases and more.
This is how StackStorm previously managed to land fewer major/critical bugs in production.
TL;DR
Install StackStorm v3.3dev unstable packages, try random things in random environments (different OS) and report any regressions found comparing to v3.2.
Extra points for PR hotfixes and adding new or missing test cases.
Major changes
st2-docker revamp based on st2-dockerfiles
Full Changelog
Changes which are recommended to ack, explore, check and try in a random way.
st2
Added
Add support for a configurable connect timeout for SSH connections as requested in #4715 by adding the new configuration parameter ssh_connect_timeout to the ssh_runner group in st2.conf. (new feature) #4914
This option was requested by Harry Lee (@tclh123) and contributed by Marcel Weinberg (@winem).
Added a FAQ for the default user/pass for the tools/launch_dev.sh script and print out the default pass to screen when the script completes. (improvement) #5013
Contributed by @punkrokk
Added deprecation warning if attempt to install or download a pack that only supports Python 2. (new feature) #5037
Contributed by @amanda11
Added deprecation warning to each StackStorm service log, if service is running with Python 2. (new feature) #5043
Contributed by @amanda11
Added deprecation warning to st2ctl, if st2 python version is Python 2. (new feature) #5044
Contributed by @amanda11
Changed
Switch to MongoDB 4.0 as the default version starting with all supported OS's in st2 v3.3.0 (improvement) #4972
Contributed by @punkrokk
Added an enhancement where ST2api.log no longer reports the entire traceback when trying to get a datastore value that does not exist. It now reports a simplified log for cleaner reading. Addresses and Fixes #4979. (improvement) #4981
Contributed by Justin Sostre (@saucetray)
The built-in st2.action.file_writen trigger has been renamed to st2.action.file_written to fix the typo (bug fix) #4992
Renamed reference to the RBAC backend/plugin from enterprise to default. Updated st2api validation to use the new value when checking RBAC configuration. Removed other references to enterprise for RBAC related contents. (improvement)
Remove authentication headers St2-Api-Key, X-Auth-Token and Cookie from webhook payloads to prevent them from being stored in the database. (security bug fix) #4983
Contributed by @potato and @knagy
Fixed
Fixed a bug where type attribute was missing for netstat action in linux pack. Fixes #4946
Reported by @scguoi and contributed by Sheshagiri (@sheshagiri)
Fixed a bug where persisting Orquesta to the MongoDB database returned an error message: key 'myvar.with.period' must not contain '.'. This happened anytime an input, output, publish or context var contained a key with a . within the name (such as with hostnames and IP addresses). This was a regression introduced by trying to improve performance. Fixing this bug means we are sacrificing performance of serialization/deserialization in favor of correctness for persisting workflows and their state to the MongoDB database. (bug fix) #4932
Contributed by Nick Maludy (@nmaludy Encore Technologies)
Fix a bug where passing an empty list to a with items task in a subworkflow causes the parent workflow to be stuck in running status. (bug fix) #4954
Fixed a bug where the example nginx HA template declared headers twice (bug fix) #4966
Contributed by @punkrokk
Fixed a bug in the paramiko_ssh runner where SSH sockets were not getting cleaned up correctly, specifically when specifying a bastion host / jump box. (bug fix) #4973
Contributed by Nick Maludy (@nmaludy Encore Technologies)
Fixed a bytes/string encoding bug in the linux.dig action so it should work on Python 3 (bug fix) #4993
Fixed a bug where a python3 sensor using ssl needs to be monkey patched earlier. See also #4832, #4975 and gevent/gevent#1016 (bug fix) #4976
Contributed by @punkrokk
Fixed bug where action information in RuleDB object was not being parsed properly because mongoengine EmbeddedDocument objects were added to JSON_UNFRIENDLY_TYPES and skipped. Removed this and added if to use to_json method so that mongoengine EmbeddedDocument are parsed properly.
Contributed by Bradley Bishop (@bishopbm1 Encore Technologies)
Fix a regression when updated dnspython pip dependency resulted in st2 services unable to connect to mongodb remote host (bug fix) #4997
Fixed a regression in the linux.dig action on Python 3. (bug fix) #4993
Contributed by @blag
Fixed a bug in pack installation logging code where unicode strings were not being interpolated properly. (bug fix)
Contributed by @misterpah
Fixed a compatibility issue with the latest version of the logging library API where the find_caller() function introduced some new variables. (bug fix) #4923
Contributed by @Dahfizz9897
Removed
Removed Mistral workflow engine (deprecation) #5011
Contributed by Amanda McGuinness (@amanda11 Ammeon Solutions)
Removed CentOS 6/RHEL 6 support #4984
Contributed by Amanda McGuinness (@amanda11 Ammeon Solutions)
codecov-python for CI and have switched back to the upstream version (improvement) #5002
orquesta
Fixed
st2chatops
st2web
Conclusion
Please report findings here and bugs/regressions in respective repositories. Depending on severity and importance bugs might be fixed before the release or postponed to the next release if they're very minor and not a release blocker.
Issues Found During Release
PRs Merged for Release
TODOs