ngageoint / hootenanny

Hootenanny conflates multiple maps into a single seamless map.
GNU General Public License v3.0

Multiple issues with installation documentation #5304

Open marblerun opened 2 years ago

marblerun commented 2 years ago

The current installation instructions are unclear, and the product fails to install at present because of dependency issues.

In attempting to help our company's cartographer to evaluate Hootenanny, we have both tried multiple ways of installing and running the software on CentOS 7-based physical and virtual machines.

These are some of the current issues.

On a physical Intel server, 8 cores and 32 GB of memory, running CentOS 7.9.2009 and following these instructions

https://github.com/ngageoint/hootenanny-rpms/blob/master/docs/install.md

Setting up to use the S3-hosted repos fails with multiple dependency issues when using the el7/release repo

e.g.

--> Finished Dependency Resolution
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: libgeocoding.so.8()(64bit)
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: gdal-python-tools = 3.2.3
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: geos = 3.9.2
    Available: geos-3.4.2-2.el7.x86_64 (epel)
        geos = 3.4.2-2.el7
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: gdal-devel = 3.2.3
    Available: gdal-devel-1.11.4-3.el7.x86_64 (epel)
        gdal-devel = 1.11.4-3.el7
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: libpostal-data
Error: Package: hootenanny-services-ui-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: tomcat8 < 8.6.0
Error: Package: hootenanny-services-ui-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: tomcat8 >= 8.5.0
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: nodejs = 1:14.16.1
    Installed: 2:nodejs-8.17.0-1nodesource.x86_64 (@nodesource)
        nodejs = 2:8.17.0-1nodesource
    Available: 1:nodejs-16.14.1-1.el7.x86_64 (epel)
        nodejs = 1:16.14.1-1.el7

This is despite installing the gdal 3.2.3 packages separately.
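To see the distinct unmet requirements in output like this at a glance, a small filter can help. This is only a sketch: `unmet_requires` is a hypothetical helper name, not part of Hootenanny or yum, and it assumes the yum output still has one `Requires:` entry per line, as yum prints it.

```shell
# Hypothetical helper: list each distinct unmet requirement once.
# Reads yum output on stdin and prints the text after every "Requires:" label.
unmet_requires() {
  sed -n 's/^[[:space:]]*Requires:[[:space:]]*//p' | sort -u
}

# Example use on the target machine:
#   yum install hootenanny-autostart 2>&1 | unmet_requires
```

Collapsing the repeats makes it easier to tell a missing repo (many different requirements) from a single conflicting package such as nodejs.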

The section above this, Local RPMs, gives no indication of how to download and build the RPMs, nor is there an accessible list of previous versions.

Moving on, we attempted to find a workaround by downloading a previously prepared vbox from Vagrant Cloud.

Bear in mind that while my colleague has a reasonable spec for his laptop, mine is a 5-year-old machine with 8 GB of memory, which takes quite a while to build (hours), and I have to close most other apps.

The provided instructions here are also somewhat frustrating, but there are at least 2 issues.

A security change in GitHub access from Sept 2021 has not been reflected in the configuration, causing the build to fail.

4 references to git:// need to be replaced with https://, in the imagery and suggestions areas.

It's also not made very clear that, because the vbox setup uses port 8888, the OAuth configuration needs to be changed to match.

Despite all this, the software built, ran in under 4 GB of memory, we both connected via our OSM accounts, and then it failed in most things with the following error.
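A minimal sketch of that substitution, assuming the affected files have already been located. The files holding the four references are not named here, so the file argument is a placeholder, and `fix_git_urls` is a hypothetical helper name.

```shell
# Hypothetical helper: rewrite git://github.com/ URLs to https://github.com/
# in the files given as arguments. A .bak copy of each file is kept.
fix_git_urls() {
  sed -i.bak 's#git://github\.com/#https://github.com/#g' "$@"
}

# Example use (the path is a placeholder for whichever file holds the reference):
#   grep -rl 'git://github.com/' . | while read -r f; do fix_git_urls "$f"; done
```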

Error: signal 11:
stack trace:
  hoot::SignalCatcher::default_handler(int) +0x2d
  /usr/lib64/libc.so.6 : ()+0x36400
  std::_Sp_counted_ptr_inplace, (__gnu_cxx::_Lock_policy)2>::_M_dispose() +0xa

usually at around 29%.

This issue will be logged in a separate case.

To sum up, the VirtualBox method needs a small number of config changes, and standalone RPM installs fail. If you get past this stage, the software appears to fail.

A stable, well documented Docker image would be appreciated, with a few test cases to check we have a successful build.

Thanks

Mike

bmarchant commented 2 years ago

Mike, it looks like our documentation for the RPMs is missing an additional RPM repo, which I've now added to the document:

yum-config-manager --add-repo https://geoint-deps.s3.amazonaws.com/el7/stable/geoint-deps.repo

Give that a go on your CentOS 7 box and that should hopefully resolve your installation issue.

marblerun commented 2 years ago

Sorry - no - no luck I'm afraid. Some are within a smidgeon - such as nodejs

Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: nodejs = 1:14.16.1
    Installed: 2:nodejs-8.17.0-1nodesource.x86_64 (@nodesource)
        nodejs = 2:8.17.0-1nodesource
    Available: 1:nodejs-16.14.1-1.el7.x86_64 (epel)
        nodejs = 1:16.14.1-1.el7

but if anything, it seems worse.

I'll include the full install below - thanks for the quick response, by the way.

I'm wondering if we've done something in the other repos, so I will double check.

In the meantime, I've removed the gdal packages I previously installed and used the ones from the new repo provided.

If I don't show the two long sections caused by nodejs, this is what we still have going wrong.

--> Finished Dependency Resolution
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: libnode.so.83()(64bit)
Error: Package: hootenanny-services-ui-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: tomcat8 < 8.6.0
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: liboauthcpp.so.0()(64bit)
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: libpostal
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: libgeocoding.so.8()(64bit)
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: nodejs = 1:14.16.1
    Installed: 2:nodejs-8.17.0-1nodesource.x86_64 (@nodesource)
        nodejs = 2:8.17.0-1nodesource
    Available: 1:nodejs-16.14.1-1.el7.x86_64 (epel)
        nodejs = 1:16.14.1-1.el7

Error: Package: hootenanny-services-ui-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: tomcat8 >= 8.5.0
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: nodejs = 1:14.16.1
    Installed: 2:nodejs-8.17.0-1nodesource.x86_64 (@nodesource)
        nodejs = 2:8.17.0-1nodesource
    Available: 1:nodejs-16.14.1-1.el7.x86_64 (epel)
        nodejs = 1:16.14.1-1.el7

Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: libphonenumber = 8.12.27
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: libstxxl.so.1()(64bit)
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: libpostal.so.1()(64bit)
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: glpk = 4.64
    Available: glpk-4.52.1-2.el7.x86_64 (epel)
        glpk = 4.52.1-2.el7
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: libglpk.so.40()(64bit)
Error: Package: hootenanny-core-0.2.70-1.el7.x86_64 (hoot-release)
    Requires: libphonenumber.so.8()(64bit)
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: libpostal-data
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: stxxl = 1.3.1
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: hoot-words
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: liboauthcpp = 0.1.0
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

Full output, pre gdal removal.

[root@vrdev ~]# yum-config-manager --add-repo https://geoint-deps.s3.amazonaws.com/el7/stable/geoint-deps.repo
Loaded plugins: fastestmirror, langpacks
Repository hoot-deps is listed more than once in the configuration
adding repo from: https://geoint-deps.s3.amazonaws.com/el7/stable/geoint-deps.repo
grabbing file https://geoint-deps.s3.amazonaws.com/el7/stable/geoint-deps.repo to /etc/yum.repos.d/geoint-deps.repo
repo saved to /etc/yum.repos.d/geoint-deps.repo
[root@vrdev ~]# yum install hootenanny-autostart
Loaded plugins: changelog, fastestmirror, langpacks
Repository hoot-deps is listed more than once in the configuration
Loading mirror speeds from cached hostfile

marblerun commented 2 years ago

My apologies, I had to re-add the hoot-deps.repo.

Now, other than the strangeness with node, we are down to the following.

--> Finished Dependency Resolution
Error: Package: libphonenumber-8.9.16-1.el7.x86_64 (hoot-deps)
    Requires: libprotobuf.so.8()(64bit)
    Available: protobuf-2.5.0-8.el7.x86_64 (base)
        libprotobuf.so.8()(64bit)
    Installing: protobuf-3.15.8-1.el7.x86_64 (geoint-deps)
        ~libprotobuf.so.26()(64bit)
Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: libphonenumber = 8.12.27
    Installing: libphonenumber-8.9.16-1.el7.x86_64 (hoot-deps)
        libphonenumber = 8.9.16-1.el7
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

I suspect we could live with that, just not sure what's up with nodejs.

marblerun commented 2 years ago

Final one for the night - I've tried to use nvm to install 14.16.1, which seems to have worked, but the subsequent hoot install doesn't seem to see it.

[root@vrdev ~]# nvm ls
-> v14.16.1
system
default -> 14.16.1 (-> v14.16.1)
node -> stable (-> v14.16.1) (default)
stable -> 14.16 (-> v14.16.1) (default)
iojs -> N/A (default)
unstable -> N/A (default)
lts/* -> lts/gallium (-> N/A)
lts/argon -> v4.9.1 (-> N/A)
lts/boron -> v6.17.1 (-> N/A)
lts/carbon -> v8.17.0 (-> N/A)
lts/dubnium -> v10.24.1 (-> N/A)
lts/erbium -> v12.22.11 (-> N/A)
lts/fermium -> v14.19.1 (-> N/A)
lts/gallium -> v16.14.2 (-> N/A)

bmarchant commented 2 years ago

Node was a tricky thing for us. We actually had to compile our own version of it because it is distributed without dynamic libraries. The node executable has all of the libraries statically linked into it, which wouldn't do for us since we need to dynamically link to the node and v8 libraries. That is why using nvm or installing from another source won't work.

If you look at the vagrant install script you'll see the list of steps to set up Hootenanny using the release RPMs.

marblerun commented 2 years ago

Hootenanny summary

This is the situation as I see it; please accept my apologies if I'm wrong. I'm assuming we are using CentOS 7.

Putting Docker aside, there are 2 current methods of installing the software, and two platform types to install on.

We can install via RPM, with a choice of a stable or current build. Information on previous versions is not readily available, but it could be helpful to point at a version with no dependency issues. There is a gap in the documentation at this point where you would like to be able to find the details of the last few releases.

The software can also be compiled from scratch.

Platform type can be a physical server, or a VirtualBox VM running via Vagrant. Hoot boxes are available in Vagrant Cloud.

Of the 4 options available, install via RPM appears to fail on both platforms due to dependencies, primarily NodeJS. The Make route on vbox succeeds after editing the config to correct a connection issue with GitHub when building the UI, if I remember rightly. It is unclear, however, whether the build is totally correct, but the UI and auth sections seem OK.

We intend to try the Make route on our physical server, to see if it's possible.

I'd like to suggest that a clean install via both methods should be carried out, to see if there are any other steps missing from the documentation, and to see if the dependencies can be ironed out to allow an install via RPM.

Many thanks,

Mike

bmarchant commented 2 years ago

Mike:

The RPM installs should work on bare metal and via VM. With vagrant you can get the current release up and running with the following command found here:

NIGHTLY=no vagrant up hoot_centos7_rpm

We just tried it this morning and it came up successfully. I also created an "empty" CentOS7 VM in VirtualBox using a minimal install prebuilt VM. (Further down the page gives you instructions on how to set up the VM and the user/pass to log in.) Once I had set up the VM and started it, I ran the following commands:

sudo yum install -y git yum-utils
cd ~
git clone https://github.com/ngageoint/hootenanny hoot
export HOOT_HOME=~/hoot
$HOOT_HOME/VagrantProvisionCentOS7Rpm.sh

There were no issues installing Hootenanny with either of those two options.

I suspect that your issue lies in the fact that you already have a version of NodeJS installed (nodejs-8.17.0-1nodesource.x86_64 from nodesource, as shown in your output above). Like I said, Hootenanny requires a very specific version of NodeJS that is built with the shared libraries. This isn't found on nodesource (or anywhere else that I found out on the internet) and therefore we had to create our own. Older versions of NodeJS used to provide dynamic libraries but it has been years since they stopped. If you have other programs on the machine that require NodeJS, the version we provide will work, as it provides the node executable with the libraries statically linked but with the dynamic ones also provided. Our export server uses NodeJS as an executable and it works correctly with our version.
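One way to check for that conflict before installing is to look at the package string that rpm reports. The helper below is only a sketch: `is_nodesource_build` is a hypothetical name, and it assumes NodeSource builds can be recognised by "nodesource" appearing in the package string, as in nodejs-8.17.0-1nodesource.x86_64.

```shell
# Hypothetical helper: succeed if a package string looks like a NodeSource build.
is_nodesource_build() {
  case "$1" in
    *nodesource*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use on the target machine (requires rpm):
#   pkg=$(rpm -q nodejs) && is_nodesource_build "$pkg" \
#     && echo "Conflicting NodeJS found: $pkg (remove it before installing Hootenanny)"
```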

marblerun commented 2 years ago

Hi,

I will try what you suggested. However, I've tried doing the RPMs after removing nodejs and clearing the cache, and run into the same issue, whichever version I try to install. (I've cracked the versioning with yum, and have tried to go back over 10 versions.)

The software always wants a version that is close but unavailable, even looking in the hoot-deps repo, where I would expect to find them. That goes for the versions built on 14.16.1 or 8.9.3. hoot-deps currently has 14.16.1-1 and 8.9.3-2, and thus fails.

If you happen to have a way round this, it would be nice to know. At one stage I had this down to just the 2 packages preventing the install. However, even if we work round nodejs, this might also be an issue for later versions.

Error: Package: hootenanny-core-deps-0.2.70-1.el7.noarch (hoot-release)
    Requires: libphonenumber = 8.12.27
    Installing: libphonenumber-8.9.16-1.el7.x86_64 (hoot-deps)
        libphonenumber = 8.9.16-1.el7

What might work best, is if you can point me in the right direction from this point.

I've cloned the software into a hoot subdir on the physical server using the command shown below, from your suggestion above.

git clone https://github.com/ngageoint/hootenanny hoot

Which option do I take to proceed from this point?

Thanks for your assistance so far. I'm going to take a backup of the VirtualBox VM, and have another go with that as well.

Mike

marblerun commented 2 years ago

Quick update

I took option 1 - cleared out my existing setup, git cloned using https + Windows options, and then built a vbox using vagrant up hoot_centos7_rpm. All the build dependencies were clear, but I have one issue.

How do you pass NIGHTLY=no as an env variable when using Windows 10? I couldn't get the vagrant command to work with it as a prefix, and can see the effect from this line

hoot_centos7_rpm: ### Adding the Hoot nightly master repo ###

So I built the box without it. We are going to repeat on the better spec'd laptop, but might redo once we know how, as the build process this way seems much more rapid.

As a by the by, this build brought down the following RPMs from hoot-deps with no problems
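On the NIGHTLY=no question: the `VAR=value command` prefix is a POSIX shell feature, so it is not understood by cmd.exe or PowerShell. The sketch below shows the equivalents in comments and then demonstrates the prefix form with `sh -c` standing in for vagrant. It assumes the Vagrantfile reads NIGHTLY from the environment, which is what the command quoted earlier in this thread implies.

```shell
# POSIX shells (Linux, macOS, Git Bash on Windows): the prefix sets NIGHTLY
# for that single command only.
#   NIGHTLY=no vagrant up hoot_centos7_rpm
#
# Windows cmd.exe: set the variable first, then run vagrant in the same window.
#   set NIGHTLY=no
#   vagrant up hoot_centos7_rpm
#
# Windows PowerShell:
#   $env:NIGHTLY = "no"
#   vagrant up hoot_centos7_rpm

# Demonstration of the prefix form; sh -c stands in for vagrant here.
NIGHTLY=no sh -c 'echo "NIGHTLY=${NIGHTLY:-unset}"'
```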

hoot_centos7_rpm:  nodejs                  x86_64 1:14.16.1-2.el7               hoot-deps    87 k
hoot_centos7_rpm:  nodejs-docs             noarch 1:14.16.1-2.el7               hoot-deps   7.9 M
hoot_centos7_rpm:  nodejs-libs             x86_64 1:14.16.1-2.el7               hoot-deps   331 M
hoot_centos7_rpm:  npm                     x86_64 1:6.14.12-1.14.16.1.2.el7     hoot-deps   3.9 M

The dependency list for nodejs is very small, I can provide it if it helps.

Mike

marblerun commented 2 years ago

More good news, I've played around with packages, repos and caches on our physical dev box, and finally cracked an install of the current master build.

The only issue appears to be with authorisation, which I fixed on VirtualBox by altering the setup to point to port 8888. Here, port 8080 is in use, but the oauth request is failing to return - any suggestions?

brianhatchl commented 2 years ago

I think using the nightly master RPMs is fine for your test environment. We use them in our CI pipeline.

For the oauthRedirectURL the default 8080 URL is for our dev environment running the UI via webpack. Because you have the UI deployed via tomcat the URL should be oauthRedirectURL=http://localhost:8080/hootenanny-id/login.html. If hoot is on a remote server, replace localhost with whatever IP or hostname you are using to reach it.
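As an illustration of how that value changes with host and port, here is a small sketch. `oauth_redirect_url` is a hypothetical helper, not part of Hootenanny, and it only prints the setting; which config file to put it in is not stated here.

```shell
# Hypothetical helper: print the oauthRedirectURL line for a tomcat-deployed UI.
# HOST is whatever the browser uses to reach the Hoot UI; PORT defaults to 8080.
oauth_redirect_url() {
  [ -n "${1:-}" ] || { echo "usage: oauth_redirect_url HOST [PORT]" >&2; return 1; }
  printf 'oauthRedirectURL=http://%s:%s/hootenanny-id/login.html\n' "$1" "${2:-8080}"
}

# Examples:
#   oauth_redirect_url localhost        # local install on the default port
#   oauth_redirect_url localhost 8888   # vbox setup forwarding port 8888
```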

marblerun commented 2 years ago

Many thanks - it's not our day I don't think. My colleague, who has the better machine, may have run into issues with Vbox and Windows 11. It's just not behaving the same way as my Windows 10 box, all Virtualbox issues but preventing progress.

I suspect my attempts to get this running on a real CentOS 7 server are also doomed, at least remotely, due to having to access it from home through a VPN. That's also our problem; I will drop an update if we can solve either issue. Replacing with an internal IP doesn't seem to work, not that I'm surprised.

Thanks

brianhatchl commented 2 years ago

Is your Hoot vm on your local host or is this a remote vm or bare metal server?

The web browser is using the oauthRedirectURL so the IP would need to be routable by it. Probably the same IP used to get to the Hoot UI landing page.

marblerun commented 2 years ago

Morning Brian,

The good news is that in the office, once I make the change from localhost to IP address, we get a working interface. That's probably not going to fly for our remote workers, but that's not your problem, and I can deal with that in either AWS or a hosted server.

However, any attempt to run hoot just fails completely. I was hoping to run some of the inbuilt tests, but they, and even hoot help, all fail as follows.

[root@vrdev hoot]# source ./SetupEnv.sh
[root@vrdev hoot]# HootTest --case-only
/usr/bin/HootEnv.sh: line 36: 4915 Segmentation fault (core dumped) "$@"

[root@vrdev hoot]# hoot help
/usr/bin/HootEnv.sh: line 36: 5407 Segmentation fault (core dumped) "$@"

Is there something truly obvious that I'm missing here?

Thanks

Mike

marblerun commented 2 years ago

On server restart, we get 6 of these in 4 seconds. I'd have to do some tweaks to get the crash dumps, if that would help.

Apr 1 12:08:43 vrdev kernel: hoot.bin[8802]: segfault at 7f8d255fbd00 ip 00007f8d1aa6ec10 sp 00007ffecc2c93a0 error 6 in libSFCGAL.so.1.3.1[7f8d1a683000+894000]

marblerun commented 2 years ago

I have crash dump info - what would you need to help? Is this enough info, or do you need everything?

reason: hoot.bin killed by SIGSEGV
cmdline: hoot.bin info --tag-mergers --json
executable: /usr/bin/hoot.bin
package: hootenanny-core-0.2.71-0.7.20220330.b194be0.el7
component: hootenanny
pid: 13072
pwd: /var/lib/hootenanny/userfiles/tmp
hostname: vrdev.outdooractiveuk.local
count: 1
abrt_version: 2.1.11
analyzer: CCpp
architecture: x86_64
event_log:
global_pid: 13072
kernel: 3.10.0-1160.59.1.el7.x86_64
last_occurrence: 1648812643
os_release: CentOS Linux release 7.9.2009 (Core)
pkg_arch: x86_64
pkg_epoch: 0
pkg_name: hootenanny-core
pkg_release: 0.7.20220330.b194be0.el7
pkg_vendor: (none)
pkg_version: 0.2.71
runlevel: N 5
time: Fri 01 Apr 2022 12:30:43 BST
type: CCpp
uid: 91
username: tomcat
uuid: f244c225c94a6057600642a281c6f14f6bbf8767

core_backtrace:
:{ "signal": 11
:, "executable": "/usr/bin/hoot.bin"
:, "stacktrace":
: [ { "crash_thread": true
: , "frames":
: [ { "address": 139652714277904
: , "build_id": "e2b9ba593465bbb41df4920ade6250765dd251b1"
: , "build_id_offset": 4111376
: , "function_name": "boost::math::lanczos::lanczos_initializer<boost::math::lanczos::lanczos17m64, long double>::init::init()"
: }
: , { "address": 139653047941571
: , "build_id_offset": 139653047941571
: }
: , { "address": 140729546247129
: , "build_id_offset": 140729546247129
: }
: , { "address": 3255258584118618368
: , "build_id_offset": 3255258584118618368
: } ]
: } ]
:}
cgroup:
:11:hugetlb:/
:10:cpuset:/
:9:perf_event:/
:8:cpuacct,cpu:/
:7:pids:/system.slice/tomcat8.service
:6:memory:/
:5:blkio:/
:4:freezer:/
:3:net_prio,net_cls:/
:2:devices:/system.slice/tomcat8.service
:1:name=systemd:/system.slice/tomcat8.service

brianhatchl commented 2 years ago

I realize now there might be a source of confusion in the hoot_centos7_rpm vm. I don't recall that the hoot source tree actually needs to be mounted in the shared directory /home/vagrant/hoot, but the rpm install location for hoot files is /var/lib/hootenanny.

So, the documented commands like source SetupEnv.sh should not be run, as $HOOT_HOME should already be defined by the rpm as its install location.

I was able to get the HootTest to run, but not without some fixes (which leads me to believe we don't run them from an rpm install often).

If you exit and re-ssh to the server you should confirm that $HOOT_HOME is already defined as /var/lib/hootenanny. If you cd to that directory, I found that due to hoot files being owned by root I needed to sudo HootTest --quick.
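That check can be scripted. The function below is a sketch: `check_hoot_home` is a hypothetical name, and the expected path is the RPM install location mentioned above.

```shell
# Hypothetical helper: confirm HOOT_HOME points at the RPM install location.
check_hoot_home() {
  expected=/var/lib/hootenanny
  if [ "${HOOT_HOME:-}" = "$expected" ]; then
    echo "HOOT_HOME ok: $HOOT_HOME"
  else
    echo "HOOT_HOME is '${HOOT_HOME:-}' (expected $expected)" >&2
    return 1
  fi
}

# Example use after a fresh login:
#   check_hoot_home && cd "$HOOT_HOME" && sudo HootTest --quick
```

A mismatch usually means SetupEnv.sh from a source checkout was sourced in the current session, so logging out and back in is the simplest reset.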

I saw an error about a missing josm jar that hoot uses for validation checks. We will need to fix this so that dependency is installed by the rpm. Until then, a manual fix that will leverage the full hoot source tree in the shared dir /home/vagrant/hoot is to run these commands:

sudo mkdir -p /var/lib/hootenanny/hoot-josm/target
sudo cp /home/vagrant/hoot/hoot-josm/target/hoot-josm.jar /var/lib/hootenanny/hoot-josm/target
sudo cp -R /home/vagrant/hoot/hoot-josm/target/dependency-jars/ /var/lib/hootenanny/hoot-josm/target

Then you should be able to run sudo HootTest --quick and sudo HootTest --case-only without error.

marblerun commented 2 years ago

Hi Brian,

I think we might be getting our lines crossed - so here goes.

To remove any issues with previous software installs, I've built a new CentOS 7.9 server in AWS.

Then I've followed the instructions given here https://github.com/ngageoint/hootenanny-rpms/blob/master/docs/install.md on how to do a release from the S3 rpms.

I may have confused both you and myself by doing a git clone beforehand.

The only additions necessary were the following.

yum-config-manager --add-repo https://geoint-deps.s3.amazonaws.com/el7/stable/geoint-deps.repo

and

yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm

The software then installed using

yum install -y hootenanny-autostart

giving me the following

[root@ip-172-31-22-59 conf]# rpm -qa |grep hoot
hootenanny-autostart-0.2.70-1.el7.noarch
hoot-words-1.0.1-1.el7.noarch
hootenanny-core-deps-0.2.70-1.el7.noarch
hootenanny-services-ui-0.2.70-1.el7.x86_64
hootenanny-core-0.2.70-1.el7.x86_64

I have no target directory for hoot-josm in the git cloned area, so the above fix doesn't fly for me.

I have the following services running

postgres 9302 1 0 15:19 ? 00:00:00 /usr/pgsql-13/bin/postmaster -D /var/lib/pgsql/13/data/
postgres 9303 9302 0 15:19 ? 00:00:00 postgres: logger
postgres 9305 9302 0 15:19 ? 00:00:00 postgres: checkpointer
postgres 9306 9302 0 15:19 ? 00:00:00 postgres: background writer
postgres 9307 9302 0 15:19 ? 00:00:00 postgres: walwriter
postgres 9308 9302 0 15:19 ? 00:00:00 postgres: stats collector
postgres 9309 9302 0 15:19 ? 00:00:00 postgres: logical replication launcher
root 9323 2 0 15:19 ? 00:00:00 [kworker/2:1]
tomcat 9328 1 0 15:19 ? 00:00:00 npm
tomcat 9348 1 2 15:19 ? 00:00:38 /usr/lib/jvm/jre-1.8.0/bin/java -Djava.awt.headless=true -Djava.security.egd=file:/dev/./urandom -Djdk.tls.ephemeralDHKeySize=2048 -Xms512M -Xmx20
tomcat 9416 9328 0 15:19 ? 00:00:00 node server.js
postgres 9686 9302 0 15:19 ? 00:00:00 postgres: hoot hoot 127.0.0.1(33236) idle

I'm not sure that we didn't have a security group issue with Postgres, so I've fixed that and am going to reboot and see where we get to. No luck - HOOT_HOME is correctly defined, but HootTest --quick fails as before.

Fun, isn't it. I might give up for the weekend after that. My colleague is trying out slightly older versions of vbox on W11, to see if they behave a bit better, will let you know how we get on. And no, they don't work either, apparently - just as VMs that is.

Mike

brianhatchl commented 2 years ago

Ok, so no vagrant in this setup.

We tried the steps you documented and indeed got a seg fault from the HootTest command. I'm not exactly sure what the culprit is, but our RPM install doc needed more updates to reflect our pgdg13 repo definition and also the addition of hoot-deps.

Give these install steps a try on a fresh image and we'll see what difference it makes.

marblerun commented 2 years ago

So, the bad news is - no apparent difference at all. Both Master and Release method crash while the server restarts. Installation and removal is now quite smooth, with the exception of requiring a couple of setsebool ops.

setsebool -P tomcat_can_network_connect_db on
setsebool -P httpd_can_network_connect on

To allow DB connections etc. UI works ok, but not other parts. I'd noticed that the translators don't come up, amongst other things.

First sign of a problem comes here - from the messages file

Apr 5 07:31:53 ip-172-31-22-59 server: 05-Apr-2022 07:31:53.068 INFO [main] org.apache.catalina.startup.VersionLoggerListener.log Command line argument: -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
Apr 5 07:31:53 ip-172-31-22-59 server: 05-Apr-2022 07:31:53.069 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent The Apache Tomcat Native library which allows using OpenSSL was not found on the java.library.path: [/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64/jre/lib/amd64/server:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64/jre/lib/amd64:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64/jre/../lib/amd64::/usr/local/lib:/usr/lib/jvm/jre-1.8.0/lib/amd64/server:/var/lib/hootenanny/lib:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib]

First error here

Apr 5 07:32:02 ip-172-31-22-59 server: 2022-04-05 07:32:02,762 INFO ElementMergeServiceResource:122 - Element Merge Service started
Apr 5 07:32:03 ip-172-31-22-59 server: 2022-04-05 07:32:03,170 INFO AdvancedConflationOptionsResource:178 - initializing hoot2 template
Apr 5 07:32:03 ip-172-31-22-59 server: 2022-04-05 07:32:03,193 INFO ExternalCommandRunnerImpl:190 - Command hoot.bin info --tag-mergers --json started at: [2022-04-05T07:32:03.192]
Apr 5 07:32:04 ip-172-31-22-59 kernel: hoot.bin[1543]: segfault at 7f0cec36cd00 ip 00007f0ce17dfc10 sp 00007ffc6ecfa4c0 error 6 in libSFCGAL.so.1.3.1[7f0ce13f4000+894000]
Apr 5 07:32:04 ip-172-31-22-59 server: 2022-04-05 07:32:04,047 ERROR ExternalCommandRunnerImpl:246 - FAILURE of: CommandResult{command=[hoot.bin info --tag-mergers --json], jobId=[null], command_id=[null], caller=[hoot.services.controllers.conflation.AdvancedConflationOptionsResource], workingDir=[/var/lib/hootenanny/userfiles/tmp], start=[2022-04-05T07:32:03.192], finish=[2022-04-05T07:32:04.046], duration=[PT-0.854S], exitCode=[-1], stdout=[], stderr=[]}
Apr 5 07:32:04 ip-172-31-22-59 server: org.apache.commons.exec.ExecuteException: Process exited with an error: 139 (Exit value: 139)

and then

Apr 5 07:32:04 ip-172-31-22-59 server: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Apr 5 07:32:04 ip-172-31-22-59 server: at java.lang.Thread.run(Thread.java:750)
Apr 5 07:32:04 ip-172-31-22-59 server: 2022-04-05 07:32:04,061 INFO ExternalCommandRunnerImpl:190 - Command hoot.bin info --way-snap-criteria started at: [2022-04-05T07:32:04.061]
Apr 5 07:32:04 ip-172-31-22-59 kernel: hoot.bin[1547]: segfault at 7f9ceabf2d00 ip 00007f9ce0065c10 sp 00007ffdff8383b0 error 6 in libSFCGAL.so.1.3.1[7f9cdfc7a000+894000]
Apr 5 07:32:04 ip-172-31-22-59 server: 2022-04-05 07:32:04,280 ERROR ExternalCommandRunnerImpl:246 - FAILURE of: CommandResult{command=[hoot.bin info --way-snap-criteria], jobId=[null], command_id=[null], caller=[hoot.services.controllers.conflation.AdvancedConflationOptionsResource], workingDir=[/var/lib/hootenanny/userfiles/tmp], start=[2022-04-05T07:32:04.061], finish=[2022-04-05T07:32:04.280], duration=[PT-0.219S], exitCode=[-1], stdout=[], stderr=[]}
Apr 5 07:32:04 ip-172-31-22-59 server: org.apache.commons.exec.ExecuteException: Process exited with an error: 139 (Exit value: 139)

In total, there are 2 --tag-mergers and 4 --way-snap-criteria errors on startup, and it all ends with a message saying Server Startup in 13636 ms.
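Two small filters can summarise logs like these. This is a sketch with hypothetical helper names: `segfault_libs` assumes the kernel line format shown above ("segfault at ... in NAME[addr+size]"), and `exit_signal` relies on the shell convention that an exit status above 128 means the process was killed by signal (status - 128), so 139 corresponds to signal 11, SIGSEGV.

```shell
# Hypothetical helper: list the distinct objects named in kernel segfault lines.
# Reads a log (for example /var/log/messages) on stdin.
segfault_libs() {
  sed -n 's/.* segfault at .* in \([^[]*\)\[.*/\1/p' | sort -u
}

# Hypothetical helper: convert an exit status above 128 to its signal number.
exit_signal() {
  if [ "$1" -gt 128 ]; then
    echo $(( $1 - 128 ))
  else
    echo "exit status $1 does not indicate a signal" >&2
    return 1
  fi
}

# Example use:
#   segfault_libs < /var/log/messages
#   exit_signal 139
```

In the logs above, every kernel segfault line names the same library, libSFCGAL.so.1.3.1, which narrows where to look.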

Any attempt at a hoot test fails with a seg fault, as before.

As an FYI, from a clean C7 AWS install plus update, 247 packages are installed by yum. As my last attempt was with the Master branch, these are the installed hootenanny packages.

Apr 04 20:42:46 Installed: hootenanny-core-deps-0.2.71-0.8.20220404.46a04b2.el7.noarch
Apr 04 20:43:33 Installed: hootenanny-core-0.2.71-0.8.20220404.46a04b2.el7.x86_64
Apr 04 20:44:03 Installed: hootenanny-services-ui-0.2.71-0.8.20220404.46a04b2.el7.x86_64
Apr 04 20:44:03 Installed: hootenanny-autostart-0.2.71-0.8.20220404.46a04b2.el7.noarch

Suggestions appreciated.

Mike

brianhatchl commented 2 years ago

I'm not sure what the problem could be. Here is an ami with the rpms installed (and the missing josm jar included) ami-0fbdd0d27cdeba73d and it should be available in us-east-1.

In running HootTest --quick several times on different size instance types we did encounter one seg fault:

Error: signal 11:
stack trace:
  hoot::SignalCatcher::default_handler(int) +0x2d
  /usr/lib/jvm/jre-1.8.0/lib/amd64/server/libjvm.so : ()+0x9446a4
  /usr/lib/jvm/jre-1.8.0/lib/amd64/server/libjvm.so : JVM_handle_linux_signal()+0x1b1
  /usr/lib/jvm/jre-1.8.0/lib/amd64/server/libjvm.so : ()+0x93c388
  /usr/lib64/libpthread.so.0 : ()+0xf630
  hoot::BuildingPartPreMergeCollector::run() +0x11c
  /usr/lib64/libQt5Core.so.5 : ()+0xa5392
  /usr/lib64/libQt5Core.so.5 : ()+0xa7e71
  /usr/lib64/libpthread.so.0 : ()+0x7ea5
  /usr/lib64/libc.so.6 : clone()+0x6d
/usr/bin/HootEnv.sh: line 36:  7588 Segmentation fault      "$@"

and some warnings about tests running longer than expected

Test N4hoot22ResolveReviewsOpMsTestE::runResolveMsTest ran longer than expected -- 4.72594

but not consistently.

Maybe give that ami a try.

marblerun commented 2 years ago

Gents,

Just a quick update that the provided ami worked really well, and the subsequent demo was well received. Have a fun weekend, and then perhaps we could talk outside the issue forum, but for now, many thanks for your kind assistance.

Cheers,

Mike