sakaicontrib / docker-sakai-builder

Some Docker commands to get Sakai built and running 100% in Docker

About 1/2 way through tomcat start up the mysql container croaks and then tomcat goes off the rails #5

Closed · csev closed this 3 years ago

csev commented 3 years ago

This is the last gasp of the mysql server:

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort.

Query (7f7a6c007d90): SELECT TABLE_SCHEMA TABLE_CAT, NULL TABLE_SCHEM, TABLE_NAME, NON_UNIQUE, TABLE_SCHEMA INDEX_QUALIFIER, INDEX_NAME, 3 TYPE, SEQ_IN_INDEX ORDINAL_POSITION, COLUMN_NAME, COLLATION ASC_OR_DESC, CARDINALITY, NULL PAGES, NULL FILTER_CONDITION FROM INFORMATION_SCHEMA.STATISTICS WHERE TABLE_SCHEMA='sakai' AND TABLE_NAME = 'CM_MEMBER_CONTAINER_T' ORDER BY NON_UNIQUE, TYPE, INDEX_NAME, ORDINAL_POSITION

Connection ID (thread ID): 3

Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

This is the moment where Tomcat starts to wander off the rails; I assume the MySQL server just went belly up:

30-Jul-2021 15:25:12.824 INFO [main] org.sakaiproject.springframework.orm.hibernate.impl.AdditionalHibernateMappingsImpl.processAdditionalUnit Add mapping resource [org/sakaiproject/sitestats/impl/hbm/ServerStat.hbm.xml]
30-Jul-2021 15:25:12.826 INFO [main] org.sakaiproject.springframework.orm.hibernate.impl.AdditionalHibernateMappingsImpl.processAdditionalUnit Add mapping resource [org/sakaiproject/sitestats/impl/hbm/UserStat.hbm.xml]
30-Jul-2021 15:25:12.989 INFO [main] org.sakaiproject.ignite.IgniteEntityManagerFactoryBean.setIgnite Ignite instance name [localhost] configured as hibernate cache provider
30-Jul-2021 15:25:25.796 WARN [main] com.zaxxer.hikari.pool.ProxyConnection.checkException sakai - Connection com.p6spy.engine.wrapper.ConnectionWrapper@51285b0 marked as broken because of SQLSTATE(08000), ErrorCode(0)
java.sql.SQLNonTransientConnectionException: (conn=3) unexpected end of stream, read 0 bytes from 4 (socket was closed by server)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:73)
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:155)
    at org.mariadb.jdbc.MariaDbStatement.executeExceptionEpilogue(MariaDbStatement.java:274)

If you restart the MySQL server and look at it, there are lots of tables already built before the crash:

MacBook-Pro-101:docker-sakai-builder csev$ mysql -h 127.0.0.1 -P 53306 -u root -p
Enter password: 

mysql> use sakai;
mysql> show tables;
+--------------------------------+
| Tables_in_sakai                |
+--------------------------------+
| ANNOUNCEMENT_CHANNEL           |
| ANNOUNCEMENT_MESSAGE           |
| ASN_AP_ITEM_ACCESS_T           |
| ASN_AP_ITEM_T                  |

I don't think the mysql server is running out of disk...

$ df
Filesystem     1K-blocks      Used Available Use% Mounted on
overlay         61255492  12837084  45277084  23% /
tmpfs              65536         0     65536   0% /dev
tmpfs            2015396         0   2015396   0% /sys/fs/cgroup
shm                65536         0     65536   0% /dev/shm
grpcfuse       976900220 884947804  89111412  91% /var/lib/mysql
/dev/vda1       61255492  12837084  45277084  23% /etc/hosts
tmpfs            2015396         0   2015396   0% /proc/acpi
tmpfs            2015396         0   2015396   0% /sys/firmware
jonespm commented 3 years ago

I've never actually seen this error.

How much memory are you giving to the Docker engine? I have one machine configured for 8GB but I think the other for 4GB.

If you run docker stats does it still show both containers running?

I got a weird startup error once and the only thing I changed was the mysql version.

Also try to clean it out and start again with ./sakai-dock.sh clean_mysql, and you might also pin the MySQL version in sakai-dock.sh to something more specific instead of just 5.7. I don't think I've run this in at least a few weeks, so it's possible something new broke. I also realized I'm not using the --pull always parameter here, so it's been using an older version of MySQL that I have.

-d mysql:5.7.33 || docker start sakai-mysql
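The pinning suggestion above could look something like this. This is only a sketch: the container name, port mapping, root password, and data path here mirror details mentioned elsewhere in this thread, but the real flags live in sakai-dock.sh and may differ.

```shell
# Pin the image to an exact tag and always pull that tag, falling back to
# restarting an existing container. Password value is a placeholder.
docker run --pull always -d \
  --name sakai-mysql \
  -p 53306:3306 \
  -e MYSQL_ROOT_PASSWORD=changeme \
  -v "$PWD/work/mysql/data:/var/lib/mysql" \
  mysql:5.7.33 \
  || docker start sakai-mysql
```

Pinning to 5.7.33 instead of the floating 5.7 tag makes it easier to rule out an image update as the cause of a new failure.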

jonespm commented 3 years ago

I really should also try switching this over to https://hub.docker.com/_/mariadb instead.

csev commented 3 years ago

I am giving the engine 4GB as you suggested in your documentation. I will try a clean and then run through the starts.

csev commented 3 years ago

I've attached a mysql log and a tomcat log.

mysql.txt tomcat.txt

Some anomalies are:

MySQL seemed to start a temporary server, shut it down, and then restart:

2021-07-30 23:10:48+00:00 [Note] [Entrypoint]: Temporary server started.
Warning: Unable to load '/usr/share/zoneinfo/iso3166.tab' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/zone.tab' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/zone1970.tab' as time zone. Skipping it.

2021-07-30 23:10:52+00:00 [Note] [Entrypoint]: /usr/local/bin/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/sakai.sql

2021-07-30 23:10:52+00:00 [Note] [Entrypoint]: Stopping temporary server
2021-07-30T23:10:52.442066Z 0 [Note] Giving 0 client threads a chance to die gracefully
2021-07-30T23:10:52.442175Z 0 [Note] Shutting down slave threads
2021-07-30T23:10:52.442203Z 0 [Note] Forcefully disconnecting 0 remaining clients
2021-07-30T23:10:52.442230Z 0 [Note] Event Scheduler: Purging the queue. 0 events
2021-07-30T23:10:52.442376Z 0 [Note] Binlog end
2021-07-30T23:10:52.445741Z 0 [Note] Shutting down plugin 'ngram'
2021-07-30T23:10:52.445754Z 0 [Note] Shutting down plugin 'partition'
2021-07-30T23:10:52.445761Z 0 [Note] Shutting down plugin 'BLACKHOLE'
2021-07-30T23:10:52.445767Z 0 [Note] Shutting down plugin 'ARCHIVE'
2021-07-30T23:10:52.445775Z 0 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2021-07-30T23:10:52.445820Z 0 [Note] Shutting down plugin 'MRG_MYISAM'
2021-07-30T23:10:52.445924Z 0 [Note] Shutting down plugin 'MyISAM'
2021-07-30T23:10:52.445955Z 0 [Note] Shutting down plugin 'INNODB_SYS_VIRTUAL'
2021-07-30T23:10:52.445965Z 0 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'

Then MySQL complained about something wrong with InnoDB indexes:

2021-07-30T23:19:15.453834Z 2 [ERROR] InnoDB: Table sakai/asn_submission_submitter contains 3 indexes inside InnoDB, which is different from the number of indexes 2 defined in MySQL
2021-07-30T23:19:15.568269Z 2 [ERROR] InnoDB: Table sakai/asn_sup_attach_t contains 3 indexes inside InnoDB, which is different from the number of indexes 2 defined in MySQL
2021-07-30T23:19:15.680746Z 2 [ERROR] InnoDB: Table sakai/bullhorn_alerts contains 3 indexes inside InnoDB, which is different from the number of indexes 2 defined in MySQL
2021-07-30T23:19:15.895562Z 2 [ERROR] InnoDB: Table sakai/cm_enrollment_set_t contains 3 indexes inside InnoDB, which is different from the number of indexes 2 defined in MySQL
2021-07-30T23:19:16.014909Z 2 [ERROR] InnoDB: Table sakai/cm_enrollment_t contains 3 indexes inside InnoDB, which is different from the number of indexes 2 defined in MySQL
2021-07-30T23:19:16.072825Z 2 [ERROR] InnoDB: Table sakai/cm_enrollment_t contains 4 indexes inside InnoDB, which is different from the number of indexes 2 defined in MySQL
2021-07-30T23:19:16.200105Z 2 [ERROR] InnoDB: Table sakai/cm_member_container_t contains 3 indexes inside InnoDB, which is different from the number of indexes 2 defined in MySQL 20
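One way to see this mismatch directly is to compare the index count in MySQL's data dictionary with what InnoDB's own system tables report for one affected table. This is a diagnostic sketch against the MySQL 5.7 container using the port from the session above; table-name case may vary depending on the server's lower_case_table_names setting.

```shell
# Count indexes as MySQL's dictionary sees them vs. as InnoDB's internal
# dictionary sees them. On a healthy server the two numbers agree.
mysql -h 127.0.0.1 -P 53306 -u root -p -e "
  SELECT COUNT(DISTINCT INDEX_NAME) AS mysql_indexes
    FROM INFORMATION_SCHEMA.STATISTICS
   WHERE TABLE_SCHEMA = 'sakai' AND TABLE_NAME = 'cm_member_container_t';
  SELECT COUNT(*) AS innodb_indexes
    FROM INFORMATION_SCHEMA.INNODB_SYS_INDEXES i
    JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES t ON i.TABLE_ID = t.TABLE_ID
   WHERE t.NAME = 'sakai/cm_member_container_t';"
```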

Then I got a lot of these in MySQL, one per query, at the same time Sakai was barfing (see below):

2021-07-30T23:27:58.637512Z 2 [ERROR] InnoDB could not find key no 1 with name SITE_ID from dict cache for table sakai/sakai_site_page_property
2021-07-30T23:27:58.637527Z 2 [Warning] InnoDB: change_active_index(1) failed
2021-07-30T23:27:58.637540Z 2 [ERROR] Got error 1 when reading table './sakai/sakai_site_page_property'
2021-07-30T23:27:58.662600Z 2 [ERROR] InnoDB could not find key no 1 with name SITE_ID from dict cache for table sakai/sakai_site_group
2021-07-30T23:27:58.676080Z 2 [ERROR] InnoDB could not find key no 1 with name SITE_ID from dict cache for table sakai/sakai_site_tool_property
2021-07-30T23:27:58.676155Z 2 [ERROR] InnoDB could not find key no 1 with name SITE_ID from dict cache for table sakai/sakai_site_tool_property 202

Sakai SQL errors:

30-Jul-2021 16:26:19.308 INFO [main] org.sakaiproject.memory.impl.EhcacheMemoryService.makeEhcache Created ehcache (org.sakaiproject.db.BaseDbFlatStorage.SAKAI_SITE_PAGE_PROPERTY) using defaults
30-Jul-2021 16:26:19.536 WARN [main] org.sakaiproject.db.impl.BasicSqlService.dbRead Sql.dbRead: sql: select PAGE_ID, NAME, VALUE from SAKAI_SITE_PAGE_PROPERTY where ( SITE_ID = ? ) DAC-MUSIC-DEPT1-SUBJ1-101
java.sql.SQLTransientConnectionException: (conn=2) Got error 1 from storage engine
    at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:79)

csev commented 3 years ago

It is not clear in my most recent test that anything crashed; memory and CPU were pretty reasonable:

CONTAINER ID   NAME           CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O       PIDS
bd8f3d979d67   sakai-mysql    0.14%     192.4MiB / 3.844GiB   4.89%     15.5kB / 11.9kB   15MB / 8.19kB   37
f567f6280aab   sakai-tomcat   238.97%   875.5MiB / 3.844GiB   22.24%    12.8kB / 14.6kB   913kB / 0B      87

But it just ran for a long time with both Tomcat and MySQL spewing error after error on pretty much every SQL query; Sakai did not come up successfully.

jonespm commented 3 years ago

I was searching those errors and the only thing I can see is that they might have been caused by some other, earlier errors. The script stores the database files locally in work/mysql/data.

I'd try ./sakai-dock.sh kill and then ./sakai-dock.sh clean_mysql, which should remove that directory. Afterward, start MySQL and Tomcat again.
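For reference, the reset sequence looks like this. The kill and clean_mysql subcommands are the ones named above; the start steps afterward are whatever sakai-dock.sh normally uses, so they are shown only as comments.

```shell
# Stop the containers, then wipe the MySQL data directory so the schema
# DDL runs again from scratch on the next startup.
./sakai-dock.sh kill
./sakai-dock.sh clean_mysql   # removes work/mysql/data
# ...then start the MySQL container and Tomcat again with the script's
# usual start commands.
```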

It might also be a permission thing, but I thought I had that worked out. I'll try to rebuild through it myself when I make these updates. It looks like everything should be okay though if there weren't any of those errors.

csev commented 3 years ago

Strangely, after 31 minutes and thousands of errors from both Tomcat/Sakai and MySQL, Sakai actually came up on port 8080. I did not test it too much :)

jonespm commented 3 years ago

It should come up a lot faster than that. My primary build machine is Windows (i7-8750h) with WSL2 and it does a full build in about 6 minutes and starts up the first time (where it has to DDL) in about 5 minutes then every other time about 2-3.

I just tried on a 2017 MBP which is just slightly slower as far as CPU and it took over 30 minutes!

I remember reading about how slow Docker was on OSX which is really visible in Sakai. Docker for OSX has felt about as slow as Docker was on WSL1 (super slow).

I remember locally we did set those mount options to get at least 2x better performance, but going beyond that on OSX probably requires some of these other solutions, like NFS or that Mutagen software. It could also probably just build it all in the container, maybe caching the m2 artifacts, and copy between them. But this might make it a little trickier to do partial builds.

I'll at least modify the script to use :delegated and maybe :cached and give it another try but even 10-15 minutes still seems too slow.
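For anyone following along, the mount-option change amounts to adding a consistency suffix on the bind mounts. This is a sketch only; the actual source paths, container name, and image tag in sakai-dock.sh may differ.

```shell
# macOS bind-mount consistency options:
#   :cached    - the host's view is authoritative; container reads may lag.
#                Good for source trees the container mostly reads.
#   :delegated - the container's view is authoritative; host reads may lag.
#                Good for output the container writes heavily.
docker run -d --name sakai-tomcat \
  -v "$PWD/deploy:/usr/src/app/deploy:delegated" \
  -v "$HOME/.m2:/root/.m2:cached" \
  -p 8080:8080 \
  tomcat:9-jdk11
```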

jonespm commented 3 years ago

It's also going to be a lot faster after the first run, once the m2 and npm artifacts are downloaded and cached. That said, I'm seeing a significant performance boost from setting :delegated and :cached. I forgot these were important for OSX.

csev commented 3 years ago

Cool, thanks for taking the time. I want to use this to onboard new devs instead of my stuff. So if you and I are both customers and both happy, that bodes well.

jonespm commented 3 years ago

I got some errors at the very end of the build when building webcomponents, so I haven't gotten it started yet. I'll have some time tomorrow to look at this more. I feel like if it's a little slower on OSX than on Windows, that's okay. It should only be really slow for the first build though.

It looks like some of these file mount options are no longer available when using gRPC FUSE. I'm going to try to turn it off and also try Mutagen.

It still runs quite slow for me. :/

jonespm commented 3 years ago

I've got it started up by updating the MySQL JDBC driver to 8.0.26, but it is still shutting down at this point for me:

31-Jul-2021 15:03:17.122 INFO [main] org.sakaiproject.coursemanagement.impl.SampleDataLoader.init start
31-Jul-2021 15:03:17.232 WARN [main] org.sakaiproject.util.SakaiApplicationContext.refresh Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'org.sakaiproject.coursemanagement.impl.CourseManagementSampleDataLoader' defined in file [/usr/src/app/deploy/components/coursemanagement-hibernate-pack/WEB-INF/components-demo.xml]: Invocation of init method failed; nested exception is java.lang.NullPointerException

I also made a commit to the script to use :cached and :delegated, but these only work when "Use gRPC FUSE" is unchecked. Doing this gets it to both build and start up about twice as fast, so I'd probably recommend that.

jonespm commented 3 years ago

Okay, I'm getting all of these same errors as you regardless of the jdbc version. :/

I'm thinking of just trying to switch over to MariaDB to see if that works. I have no idea what these mean:

2021-07-31T20:08:29.760056Z 2 [ERROR] InnoDB: Table sakai/mfr_label_t contains 2 indexes inside InnoDB, which is different from the number of indexes 1 defined in MySQL
2021-07-31T20:08:29.817483Z 2 [ERROR] InnoDB: Table sakai/mfr_label_t contains 3 indexes inside InnoDB, which is different from the number of indexes 1 defined in MySQL
2021-07-31T20:08:29.862178Z 2 [ERROR] InnoDB: Table sakai/mfr_membership_item_t contains 2 indexes inside InnoDB, which is different from the number of indexes 1 defined in MySQL
2021-07-31T20:08:29.911144Z 2 [ERROR] InnoDB: Table sakai/mfr_membership_item_t contains 3 indexes inside InnoDB, which is different from the number of indexes 1 defined in MySQL
2021-07-31T20:08:29.953577Z 2 [ERROR] InnoDB: Table sakai/mfr_membership_item_t contains 4 indexes inside InnoDB, which is different from the number of indexes 1 defined in MySQL
csev commented 3 years ago

Thanks for the progress.

jonespm commented 3 years ago

Okay, I pushed a change to the shell script to switch from MySQL over to MariaDB. You'll have to clean the database directory with ./sakai-dock.sh clean_data. (I updated the README.)

The good news is that it now starts up with no errors on either Tomcat or Maria. There are still some MySQL names left around that I should probably clean up. The first-time startup without gRPC FUSE does take about 15 minutes (INFO: Server startup in [1009591] milliseconds). I'm guessing it'd be about 30 with that option.

Startups after the DDL has run and the wars are exploded are a lot faster. After it came up I restarted it, and it took 3.5 minutes (INFO: Server startup in [211726] milliseconds).
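That bracketed figure is milliseconds; a quick conversion confirms the 3.5 minutes:

```shell
# Tomcat reports startup time in milliseconds; convert to minutes.
awk -v ms=211726 'BEGIN { printf "%.1f minutes\n", ms / 60000 }'
# prints "3.5 minutes"
```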

First-time startup could possibly be made faster too, as it is a lot faster on Windows. Another idea could be to run the database (probably MariaDB) installed on the localhost rather than in a container. I believe containers can use host.docker.internal to reach the localhost network now, which makes that easier. This should also be faster on OSX.
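The host-database idea could be sketched like this. Hypothetical: the sakai.properties line shown in the comment is an assumption about how this setup would wire the JDBC URL, so check Sakai's database configuration docs before relying on it.

```shell
# Run only Tomcat in Docker and point it at a MariaDB installed on the host.
# host.docker.internal resolves to the host automatically on Docker Desktop
# (Mac/Windows); on Linux it needs the explicit --add-host mapping below.
docker run -d --name sakai-tomcat \
  --add-host host.docker.internal:host-gateway \
  -p 8080:8080 \
  tomcat:9-jdk11
# ...and in sakai.properties, something like:
#   url@javax.sql.BaseDataSource=jdbc:mariadb://host.docker.internal:3306/sakai
```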

jonespm commented 3 years ago

I'm not seeing this issue anymore with the switch to mariadb. Let me know if you're still seeing it.

Also, if you uncheck "Use gRPC FUSE", your builds should be faster. This disk option is just a known issue with Docker on OSX at the moment. Hopefully they'll support Mutagen as part of Docker; that's something I'd like to look at, but I don't think it's related to this issue.