earthcubeprojects-chords / chords

EarthCube CHORDS application code
GNU General Public License v2.0
25 stars 13 forks source link

Portal crashes when importing config file (Development) #241

Open ryangooch opened 7 years ago

ryangooch commented 7 years ago

After clicking "Import" to import a configuration file that I generated (using an exported configuration file as a template), the portal throws the following error:

This page isn’t working

localhost unexpectedly closed the connection. ERR_CONTENT_LENGTH_MISMATCH

The portal configuration file seems to be the correct format, since about 20 instruments are configured correctly, while the rest have been generated but only assigned names, not variables. It appears that it was working properly, then simply broke.

ryangooch commented 7 years ago

The import config file can be viewed here if needed

ryangooch commented 7 years ago

Based on conversation with @mdye last week, it appeared that the import config only crashed when data was already present in a portal. I completely removed and re-installed CHORDS on my machine, however, and it continues to break in the same way mentioned above.

MisterMartin commented 7 years ago

I was able to import this configuration, see attached image.

When you removed CHORDS from your system, don't forget to remove the docker volumes containing the databases:

$ docker volume ls
DRIVER              VOLUME NAME
...
local               chords_grafana-data
local               chords_influxdb-data
local               chords_mysql-data
local               chords_postfix-spool
...
$ docker volume rm chords_mysql-data chords_influxdb-data

If you have done this and it still crashes, then let's look at the log file during the import. Get CHORDS running, and then monitor the Rails log file while you are doing the imnport:

$ docker exec -it chords_app tail -f log/production.log

screen shot 2017-06-12 at 2 07 21 pm

ryangooch commented 7 years ago

Thanks for the response.

So, I believe I've been fully deleting the Docker images/volumes/etc correctly. I just did it again, monitoring the Rails log file as suggested, and it works, until it just doesn't. There's no error message thrown in the log file, but I get a "This page isn't working. localhost unexpectedly closed the connection. ERR_CONTENT_LENGTH_MISMATCH" error in browser after a minute or so.

I can attach log files if that would be helpful.

ryangooch commented 7 years ago

I was looking at other logs while doing this process and I found an error thrown at the time the import crashes in /var/log/nginx/error.log.

The exact wording:

2017/06/12 21:39:56 [error] 27#0: *356 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.19.0.1, server: localhost, request: "POST /profiles/import_configuration HTTP/1.1", upstream: "http://unix:/tmp/unicorn.chords.sock/profiles/import_configuration", host: "localhost", referrer: "http://localhost/profiles/import_configuration"

2017/06/12 21:39:56 [error] 27#0: *356 readv() failed (104: Connection reset by peer) while reading upstream, client: 172.19.0.1, server: localhost, request: "POST /profiles/import_configuration HTTP/1.1", upstream: "http://unix:/tmp/unicorn.chords.sock:/500.html", host: "localhost", referrer: "http://localhost/profiles/import_configuration"

MisterMartin commented 7 years ago

Good sleuthing! Pretty sure it is saying that the configuration import is going slowly in Rails.

Here is a short article about nginx 110 timeout errors. The recommended solution is to optimize the Rails action that is going slowly. We should take a close look at the configuration import code.

What kind of machine are you running this on? We have found recently that on Terrie's mac-mini, mysql transactions are happening at an inexplicably slow rate. We need to track that down. Perhaps the same thing is happening with your system.

ryangooch commented 7 years ago

Thanks for the response!

I'm running this on Linux (Mint 18). Interestingly, when I tried it in a virtual machine with a fresh Linux Mint 18 install it worked perfectly. While this may still be a deeper issue, would you let me know if my process for completely removing all traces of the Docker CHORDS information is correct?

  1. Bring the CHORDS containers down $ docker-compose -p chords down

  2. Remove Docker images $ docker rmi $(docker images -a)

  3. Remove Docker volumes $ docker volume rm $(docker volume ls -q)

  4. Then download and start the portals by the same process outlined in Running CHORDS:

    
    curl -O  https://raw.githubusercontent.com/NCAR/chords/master/chords_control

Create configuration files (chords.yml and .env will be created)

(Be prepared to enter an admin password of your choosing.)

python chords_control --config

Update the CHORDS software

python chords_control --update

python chords_control --run

mdye commented 7 years ago

All those steps are correct.

You should probably also remove all the containers before removing the images :

docker rm $(docker ps -qa)

MisterMartin commented 7 years ago

My intuition still tells me that it is a mysql performance problem. When you watched production.log, you should see the mysql transactions scrolling by quite quickly.

When we saw a similar problem on the mac-mini, the transactions appeared every few seconds, and some even seemed to be dropped. One difference was that the problem showed up when seeding the mysql database with the ontology. But both of these problems are similar in that the Rails app may be bogged down in many database writes.

The SQL transaction times are listed in the log. On my system (MacBook Pro), each one uses less than a ms. (I don't know why each transaction is listed twice, with different times):

D, [2017-06-13T19:06:20.237763 #63] DEBUG -- :    (0.2ms)  BEGIN
D, [2017-06-13T19:06:20.238786 #63] DEBUG -- :   SQL (0.4ms)  INSERT INTO `vars` (`instrument_id`, `name`, `measured_property_id`, `units`, `shortname`, `id`, `created_at`, `updated_at`) VALUES (79, 'CASA Differential Phase', 316, 'deg', 'phidp', 2443, '2017-06-13 19:06:20', '2017-06-13 19:06:20')
D, [2017-06-13T19:06:20.240130 #63] DEBUG -- :    (0.9ms)  COMMIT
D, [2017-06-13T19:06:20.240941 #63] DEBUG -- :    (0.2ms)  BEGIN
D, [2017-06-13T19:06:20.241927 #63] DEBUG -- :   SQL (0.3ms)  INSERT INTO `vars` (`instrument_id`, `name`, `measured_property_id`, `units`, `shortname`, `id`, `created_at`, `updated_at`) VALUES (79, 'CASA Specific Differential Phase', 316, 'deg/km', 'kdp', 2444, '2017-06-13 19:06:20', '2017-06-13 19:06:20')
D, [2017-06-13T19:06:20.242884 #63] DEBUG -- :    (0.7ms)  COMMIT
mdye commented 7 years ago

Ryan reported the portal is still crashing upon configuration import after the update to v 0.9.0

mdye commented 7 years ago

The log files on csu-ahil report the error as:

JSON::ParserError (A JSON text must at least contain two octets!): app/controllers/profiles_controller.rb:69:inimport_configuration'`

mdye commented 7 years ago

Ryan, could you upload the .json configuration file you are attempting to import (or is it the same one as before?)

ryangooch commented 7 years ago

I created a gist for it here: import_config_dfw_2017_07_06.json. You may want to check the latest error message though, I found I had messed something up with the code I generate these with. However, with the correct file, the portal still crashes upon import.

mdye commented 7 years ago

Thanks Ryan. YEs, there is a different error message now. I'm looking into the cause.

MisterMartin commented 7 years ago

It may have something to do with the logo image (or the lack of one). I reworked the image handling awhile back to deal with errors like this. Try resetting the logo image (so there is none), and try the export/import.

Along these lines, if the configuration contains any binary data fields, this might be causing the problem. Translation into/from JSON requires special handling in this case.

On Thu, Jul 6, 2017 at 5:05 PM, Mike Dye notifications@github.com wrote:

Thanks Ryan. YEs, there is a different error message now. I'm looking into the cause.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/NCAR/chords/issues/241#issuecomment-313542973, or mute the thread https://github.com/notifications/unsubscribe-auth/ABn9viKFX9LIlEZSzzo7bnqC7839As-Gks5sLWhRgaJpZM4NttT2 .

mdye commented 7 years ago

I think I figures out at least a part of what's going on here.

The "import configuration" was implemented assuming that the entire configuration for a portal would be imported - including the profile configuration, sites, and measured properties.

The dfw config files looks like it has definitions for instruments and vars, but not sites, profile or measured properties.

I'm looking at modifying the existing import so that only certain model types (like instruments + vars) can be imported.