Open ryangooch opened 7 years ago
Based on conversation with @mdye last week, it appeared that the import config only crashed when data was already present in a portal. I completely removed and re-installed CHORDS on my machine, however, and it continues to break in the same way mentioned above.
I was able to import this configuration, see attached image.
When you removed CHORDS from your system, don't forget to remove the docker volumes containing the databases:
$ docker volume ls
DRIVER VOLUME NAME
...
local chords_grafana-data
local chords_influxdb-data
local chords_mysql-data
local chords_postfix-spool
...
$ docker volume rm chords_mysql-data chords_influxdb-data
If you have done this and it still crashes, then let's look at the log file during the import. Get CHORDS running, and then monitor the Rails log file while you are doing the imnport:
$ docker exec -it chords_app tail -f log/production.log
Thanks for the response.
So, I believe I've been fully deleting the Docker images/volumes/etc correctly. I just did it again, monitoring the Rails log file as suggested, and it works, until it just doesn't. There's no error message thrown in the log file, but I get a "This page isn't working. localhost unexpectedly closed the connection. ERR_CONTENT_LENGTH_MISMATCH" error in browser after a minute or so.
I can attach log files if that would be helpful.
I was looking at other logs while doing this process and I found an error thrown at the time the import crashes in /var/log/nginx/error.log.
The exact wording:
2017/06/12 21:39:56 [error] 27#0: *356 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 172.19.0.1, server: localhost, request: "POST /profiles/import_configuration HTTP/1.1", upstream: "http://unix:/tmp/unicorn.chords.sock/profiles/import_configuration", host: "localhost", referrer: "http://localhost/profiles/import_configuration"
2017/06/12 21:39:56 [error] 27#0: *356 readv() failed (104: Connection reset by peer) while reading upstream, client: 172.19.0.1, server: localhost, request: "POST /profiles/import_configuration HTTP/1.1", upstream: "http://unix:/tmp/unicorn.chords.sock:/500.html", host: "localhost", referrer: "http://localhost/profiles/import_configuration"
Good sleuthing! Pretty sure it is saying that the configuration import is going slowly in Rails.
Here is a short article about nginx 110 timeout errors. The recommended solution is to optimize the Rails action that is going slowly. We should take a close look at the configuration import code.
What kind of machine are you running this on? We have found recently that on Terrie's mac-mini, mysql transactions are happening at an inexplicably slow rate. We need to track that down. Perhaps the same thing is happening with your system.
Thanks for the response!
I'm running this on Linux (Mint 18). Interestingly, when I tried it in a virtual machine with a fresh Linux Mint 18 install it worked perfectly. While this may still be a deeper issue, would you let me know if my process for completely removing all traces of the Docker CHORDS information is correct?
Bring the CHORDS containers down
$ docker-compose -p chords down
Remove Docker images
$ docker rmi $(docker images -a)
Remove Docker volumes
$ docker volume rm $(docker volume ls -q)
Then download and start the portals by the same process outlined in Running CHORDS:
curl -O https://raw.githubusercontent.com/NCAR/chords/master/chords_control
python chords_control --config
python chords_control --update
python chords_control --run
All those steps are correct.
You should probably also remove all the containers before removing the images :
docker rm $(docker ps -qa)
My intuition still tells me that it is a mysql performance problem. When you watched production.log, you should see the mysql transactions scrolling by quite quickly.
When we saw a similar problem on the mac-mini, the transactions appeared every few seconds, and some even seemed to be dropped. One difference was that the problem showed up when seeding the mysql database with the ontology. But both of these problems are similar in that the Rails app may be bogged down in many database writes.
The SQL transaction times are listed in the log. On my system (MacBook Pro), each one uses less than a ms. (I don't know why each transaction is listed twice, with different times):
D, [2017-06-13T19:06:20.237763 #63] DEBUG -- : (0.2ms) BEGIN
D, [2017-06-13T19:06:20.238786 #63] DEBUG -- : SQL (0.4ms) INSERT INTO `vars` (`instrument_id`, `name`, `measured_property_id`, `units`, `shortname`, `id`, `created_at`, `updated_at`) VALUES (79, 'CASA Differential Phase', 316, 'deg', 'phidp', 2443, '2017-06-13 19:06:20', '2017-06-13 19:06:20')
D, [2017-06-13T19:06:20.240130 #63] DEBUG -- : (0.9ms) COMMIT
D, [2017-06-13T19:06:20.240941 #63] DEBUG -- : (0.2ms) BEGIN
D, [2017-06-13T19:06:20.241927 #63] DEBUG -- : SQL (0.3ms) INSERT INTO `vars` (`instrument_id`, `name`, `measured_property_id`, `units`, `shortname`, `id`, `created_at`, `updated_at`) VALUES (79, 'CASA Specific Differential Phase', 316, 'deg/km', 'kdp', 2444, '2017-06-13 19:06:20', '2017-06-13 19:06:20')
D, [2017-06-13T19:06:20.242884 #63] DEBUG -- : (0.7ms) COMMIT
Ryan reported the portal is still crashing upon configuration import after the update to v 0.9.0
The log files on csu-ahil report the error as:
JSON::ParserError (A JSON text must at least contain two octets!): app/controllers/profiles_controller.rb:69:in
import_configuration'`
Ryan, could you upload the .json configuration file you are attempting to import (or is it the same one as before?)
I created a gist for it here: import_config_dfw_2017_07_06.json. You may want to check the latest error message though, I found I had messed something up with the code I generate these with. However, with the correct file, the portal still crashes upon import.
Thanks Ryan. YEs, there is a different error message now. I'm looking into the cause.
It may have something to do with the logo image (or the lack of one). I reworked the image handling awhile back to deal with errors like this. Try resetting the logo image (so there is none), and try the export/import.
Along these lines, if the configuration contains any binary data fields, this might be causing the problem. Translation into/from JSON requires special handling in this case.
On Thu, Jul 6, 2017 at 5:05 PM, Mike Dye notifications@github.com wrote:
Thanks Ryan. YEs, there is a different error message now. I'm looking into the cause.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/NCAR/chords/issues/241#issuecomment-313542973, or mute the thread https://github.com/notifications/unsubscribe-auth/ABn9viKFX9LIlEZSzzo7bnqC7839As-Gks5sLWhRgaJpZM4NttT2 .
I think I figures out at least a part of what's going on here.
The "import configuration" was implemented assuming that the entire configuration for a portal would be imported - including the profile configuration, sites, and measured properties.
The dfw config files looks like it has definitions for instruments and vars, but not sites, profile or measured properties.
I'm looking at modifying the existing import so that only certain model types (like instruments + vars) can be imported.
After clicking "Import" to import a configuration file that I generated (using an exported configuration file as a template), the portal throws the following error:
The portal configuration file seems to be the correct format, since about 20 instruments are configured correctly, while the rest have been generated but only assigned names, not variables. It appears that it was working properly, then simply broke.