indigo-dc / onedata

Indigo mirror of http://github.com/onedata/onedata
Apache License 2.0
RC5 hangs on startup. #2

Closed bwegh closed 7 years ago

bwegh commented 7 years ago

Similar to #1, yet this time the certificate is again the default cowboy certificate, which can be found on GitHub together with its key.

I ran the oneprovider again with my script and here is the output

groundnuty commented 7 years ago

@bwegh As your log shows, the startup fails at:

oneprovider_kit                | * service_onepanel: add_users
oneprovider_kit                | Error: Service Error
oneprovider_kit                | Description: Action 'deploy' for a service 'oneprovider' terminated with an error.
oneprovider_kit                | Module: service_onepanel
oneprovider_kit                | Function: add_users
oneprovider_kit                | Hosts: node1.oneprovider.localhost
oneprovider_kit                | For more information please check the logs.

adding users. This is due to an unfortunate bug in rc5 (which I am told will be fixed in rc6). In your configuration, please change the names of the users admin and user to admin2 and user2. A part from your log with the changes needed:

70,71c62,63
<             "admin2":
<               password: "yourpass"
---
>             "admin1":
>               password: "yourpass"
73,74c65,66
<             "user2":
<               password: "yourpass"
---
>             "user1":
>               password: "yourpass"
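If the configuration is generated by a script, the rename can be sketched as below. This is only a minimal illustration: the helper name `rename_users` and the shape of the `users` mapping (user name mapped to its settings) are assumptions made for the example, not part of the onedata tooling.

```python
# Hypothetical helper: the function name and the shape of `users` are
# assumptions for illustration, not part of the onedata tooling.
def rename_users(users, renames):
    """Return a copy of `users` with keys renamed according to `renames`."""
    renamed = {}
    for name, settings in users.items():
        new_name = renames.get(name, name)
        # Refuse to silently overwrite an existing entry.
        if new_name in renamed:
            raise ValueError(f"duplicate user name after renaming: {new_name}")
        renamed[new_name] = settings
    return renamed


users = {
    "admin": {"password": "yourpass"},
    "user": {"password": "yourpass"},
}
fixed = rename_users(users, {"admin": "admin2", "user": "user2"})
print(sorted(fixed))
```

Only the user names change; passwords and any other per-user settings are carried over untouched.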

Please let us know whether that solved the problem.

bwegh commented 7 years ago

I tried, and users were added, certificates seemed to be loaded, but it hung for more than 10 minutes at service_op_worker: wait_for_init

complete output:

Creating oneprovider_kit
Attaching to oneprovider_kit
oneprovider_kit                | Starting op_panel: [  OK  ]
oneprovider_kit                | 
oneprovider_kit                | Configuring oneprovider:
oneprovider_kit                | * service_onepanel: set_cookie
oneprovider_kit                | * service_onepanel: purge_node
oneprovider_kit                | * service_onepanel: create_tables
oneprovider_kit                | * service_onepanel: add_default_users
oneprovider_kit                | * service_onepanel: add_nodes
oneprovider_kit                | * service: save
oneprovider_kit                | * service_couchbase: configure
oneprovider_kit                | * service_couchbase: start
oneprovider_kit                | * service_couchbase: wait_for_init
oneprovider_kit                | * service_couchbase: init_cluster
oneprovider_kit                | * service_couchbase: rebalance_cluster
oneprovider_kit                | * service_couchbase: status
oneprovider_kit                | * service: save
oneprovider_kit                | * service_cluster_manager: configure
oneprovider_kit                | * service_cluster_manager: start
oneprovider_kit                | * service_cluster_manager: status
oneprovider_kit                | * service: save
oneprovider_kit                | * service_op_worker: configure
oneprovider_kit                | * service_op_worker: setup_certs
oneprovider_kit                | * service_op_worker: start
oneprovider_kit                | * service_op_worker: wait_for_init

I will just wait for RC6 and its documentation ...

groundnuty commented 7 years ago

@bwegh I'm glad to hear that users and certs were added successfully. What kind of machine are you running it on? I doubt that rc6 will solve the issue you are experiencing, since this is the first time I have seen it, and I manually test every rc against every scenario in getting-started prior to release (a human tester on top of the integration tests that we have :) ). Could you please send us the content of /home/bas/.config/onedata/persistence/var/log/ ?

Michal

bwegh commented 7 years ago

Ran it again and waited about 50 minutes at 'configure oneprovider'. oneprovider_log.txt

op_worker.zip

groundnuty commented 7 years ago

We need the entire /var/log, not only op_worker. Could you pack it and attach it here?

bwegh commented 7 years ago

here you go ... oneprovider.zip

groundnuty commented 7 years ago

We are not able to reproduce what you are experiencing :( rc6 will be out today/tomorrow - I will update this ticket when it's out.

stefanonicotri commented 7 years ago

Also in my case it never started. I have tried several times, but it always hangs on "Attaching to onezone-1" or on "onezone-1 Starting oz_panel: [ OK ]". If I try to connect to the web interface, the error is "Fatal error: session cannot be initialized".

groundnuty commented 7 years ago

We identified this as a 'race condition' at startup. As always with concurrency bugs, it is not easy to reproduce. Suffice it to say that in my own tests it never happened :( One colleague experienced it yesterday, but he tried 2 more times and it all worked. It affects only the startup; once startup passes, onezone/oneprovider work as expected. We have found a developer responsible for it :) and it will be fixed in rc6.

marcvs commented 7 years ago

So since it was fixed in rc6, bas could you run your tests again?

bwegh commented 7 years ago

I did, yet it hangs now at registering as the bari zone is not running, so I will test again once the bari onezone is up.

groundnuty commented 7 years ago

fixed since rc7.