infinit / infinit

The Infinit policy-based software-defined storage platform.
http://infinit.sh

Issues creating Infinit volumes #46

Open aligg73 opened 7 years ago

aligg73 commented 7 years ago

Hi,

I followed the steps as listed here, for Debian, as I am on Ubuntu 16.04 LTS. All the steps complete normally until I mount 2 volumes. For each individual command, Infinit gives me this notice:

[infinit.model.doughnut.Doughnut] [main] Not using UTP although network configuration selects 'all' protocols. UTP has been temporarily deprecated. Force with INFINIT_UTP=1.

[ infinit.prometheus ] [main] infinit::prometheus::Prometheus(0x7f7ee487b410): creation failed, metrics will not be exposed: null context when constructing CivetServer. Possible problem binding to port.

Additionally, I get this notice upon mounting the 2nd volume:

[ infinit.prometheus ] [main] infinit::prometheus::Prometheus(0x7f7ee487b410): creation failed, metrics will not be exposed: null context when constructing CivetServer. Possible problem binding to port.

What should I do to prevent or fix this? Thank you, Alex

Dimrok commented 7 years ago

Hi @aligg73.

[infinit.model.doughnut.Doughnut] [main] Not using UTP although network configuration selects 'all' protocols. UTP has been temporarily deprecated. Force with INFINIT_UTP=1.

This is just a warning. We have some unresolved bugs triggered by nodes using the UTP protocol so it has been deprecated. You can ignore that message.

[ infinit.prometheus ] [main] infinit::prometheus::Prometheus(0x7f7ee487b410): creation failed, metrics will not be exposed: null context when constructing CivetServer. Possible problem binding to port.

It's also just a warning. Our default port for Prometheus (monitoring system) is 8080 with no fallback. You can also ignore this warning.
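Since the Prometheus endpoint is fixed at port 8080, one plausible cause of this warning is that something else on the machine (possibly an earlier Infinit instance) already holds that port. A minimal sketch for checking this, assuming a Linux host where `ss` from iproute2 is available; the helper name `port_in_use` is hypothetical and not part of Infinit:

```shell
# port_in_use PORT: read `ss -ltn` style output on stdin and succeed
# if any listening socket's local address ends in :PORT.
port_in_use() {
  awk -v port="$1" '
    NR > 1 {                       # skip the header row
      n = split($4, parts, ":")    # $4 is the "Local Address:Port" column
      if (parts[n] == port) found = 1
    }
    END { exit !found }
  '
}

# Typical use (commented out so the snippet has no side effects):
#   ss -ltn | port_in_use 8080 && echo "port 8080 is already taken"
```

If the port turns out to be busy, the warning only means metrics are not exposed; the volume itself is unaffected, as stated above.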

However, if you cannot mount your two volumes, there is an issue elsewhere... Can you access the first mountpoint?

If not, can you copy/paste the last ~5 lines of logs here please?

aligg73 commented 7 years ago

Will do, how do I get these last few lines please? Thank you.

aligg73 commented 7 years ago

I could export the result of infinit doctor all here, but it includes messages for the root user, which I am not using.

Dimrok commented 7 years ago

Just copy/paste from your terminal. By default, we send logs to the standard error channel.

I could export the result of infinit doctor all here, but it includes messages for the root user, which I am not using.

You can use infinit doctor all --as <your username> --verbose.
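Because the logs go to standard error, they can be saved with an ordinary `2>` redirection and trimmed with `tail`. A minimal sketch; `run_with_log` is a hypothetical helper, and the commented usage line simply reuses a mount command from this thread:

```shell
# run_with_log LOGFILE CMD [ARGS...]: run CMD with stderr saved to LOGFILE,
# then print the last 5 lines of that log and return CMD's exit status.
run_with_log() {
  log=$1
  shift
  "$@" 2>"$log"
  rc=$?
  tail -n 5 "$log"
  return "$rc"
}

# Example (not executed here):
#   run_with_log /tmp/infinit.log infinit volume mount --as turtle2017 \
#     --name mysqlstore --mountpoint /home/mysqlstore --cache --publish
```

The tail is only printed once the command exits, so this suits the fatal-error case; for a mount that keeps running, reading the log file from a second terminal is the simpler option.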

aligg73 commented 7 years ago

I got a different message the first time around, but now, after I deleted the volume, created it again and then mounted it, I get:

infinit: fatal error: unable to access mountpoint: boost::filesystem::status: Transport endpoint is not connected: "/home/mysqlstore"

Here's the order of commands I have just issued:

infinit volume delete --as turtle2017 turtle2017/mysqlstore
infinit volume create --as turtle2017 --network dishcupboard --name mysqlstore --push
infinit volume mount --as turtle2017 --name mysqlstore --mountpoint /home/mysqlstore --allow-root-creation --cache --publish &

The first 2 commands worked without issue.

Dimrok commented 7 years ago

Can you try elsewhere pls? Like /tmp/mysqlstore.

Edit: You probably have a dead mountpoint because the previous instance of infinit wasn't shut down correctly. You can unmount it by using sudo umount /home/mysqlstore or fusermount -u /home/mysqlstore.
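The "Transport endpoint is not connected" error is typical of a FUSE mountpoint whose backing process has died. A minimal sketch of a pre-mount cleanup, assuming Linux (`/proc/mounts`) and a mountpoint path without spaces; `clean_stale_mount` is a hypothetical helper name:

```shell
# clean_stale_mount DIR: detach DIR only if it is a dead mount.
clean_stale_mount() {
  dir=$1
  # Not listed in the mount table: nothing to clean up.
  if ! grep -qsF " $dir " /proc/mounts; then
    echo "not mounted"
    return 0
  fi
  # Listed and still reachable: the mount is alive, leave it alone.
  if ls -d "$dir" >/dev/null 2>&1; then
    echo "mounted and healthy"
    return 0
  fi
  # Listed but unreachable: a dead FUSE mount, so detach it.
  fusermount -u "$dir" && echo "stale mount removed"
}

# Example (not executed here):
#   clean_stale_mount /home/mysqlstore
```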

aligg73 commented 7 years ago

root@dishchef:~# infinit volume create --as turtle2017 --network dishcupboard --name redissstore --push
Locally saved volume descriptor "turtle2017/redissstore".
Remotely created volume descriptor "turtle2017/redissstore".

root@dishchef:~# infinit volume mount --as turtle2017 --name redisstore --mountpoint /tmp/redisstore --allow-root-creation --cache --publish &

[1] 8075

root@dishchef:~# infinit: fatal error: volume "turtle2017/redisstore" does not exist
^C
[1]+  Exit 1                  infinit volume mount --as turtle2017 --name redisstore --mountpoint /tmp/redisstore --allow-root-creation --cache --publish

root@dishchef:~# 

edit: [Dimrok] Markdown.

Dimrok commented 7 years ago

In this case, you misspelled the name: the volume was created as redissstore (three s) but you tried to mount redisstore (two s).
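A typo like this can be caught before mounting by checking the name against the local volume list. A minimal sketch; `has_volume` is a hypothetical helper, and it assumes only that `infinit volume list` prints each volume name as a separate word or as `owner/name` (the exact output format is not shown in this thread):

```shell
# has_volume NAME: read a volume listing on stdin and succeed only if
# NAME appears as a whole word, so "redisstore" does not match "redissstore".
has_volume() {
  grep -qwF -- "$1"
}

# Example (not executed here):
#   infinit volume list | has_volume redisstore || echo "no such volume"
```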

aligg73 commented 7 years ago

Ehhhh yes, not very smart, thanks for pointing that out. I can confirm it is working now, both in the /tmp/ location and in my desired location, after unmounting it, like you had suggested in your edit.

Thanks a lot!

aligg73 commented 7 years ago

I might have cheered too early. After continuing with your Debian tutorial (https://infinit.sh/get-started), I now get this on machine B:

[ infinit.filesystem ] [fuse loop] unable to find root block, allow creation with --allow-root-creation

This happens after I issue:

infinit volume mount --as turtle2017 --name redisstore --mountpoint /home/redisstore --cache --publish &

All previous steps went without issue.

Dimrok commented 7 years ago

Hi. This means the second node didn't find the first running instance of Infinit (neither through a --peer option nor from the hub through --publish).

Did you specify --publish when mounting the first one?

You can check https://beyond.infinit.sh/networks/turtle2017/dishcupboard (key endpoints).

aligg73 commented 7 years ago

Yes, I did specify --publish upon mounting the first volumes on node A.

Dimrok commented 7 years ago

Are your nodes on two different machines?

Do you know if they can see each other (through a public IP or a local subnetwork IP)? You can try pinging the IPs listed on https://beyond.infinit.sh/networks/turtle2017/dishcupboard.

aligg73 commented 7 years ago

The nodes are indeed on 2 different machines, and their respective IP addresses can be pinged either way. I noticed this port in the hub link you have just provided: 34649. Does this mean this port should be open on both machines?

Thanks

aligg73 commented 7 years ago

Please let me know if I should open port 34649 or the problem is caused by something else, thanks.

Dimrok commented 7 years ago

Hi.

Yes you should. You can even force its value by using --port <port number>.

aligg73 commented 7 years ago

Thanks. I got the infinit volume mount commands running now on Node B. I only get to see that it is running in the background. No messages like fetch endpoints, running network, etc.

When I do infinit volume list I see both volumes. But when I try ls /home/redisstore, my command line freezes.

I get this when running infinit doctor all --as turtle2017:

[elle.reactor.network.UTPSocket] [UTPSocket(192.241.139.66:5458) shutdown] UTPSocket(192.241.139.66:5458): UTP server was destroyed before us
CONFIGURATION INTEGRITY:
[OK] User
[OK] Silos
[OK] Networks
[OK] Volumes
[OK] Drives
[OK] Leftovers

SYSTEM SANITY:
[WARNING] Username: turtle2017
  Reason: default system user name "turtle2017" is not compatible with Infinit naming policies, you'll need to use --as <other_name>
[OK] Space left
[OK] Environment
[OK] Permissions
[OK] FUSE

CONNECTIVITY:
[OK] Connection to https://beyond.infinit.sh
[OK] Local interfaces
[OK] NAT
[ERROR] UPnP:
  Reason: UPNP device discovery failure: 0
[ERROR] Protocols:
  [ERROR] RDV UTP (XOR)
    Reason: Couldn't connect after 3 seconds
  [ERROR] UTP (XOR)
    Reason: Couldn't connect after 3 seconds

edit: [Dimrok] Markdown.
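In a report like the one above, only the `[WARNING]` and `[ERROR]` entries and their `Reason:` lines need attention. A minimal sketch for filtering them out; `doctor_problems` is a hypothetical helper, and it assumes the report layout shown in this comment:

```shell
# doctor_problems: read an `infinit doctor` report on stdin and keep only
# flagged entries plus the Reason line that follows each of them.
doctor_problems() {
  awk '
    /\[(ERROR|WARNING)\]/ { print; keep = 1; next }   # flagged entry
    keep && /^[ \t]*Reason:/ { print; next }          # its explanation
    { keep = 0 }                                      # anything else resets
  '
}

# Example (not executed here):
#   infinit doctor all --as turtle2017 | doctor_problems
```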

Dimrok commented 7 years ago

Hmm... you should at least have the basic messages (like you said fetch endpoints, running network, etc.).

Can you run your command with logs activated (ELLE_LOG_LEVEL=*infinit*:TRACE infinit volume run ...) and paste the output here please?

aligg73 commented 7 years ago

Not sure this is what you meant:

root@dish1:~# ELLE_LOG_LEVEL=*infinit*:TRACE infinit volume run --as turtle2017 --name redisstore --mountpoint /home/redisstore --cache --publish &
[2] 30206
root@dish1:~#

..so only a background process, nothing else.

Dimrok commented 7 years ago

It should write more logs in your terminal, weird.

Can you run ps faux | grep infinit to make sure you don't have other instances of infinit running pls?
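Note that a plain `ps faux | grep infinit` also lists the `grep` process itself. A small sketch that filters `ps` output down to Infinit volume processes only; `infinit_procs` is a hypothetical helper name, and the pattern covers just the `volume mount` and `volume run` commands used in this thread:

```shell
# infinit_procs: read `ps` output on stdin and keep Infinit volume processes.
infinit_procs() {
  # The [i] bracket keeps this grep from matching its own command line.
  grep -E '[i]nfinit volume (mount|run)'
}

# Example (not executed here):
#   ps faux | infinit_procs
```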

aligg73 commented 7 years ago

Yes, a few too many were running; I killed all those processes. Now when trying again I get this:

root@dish1:~# ELLE_LOG_LEVEL=*infinit*:TRACE infinit volume run --as turtle2017 --name redisstore --mountpoint /home/redisstore --cache --publish &
[1] 6537
root@dish1:~# infinit: fatal error: unable to access mountpoint: boost::filesystem::status: Transport endpoint is not connected: "/home/redisstore"

Dimrok commented 7 years ago

Cool. At least, you have some output.

Now, you need to unmount this folder by running fusermount -u /home/redisstore before trying to remount at the same location.

aligg73 commented 7 years ago

Thanks, that got me a step closer. Unmounting the folder now led to this output:

[           infinit.model.Model            ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]   infinit::model::doughnut::Doughnut(0x83a5030800): insert dht::UB(0x62694b0201)
[        infinit.model.blocks.Block        ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]     dht::UB(0x62694b0201): seal at version null
[infinit.model.doughnut.consensus.Consensus] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]     dht::consensus::Cache(0x17072b0): store dht::UB(0x62694b0201)
[  infinit.model.doughnut.consensus.Cache  ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]       dht::consensus::Cache(0x17072b0): store 0x62694b0201
[infinit.model.doughnut.consensus.Consensus] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]         dht::consensus::Paxos(0x1707cb0): store dht::UB(0x62694b0201)
[  infinit.model.doughnut.consensus.Paxos  ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]           dht::consensus::Paxos(0x1707cb0): store dht::UB(0x62694b0201)
[         infinit.overlay.Overlay          ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]             infinit::overlay::kouncil::Kouncil(0x83a5030800): allocate 1 nodes for 0x62694b0201
[     infinit.model.doughnut.Doughnut      ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker]   infinit::model::doughnut::Doughnut(0x83a5030800): failed to store user block for turtle2017: no peer available for insertion of 0x62694b0201

edit: [Dimrok] Markdown.

Dimrok commented 7 years ago

no peer available for insertion means that this instance of Infinit couldn't connect to any other instances (providing storage).

Probably because the other peer was killed or is not accessible (anymore). Can you kill every single instance of Infinit running on your nodes and restart them with ELLE_LOG_LEVEL=*overlay*:TRACE infinit volume run/mount ... and copy both logs here pls?

aligg73 commented 7 years ago

OK, I killed every instance on node B and node A, then I started it on node A, all good. Then from node B, I got this output:

[ infinit.overlay.kouncil.Kouncil ] [generator] infinit::overlay::kouncil::Kouncil(0x83a5030800): block 0x62694b0201 not found, checking all 0 peers
[ infinit.model.doughnut.consensus.Paxos ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] could not find any owner for 0x62694b027ae85665ad4af466e89feefbafee4e354f6cad9e20fd05d28e795601
[ infinit.model.doughnut.Doughnut ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] infinit::model::doughnut::Doughnut(0x83a5030800): store user block for turtle2017 at 0x62694b0201
[ infinit.model.Model ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] infinit::model::doughnut::Doughnut(0x83a5030800): insert dht::UB(0x62694b0201)
[ infinit.model.blocks.Block ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] dht::UB(0x62694b0201): seal at version null
[infinit.model.doughnut.consensus.Consensus] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] dht::consensus::Cache(0x2692290): store dht::UB(0x62694b0201)
[ infinit.model.doughnut.consensus.Cache ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] dht::consensus::Cache(0x2692290): store 0x62694b0201
[infinit.model.doughnut.consensus.Consensus] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] dht::consensus::Paxos(0x2692c90): store dht::UB(0x62694b0201)
[ infinit.model.doughnut.consensus.Paxos ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] dht::consensus::Paxos(0x2692c90): store dht::UB(0x62694b0201)
[ infinit.overlay.Overlay ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] infinit::overlay::kouncil::Kouncil(0x83a5030800): allocate 1 nodes for 0x62694b0201
[ infinit.model.doughnut.Doughnut ] [infinit::model::doughnut::Doughnut(0x83a5030800): user blocks checker] infinit::model::doughnut::Doughnut(0x83a5030800): failed to store user block for turtle2017: no peer available for insertion of 0x62694b0201

I wanna save you from having to apply Markdown all the time to my output, how do I do this? Thanks

Dimrok commented 7 years ago

I wanna save you from having to apply Markdown all the time to my output, how do I do this?

Use the triple backquotes formatting: ``` your code here ```

block 0x62694b0201 not found, checking all 0 peers

I'm pretty concerned about this line: the 0 peers shows that this node cannot connect to the other peer.

Can you find the IP of both nodes and, on each of them, run: infinit doctor networking

This should show you something like

To perform tests, run the following command from another node:
infinit doctor networking --tcp-port 35853 --utp-port 43350 --xored-utp-port 42320 --host <address_of_this_machine>

Can you follow the instruction and run that line on the other machine (with the right IP address filled in)?

Then please repeat the procedure in the other direction too.
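The peer count in that Kouncil log line is a quick health signal when re-running a mount with overlay tracing. A minimal sketch that pulls the number out of a log stream; `peer_counts` is a hypothetical helper, and the pattern is taken from the single log line quoted in this thread:

```shell
# peer_counts: print the N from every "checking all N peers" log line on stdin.
peer_counts() {
  sed -n 's/.*checking all \([0-9][0-9]*\) peers.*/\1/p'
}

# Example (not executed here; "..." stands for the usual mount options):
#   ELLE_LOG_LEVEL='*overlay*:TRACE' infinit volume mount ... 2>&1 | peer_counts
```

A value of 0 corresponds to the situation described here: the node has no connection to any other peer.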

aligg73 commented 7 years ago

from Node B, using Node A IP:

root@dish1:~# infinit doctor networking --tcp-port 43795 --utp-port 49239 --xored-utp-port 52170 --host [NodeA_IP]
Client mode (version: 0.8.0):
TCP:
  Unable to establish connection: network operation timed out
UTP:
  Unable to establish connection: connection refused
Segmentation fault
root@dish1:~# 

from Node A, using Node B IP:

root@dishchef:~# infinit doctor networking --tcp-port 44453 --utp-port 53751 --xored-utp-port 58101 --host [NodeB_IP]
Client mode (version: 0.8.0):
TCP:
  Unable to establish connection: network operation timed out
UTP:
  Unable to establish connection: connection refused
Segmentation fault
root@dishchef:~# 

Could it be because all these ports sound new to me, and they are not allowed on the firewall? At least I don't see them opened when I do: sudo ufw status
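If `ufw` is the firewall in use, its status listing can be checked for the port a node listens on. A minimal sketch; `ufw_allows_tcp` is a hypothetical helper, it assumes the usual `ufw status` table with `To`, `Action` and `From` columns, and port 34649 is simply the one seen on the hub page earlier in this thread:

```shell
# ufw_allows_tcp PORT: read `ufw status` output on stdin and succeed
# if there is an ALLOW rule for PORT or PORT/tcp.
ufw_allows_tcp() {
  awk -v port="$1" '
    ($1 == port || $1 == (port "/tcp")) && /ALLOW/ { found = 1 }
    END { exit !found }
  '
}

# Example (not executed here):
#   sudo ufw status | ufw_allows_tcp 34649 || sudo ufw allow 34649/tcp
```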

edit: [Dimrok] Markdown (Do not forget to jump a line after opening and before closing triple backquotes).

Dimrok commented 7 years ago

Arf... segfaults. I'll check why.

By the way you can force the port and the protocol. Try with: infinit doctor networking --protocol=tcp --tcp-port <the port>.

aligg73 commented 7 years ago

Yes that works beautifully both ways, I get feedback like this from both ends, picking a port that is open on both sides:

TCP:
  Upload:
    44ms for 5.0 MiB (113.6 MiB/sec)
  Download:
    40ms for 5.0 MiB (125.0 MiB/sec)

So now that that is cleared up, does it mean I have to run this command again:

infinit volume mount --as turtle2017 --name redisstore --mountpoint /home/redisstore --cache --publish &

but with --port and --protocol appended, like you suggested in the doctor networking example? Thanks again.

aligg73 commented 7 years ago

Hi,

Hope you can quickly review my last question and give me the heads up (or not). Appreciate your time taken, thanks. Alex

Dimrok commented 7 years ago

Oh I missed this one.

So you should:

Can you try that pls.

aligg73 commented 7 years ago

I managed to run the first one from Node A, although with some warnings:

root@dishchef:~# infinit volume mount --as turtle2017 --name mysqlstore --mountpoint /home/mysqlstore --port 4001 --allow-root-creation --cache --publish &
[1] 23506
root@dishchef:~# Running network "turtle2017/dishcupboard".
[infinit.Network] [main] client version: 0.8.0
[infinit.model.Model] [main] infinit::model::Model(0x1cc7c40): compatibility version 0.8.0
[infinit.model.doughnut.Doughnut] [main] Not using UTP although network configuration selects 'all' protocols. UTP has been temporarily deprecated. Force with INFINIT_UTP=1.
[ infinit.model.doughnut.Local  ] [main] dht::Local(0x1ccc5e8, 0xfb22925b00): listen on tcp://[::]:4001
[      infinit.prometheus       ] [main] infinit::prometheus::Prometheus(0x7f7414f4c410): listen on 127.0.0.1:8080
[      infinit.prometheus       ] [main] infinit::prometheus::Prometheus(0x7f7414f4c410): creation failed, metrics will not be exposed: null context when constructing CivetServer. Possible problem binding to port.
Remotely created endpoints for "turtle2017/dishcupboard".
Running volume "turtle2017/mysqlstore".

Then, when I run the second one on node B with this command (I masked the IP here; the real command used the IP of Node A):

root@dish1:~# infinit volume mount --as turtle2017 --name mysqlstore --mountpoint /home/mysqlstore --port 4001 --peer [IPNode_A] --cache --publish &
[1] 5356
root@dish1:~# Running network "turtle2017/dishcupboard".
[infinit.Network] [main] client version: 0.8.0
[infinit.model.Model] [main] infinit::model::Model(0x1037e20): compatibility version 0.8.0
[infinit.model.doughnut.Doughnut] [main] Not using UTP although network configuration selects 'all' protocols. UTP has been temporarily deprecated. Force with INFINIT_UTP=1.
[      infinit.prometheus       ] [main] infinit::prometheus::Prometheus(0x7fa48129e410): listen on 127.0.0.1:8080
[  infinit.model.doughnut.Dock  ] [rdv_connect] Dock RDV connect: exception thread termination: rdv_connect
infinit: fatal error: invalid endpoint: [IPNode_A]

I can confirm port 4001 is open on both Nodes.