agarbato / unicloud

Unison file sync web interface
MIT License

Files not being transferred #16

Closed emil10001 closed 2 years ago

emil10001 commented 2 years ago

I've got two separate machines on my local network that I'd like to get synced up together. The client is able to connect to the server, and all of the files are currently residing on the server, to be transferred to the client.

Here are the errors I'm getting:

[ 3660 ] Event Log
[ my-client ][ share1 ][ 2022-01-25 16:25:35 ][ 2022-01-25 16:25:35 ][ 0s ][ KO ]
Unison 2.51.3 (ocaml 4.13.0): Contacting server...
Permission denied, please try again.
Permission denied, please try again.
unison@my-server: Permission denied (publickey,password,keyboard-interactive).
Fatal error: Lost connection with the server

I'm confused about how the client is able to connect, but is unable to transfer files. When I attempt to run the ssh command manually, it gives me a challenge for a password, and I don't know what that is, or where to find it. That challenge seems like it may be causing the issue that I'm seeing reported above.

agarbato commented 2 years ago

Hi @emil10001. If the client is able to reach the server, the port is open, so this looks like the client key not being properly added to the authorized keys. Did you complete the client registration? Do you see any message on the homepage or on the clients page? When you activate a client, its key is added to the authorized keys of the server. In /data/.ssh/authorized_keys on the server you should see the client public key.

Make sure to persist data volumes on both client and server. Would you mind sharing your docker-compose.yml file? You could also clean up the data folder on both server and client (make sure to remove hidden files too) and start fresh using the docker-compose.yml provided in the repo.

agarbato commented 2 years ago

Just to give you more context.

On server /data/.ssh/unicloud_authorized_keys you should have the client key. Something like this:

command="/usr/bin/unison -server" ssh-rsa AAAAB3NzaC1yc2......= unison@aa210add955c CLIENT:testing-client1

On client /data/.ssh you should have both id_rsa and id_rsa.pub. The pub key should match the one on authorized_keys but this is normally done automatically when you register the client.
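
A quick way to check that match by hand is to compare only the base64 key body, since the server entry carries a command= prefix and a CLIENT: suffix that the client's .pub file does not have. The following is a minimal sketch, not something unicloud ships: key_is_registered is a hypothetical helper name, and it assumes you have copied the client's id_rsa.pub to a place where it can be read next to the server file (/tmp/client_id_rsa.pub below is just an example path).

```shell
# key_is_registered PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Hypothetical helper (not part of unicloud): succeeds if the base64 body of
# the public key appears anywhere in the authorized keys file.
key_is_registered() {
    pubkey_file="$1"
    auth_file="$2"
    # An OpenSSH public key line is "<type> <base64> [comment]"; take field 2.
    key_body=$(awk 'NR == 1 { print $2 }' "$pubkey_file")
    # Return 2 if the .pub file was empty or malformed.
    [ -n "$key_body" ] || return 2
    # -F: fixed-string match (base64 contains + and /), -q: exit status only.
    grep -qF -- "$key_body" "$auth_file"
}

# Example (paths are assumptions; adjust to where the files actually are):
#   key_is_registered /tmp/client_id_rsa.pub /data/.ssh/unicloud_authorized_keys \
#       && echo "client key is registered" || echo "client key is missing"
```

If this reports the key as missing even though the client shows as registered, that would be consistent with a stale registration and a reason to re-register the client.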

If you clone the repo and run docker-compose up -d you can test it locally first. The docker-compose.yml is already configured to persist volumes from the repo folders.

emil10001 commented 2 years ago

Thank you for the quick reply and help!

Updates at the top

Re-registered the pub key for the client and now my errors are different:

[ 3776 ] Event Log
[ my-win-client ][ share1 ][ 2022-01-26 02:54:34 ][ None ][ Nones ][ KO ]
Sync was interrupted

More info below

Ah, that's helpful context, and I may try the cleanup step if I can't fix it another way.

Did you complete the client registration? - OK, while I had done the registration, and the client was reporting itself as connected, I checked the client's ssh pub key, and it was not in the server's unicloud_authorized_keys file. I'm not clear why it changed, but I was messing with the config a bit, and bouncing between which box was going to be the server.

Do you see any message on the homepage or on the clients page? - The homepage says "All clients are registered." The clients page is weird: one of the two clients is showing a Last Seen, which is the one that matches the client name provided in my docker-compose file, but the one with a fake name where the key matches is not showing a Last Seen, though it does have a Join Date. When I click on the client name for the one with the fake name and real key, I get an error page.

Make sure to persist data volumes on both client and server. - what does this mean?

Server docker-compose.yml running on a physical Ubuntu machine with a bunch of other docker containers

version: '3.3'
services:
  # SERVER
  test_unicloud_server:
    image: agarbato1/unison-unicloud
    container_name: unison-server
    ports:
      - "2222:22"
      - "5001:80"
    environment:
      - USER=unison
      - USER_UID=1000
     # - USER_GIDS=33,14
      - SERVER_HOSTNAME=unicloud_testing_server
      - SERVER_UI_USERNAME=admin
      - SERVER_UI_PASSWORD=pass
      - SERVER_DEBUG=True
      - ROLE=SERVER
    networks:
      unison:
        aliases:
          - my-uni-server
    # added dns to get the container to resolve hostnames on my local network
    dns:
      - 192.168.1.1
    volumes:
      - type: bind
        source: /mnt/ssd1/misc/unison_data/server
        target: /data
      - type: bind
        source: /mnt/ssd1/misc/unison_data/server/shares
        target: /shares
      - type: bind
        source: /mnt/ssd1/media
        target: /shares/share1/my-srv/media
      - type: bind
        source: /mnt/ssd2/media2
        target: /shares/share1/my-srv/media2
networks:
  unison:
    driver: bridge
    ipam:
      driver: default

Client docker-compose.yml running on a physical Windows machine, inside an Ubuntu WSL2 environment with WSL2 enabled in docker

version: '3.3'
services:
  # CLIENT
  test_unicloud_client:
    image: agarbato1/unison-unicloud
    container_name: unison-client
    environment:
      - CLIENT_HOSTNAME=my-win-client
      - ROLE=CLIENT
      - USER=unison
      - USER_UID=1000
      # my-ubu-srv is the local hostname of the physical box the container is running on
      - SERVER_HOSTNAME=my-ubu-srv 
      - SERVER_PORT=2222 # needed to change this from default to get it to work
      - SERVER_SHARE=share1
      - SHARE_IGNORE=.git*|.idea|.unison|.DS_Store
      - API_PROTOCOL=http
      - API_PORT=5001 # needed to change this from default to get it to work
      - SYNC_INTERVAL=15
    restart: on-failure
    volumes:
      - type: bind
        source: ./local_tests/client
        target: /data
      - type: bind
        source: ./local_tests/client/share
        target: /data/share
      - type: bind
        source: /mnt/d/my-srv
        target: /data/share/my-srv
    networks:
      unison-client:
    # added dns to get the container to resolve hostnames on my local network
    dns:
      - 192.168.1.1

networks:
  unison-client:
    driver: bridge
    ipam:
      driver: default

agarbato commented 2 years ago

Hi again,

My suggestion is still to start fresh with a clean config, since there might be some broken config around if you swapped the client/server roles between machines. I'll explain how to clean up and then give you some more hints on how to troubleshoot the connection.

Before you start:

Your client docker-compose has 3 volumes. This is wrong. You only need 2: one for the data volume and one for the share data to sync. You should not change the target of the share volume; it must be /data/share. The source can be any path on your filesystem. A client can only sync one share.

This should be ok:

volumes:
  - type: bind
    source: ./local_tests/client
    target: /data
  - type: bind
    source: /mnt/d/my-srv
    target: /data/share

The server docker-compose volumes seem a bit confusing to me. When you add a share from the UI, the path must match the volume target. The final target can be created automatically or manually (see below).

I would change it to something like this:

volumes:
  - type: bind
    source: /mnt/ssd1/misc/unison_data/server
    target: /data
  - type: bind
    source: /mnt/ssd1/media
    target: /shares
  - type: bind
    source: /mnt/ssd2/media2/share2
    target: /shares/share2

The app can create the destination folder when you add a share.

For share1, this results in /mnt/ssd1/media/share1 being created automatically. Generally speaking, if you only have one media volume you just need 2 volumes on the server: one for data and the other for shares, with target /shares. The final paths will be created automatically.

Since you have media on a different disk, to add the second share on ssd2 select Create Folder: No and Path: /shares/share2.

Maybe start with a single /shares volume, then you can add more.

Cleanup

Stop everything with:

docker-compose stop
docker-compose rm

On the server machine:

rm -rf /mnt/ssd1/misc/unison_data/server
mkdir -p /mnt/ssd1/misc/unison_data/server

On the client machine:

rm -rf ./local_tests/client
mkdir -p ./local_tests/client

Troubleshooting:

If you get "Sync was interrupted", it means the key exchange is OK now and the client key is in the server's authorized_keys. You can test connectivity manually from inside the docker container.

From the client machine

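# open a shell inside the client container
docker-compose exec test_unicloud_client /bin/bash
# install telnet (apk is the Alpine package manager)
apk update
apk add busybox-extras
# switch to the user that runs the sync
su - unison
# check name resolution and that the server's ssh port is reachable
nslookup my-ubu-srv
busybox-extras telnet my-ubu-srv 2222
# run a sync manually with the unicloud profile
unison unicloud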

Check that the path and port are correct in the unison profile, /data/.unison/unicloud.prf. If they are not, there must have been an error in the share path when you created it.

An example profile should look like this:

root=ssh://unison@unicloud_server:22//shares/share1
root=/data/share
clientHostName=testing-client1
batch = true
auto = true
prefer = newer
log = false
#place additional params with UNISON_PARAMS env
ignore = Name .git*
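
To compare the profile against the client settings at a glance, the host, port, and path can be pulled out of the remote root line. This is only a sketch under the assumption that the remote root is written in the ssh://user@host:port//path form shown in the example above; profile_remote is a hypothetical helper name, not a unison or unicloud command.

```shell
# profile_remote PROFILE
# Hypothetical helper: prints "host port path" for the first ssh:// root
# found in a Unison profile. Prints nothing if no such root is present.
profile_remote() {
    # Groups: 1 = optional "user@", 2 = host, 3 = port, 4 = absolute path.
    # The single "/" after the port is the separator; the path keeps its own "/".
    sed -n 's#^root *= *ssh://\([^@/]*@\)\{0,1\}\([^:/]*\):\([0-9]*\)/\(/.*\)$#\2 \3 \4#p' "$1" \
        | head -n 1
}

# Example, run inside the client container as the unison user:
#   profile_remote /data/.unison/unicloud.prf
```

The three fields would presumably be expected to line up with SERVER_HOSTNAME, SERVER_PORT, and the share path on the server, so in the setup from this thread the port should read 2222 rather than 22.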

If you want to remove just the client, you can remove it from the UI. On the clients page, use the tools icon to manage clients ("/clients/mgt").

If you remove a client and then plan to register it again, make sure you delete the data folder as I suggested above.

rm -rf ./local_tests/client
mkdir -p ./local_tests/client

I hope you can fix it with a clean config :-)

Let me know how I can make the doc better if something is unclear.

emil10001 commented 2 years ago

Wanted to give a quick update, as those suggestions helped, but it's still not working.

Unison 2.51.3 (ocaml 4.13.0): Contacting server...
Connected [//b99072bc7e7d//shares/share1 -> //my-win-client//data/share]
Looking for changes
Waiting for changes from server
Connection to my-ubuntu closed by remote host.
Fatal error: Lost connection with the server

So, now it's some ssh issue that I'm starting to chase down. I'll update when I either figure out what's going on, or get stuck again =) . Hopefully I'll get some time over the weekend to poke at this more.

agarbato commented 2 years ago

Thanks for your feedback @emil10001. To me this looks related to the WSL2 Windows environment. I have never tested on Windows, but I have seen a lot of people complaining about ssh not working properly there.

I can't test it, but this could be a possible fix. Try adding an ssh config file on your Windows client machine:

/data/.ssh/config

Host *
KexAlgorithms=ecdh-sha2-nistp521

https://github.com/microsoft/WSL/issues/4208
https://github.com/microsoft/WSL/issues/5755

Let me know if it works.

emil10001 commented 2 years ago

Hmm, thanks for the tip, but that didn't seem to help. I did also change the timezone to my local time, and rebuilt both client and server again. Back to registering, and seeing the correct timestamps now, but still hitting the same error.

One thing that I am struggling with at this point is how to get logging enabled for ssh and where to find the logs. I'm seeing a few separate sshd_config files, and enabling logging in them doesn't seem to actually generate logs that I can find, either in /var/log or in /data/log. I'm hoping to get more hints about what's going on if I can get the server to start logging from the ssh server.

Additionally, from the client side, if I attempt to ssh to the server as the user unison, it gives me the following:

$ ssh unison@my-uni-server -p 2222
Unison 2.51 with OCaml >= 4.01.2 

So, no obvious error, but no command prompt either. I didn't leave it open long enough to timeout, as I'd rather try to chase server-side logs on that. I did verify that the client container was resolving the server hostname correctly.

agarbato commented 2 years ago

It's normal not to get a command prompt in response to an ssh request. The client is only authorized to run the unison command. This is a security constraint that prevents an ssh client from running unauthorized commands on your server. It's defined by this line in the server's authorized_keys:

command="/usr/bin/unison -server" ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB... CLIENT:testing-client1

Did you try running the command below to start a sync? This is basically what the client does every SYNC_INTERVAL, plus some API calls to register events.

unison unicloud

This should test the ssh connection end to end, since the sync is done with unison over ssh.

If you want to get ssh logs you need to make some changes to the ssh config and to supervisord on the server.

In /data/etc/supervised.conf replace the whole [program:sshd] section as follows. Make sure you replace all of it, because you need to add the -e parameter to the sshd command to enable log redirection.

[program:sshd]
command = /usr/sbin/sshd -e -D -f /etc/sshd_config
stdout_logfile = /data/log/sshd.log
stderr_logfile = /data/log/sshd.log

In /etc/sshd_config you need to add or uncomment these lines:

# Logging
SyslogFacility AUTH
LogLevel VERBOSE

Reload supervisord with:

supervisorctl reload

Wait a few seconds, then confirm that every process shows RUNNING with:

supervisorctl status

You should then see your logs in /data/log/sshd.log.

Hope it helps :-)

emil10001 commented 2 years ago

Made some progress with this. I have rebuilt a few times at this point, but one change that I kept across attempts was a change to the client's ~/.ssh/config, both to add the algorithm, but also a keep-alive:

Host *
KexAlgorithms=ecdh-sha2-nistp521
ServerAliveInterval 60
ServerAliveCountMax 10
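
For anyone reproducing this, the file can be written in one step. This is a minimal sketch, assuming the unison user's home inside the client container is /data (as the /data/.ssh paths earlier in the thread suggest) and that it is run as that user; write_client_ssh_config is a hypothetical helper name. OpenSSH rejects a user config file that other users can write to, hence the chmod calls.

```shell
# write_client_ssh_config [SSH_DIR]
# Hypothetical helper: writes the ssh client options from this thread into
# SSH_DIR/config (default /data/.ssh, an assumption about the container layout).
write_client_ssh_config() {
    ssh_dir="${1:-/data/.ssh}"
    # The directory must exist and should only be accessible by its owner.
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    # Quoted heredoc delimiter: the content is written literally, no expansion.
    cat > "$ssh_dir/config" <<'EOF'
Host *
    KexAlgorithms ecdh-sha2-nistp521
    ServerAliveInterval 60
    ServerAliveCountMax 10
EOF
    # ssh refuses a config file that is writable by group or others.
    chmod 600 "$ssh_dir/config"
}

# Example, inside the client container as the unison user:
#   write_client_ssh_config /data/.ssh
```

Because the file lives under the persisted /data volume, it should survive container rebuilds as long as the data folder itself is not wiped.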

I was able to run with my larger directories removed, and see bi-directional syncing happening with some empty text files, though there were some hiccups with file permissions.

I added back my actual data directories on the server side, which are around 6TB, so I'm expecting this to take a while. Currently, I'm seeing some temp directories on the client receiving the files, and it's starting to get populated with the expected structure.

The only issue now is that the first sync ran for about 7 hours before failing, and since then it's been failing every 5 minutes, all with "Sync was interrupted". When I inspect the temp directories, the directories will randomly disappear and re-appear.

I just changed all permissions to my user, so that they show up as being owned by unison:unison in the server container, and will let it run for a few hours and see if things improve.

Thanks again for all the help! I'll keep you posted on progress on this. Hoping the file permissions are the final issue.

Edit: I still suspect that this is a WSL2 issue. I'm going to close this out. Thanks for all the help, I think there's a ton of useful info here for anyone else who wants to chase down this issue.

agarbato commented 2 years ago

Thanks @emil10001 Glad to help.

Since ssh options can also be added to the unison profile, I'll add a new docker ENV to pass them, which avoids creating an ssh config file.

agarbato commented 2 years ago

I had a quick look at the code and there's already a generic ENV for that. Just add this variable to the client's docker-compose.yml. No single or double quotes are needed.

UNISON_PARAMS=sshargs = -o KexAlgorithms=ecdh-sha2-nistp521 -o ServerAliveInterval=60 -o ServerAliveCountMax=10