uniuuu / zotprime

Fully packaged on-premise Zotero platform
https://www.zotero.org
GNU General Public License v3.0

CR: Support for hostname, adding CA, https in docker compose setup. #16

Closed bradleyharden closed 1 year ago

bradleyharden commented 1 year ago

Hi,

I'm using the current production branch (2.7.0), but I can't seem to get file syncing to work. I can sync entries in my library, but not files. In particular, I get a series of errors like this in the Zotero client:

[JavaScript Error: "S3 returned 0 (1/3FCKK2HV) -- retrying upload"]

The client also never stops trying to sync. I can see repeated POST requests to the items path in the dataserver logs, but when I open the MinIO browser, the zotero and zotero-fulltext buckets are completely empty.

Interestingly, like @aitianshi, I also noticed that the .env file S3HOST points to 10.5.5.1 instead of 10.5.5.7. However, I tried switching the IP to 10.5.5.7 as well as minio, but neither seems to affect the behavior.

Finally, if I change the S3HOST to my-vm-url:9000 and expose port 9000 of the MinIO container, I get a different behavior. With that setup, I do actually see files in the MinIO bucket. But I am only ever able to sync exactly two files: no more, no less. After that, the dataserver starts issuing a series of curl error 7: couldn't connect to server errors, and I get 500 errors in the Zotero client.

Am I doing something wrong? Should file syncing work? Or is it not supported yet?
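A minimal sketch for checking which S3 endpoint the stack is configured with and whether it answers. It assumes the `.env` file uses plain `KEY=value` lines and that `S3HOST` holds a `host` or `host:port` value. The helper names `env_value` and `probe_minio` are hypothetical; `/minio/health/live` is MinIO's liveness endpoint. Since the "S3 returned 0" message is reported by the Zotero client, it is probably worth running the probe from the client machine as well as from the server.

```shell
#!/bin/sh
# env_value FILE KEY: print the value of KEY from a KEY=value style file.
# Assumes no quoting and no `export` prefix; the last assignment wins.
env_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# probe_minio HOST[:PORT]: return 0 if MinIO's liveness endpoint answers
# over plain HTTP within 5 seconds.
probe_minio() {
    curl --silent --fail --max-time 5 --output /dev/null \
        "http://$1/minio/health/live"
}

# Example (not run automatically):
#   host=$(env_value .env S3HOST)
#   probe_minio "$host" && echo "reachable" || echo "unreachable"
```

If the probe succeeds from inside the Docker network but fails from the client machine, that would point at the address in `S3HOST` rather than at MinIO itself.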

uniuuu commented 1 year ago

Hi @bradleyharden

[JavaScript Error: "S3 returned 0 (1/3FCKK2HV) -- retrying upload"]

Finally, if I change the S3HOST to my-vm-url:9000

That means you're using the VM setup, where your server is on a different computer/instance from the client. Ensure that when building both client and server you use the configuration for VM and not localhost.

Interestingly, like @aitianshi, I also noticed that the .env file S3HOST points to 10.5.5.1 instead of 10.5.5.7. However, I tried switching the IP to 10.5.5.7 as well as minio, but neither seems to affect the behavior.

The correct configuration is 10.5.5.1, not 10.5.5.7.

May I ask whether you use a hostname for your VM, such as my-vm-url? If you use a hostname, it may not work if one of your internal DNS resolvers isn't resolving it. The setup will work if you use the VM's IP address for both server and client configuration, unless you are using the localhost setup scenario. Refer to https://github.com/uniuuu/zotprime#localhost-and-vm-installation for details. URLs are currently supported for the Kubernetes setup.

Please let me know which one you use, localhost or VM. Then try the above approach, and if you still have the issue I'll request logs and more details about how the setup was done.

bradleyharden commented 1 year ago

@uniuuu, sorry, I'll try to be more explicit.

That means you're using the VM setup, where your server is on a different computer/instance from the client. Ensure that when building both client and server you use the configuration for VM and not localhost.

Yes, I am running in a VM setup. And yes, I made sure to build both the client and the server using the VM hostname.

The correct configuration is 10.5.5.1, not 10.5.5.7.

Ok. Thanks for the clarification. I thought it must be correct, because I looked at the Git blame for that line, and the change looked intentional.

May I ask whether you use a hostname for your VM, such as my-vm-url? If you use a hostname, it may not work if one of your internal DNS resolvers isn't resolving it.

Yes, I am using the hostname. The VM is running on an internal, private network. I usually don't have any problem with DNS over that network. For example, I can use my laptop to access the MinIO console at my-vm-url:9001. Unless there is something subtle going on that I don't understand, I don't think DNS is the problem here. The Zotero client is able to connect to both the dataserver and the stream-server. I can see evidence of both connections in the logs. However, just to be sure, I will try using the IP of the VM instead.

Please let me know which one you use, localhost or VM. Then try the above approach, and if you still have the issue I'll request logs and more details about how the setup was done.

I think I've done everything correctly. I tried to follow the instructions as closely as possible. Thank you in advance for any additional help you can provide.

Oh, I should add a few more details, since they might be relevant. I am trying to run this on an internal network, and we have our own root certificate. I added that certificate into the dataserver, stream-server, tinymce-clean-server and the zotero-client Dockerfiles. I used the docker-compose-dev.yml file, so all the containers were built locally. Is that enough? Do any of the other Dockerfiles need to be modified?

For those 4 Dockerfiles, I had to use two slightly different approaches, depending on whether I could use wget and https without installing any additional packages. For the zotero-client and the dataserver, I added this:

RUN mkdir -p /usr/local/share/ca-certificates/ \
 && wget https://internal-url/cert.cer -O /usr/local/share/ca-certificates/cert.crt \
 && cat /usr/local/share/ca-certificates/cert.crt >> /etc/ssl/certs/ca-certificates.crt

And for the stream-server and tinymce-clean-server, I added this:

COPY cert.crt /usr/local/share/ca-certificates/cert.crt
RUN mkdir -p /etc/ssl/certs \
 && touch /etc/ssl/certs/ca-certificates.crt \
 && cat /usr/local/share/ca-certificates/cert.crt >> /etc/ssl/certs/ca-certificates.crt

I also add

RUN npm config set cafile /etc/ssl/certs/ca-certificates.crt

to any Dockerfiles running npm.
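Because the snippets above append to the bundle with `>>`, repeated runs would add the certificate again. A minimal sketch of an idempotent append follows, assuming the downloaded certificate is PEM-encoded (a DER-encoded `.cer` file would need converting first). The helper names `append_ca_once` and `count_certs` are hypothetical and not part of ZotPrime.

```shell
#!/bin/sh
# append_ca_once CERT BUNDLE: append CERT to BUNDLE unless it is already
# there. Returns 1 if CERT does not look like a PEM certificate.
append_ca_once() {
    cert=$1
    bundle=$2
    # The first base64 line after the BEGIN marker acts as a cheap fingerprint.
    marker=$(sed -n '/BEGIN CERTIFICATE/{n;p;q;}' "$cert")
    [ -n "$marker" ] || return 1
    if [ -f "$bundle" ] && grep -qF -- "$marker" "$bundle"; then
        return 0
    fi
    cat "$cert" >> "$bundle"
}

# count_certs BUNDLE: print how many certificates the bundle contains.
count_certs() {
    grep -c 'BEGIN CERTIFICATE' "$1"
}

# Example (not run automatically):
#   append_ca_once /usr/local/share/ca-certificates/cert.crt \
#       /etc/ssl/certs/ca-certificates.crt
```

Comparing `count_certs` before and after a build step is a quick way to confirm the root certificate actually landed in the bundle.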

uniuuu commented 1 year ago

Hi @bradleyharden Please try switching to an IP address instead of a hostname (or URL). The hostname doesn't work because DNS resolvers do not resolve it inside the Docker container. At this point, using a hostname is not tested and not supported. Moreover, if hostname support is needed for the docker compose setup, the best approach would be to add a reverse proxy container that can handle both URLs and hostnames, with SSL termination as a bonus.

For URLs I would recommend using on-premise Kubernetes (Microk8s) instead of docker compose. I'll write a guide for it and let you know when it's added.

Yes, I am running in a VM setup. And yes, I made sure to build both the client and the server using the VM hostname.

Yes, I am using the hostname. The VM is running on an internal, private network. I usually don't have any problem with DNS over that network. For example, I can use my laptop to access the MinIO console at my-vm-url:9001. Unless there is something subtle going on that I don't understand, I don't think DNS is the problem here. The Zotero client is able to connect to both the dataserver and the stream-server. I can see evidence of both connections in the logs. However, just to be sure, I will try using the IP of the VM instead.

bradleyharden commented 1 year ago

Hi @uniuuu,

I still see the same behavior when using an IP address.

To eliminate any possibility that the internal network or Dockerfile changes have any effect, I tried to reproduce the problem using the localhost setup on a local laptop connected to the open internet.

Here's what I did:

At this point, I found that the minio/mc container was giving me errors with curl. I guessed that something had changed in the past few days, so I pinned the miniomc.Dockerfile to minio/mc:RELEASE.2023-10-14T01-57-03Z. Then I re-ran sudo docker compose build.
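The pinning step can be sketched as a small `sed` rewrite. This assumes `miniomc.Dockerfile` contains a `FROM minio/mc` line (with or without a tag); the actual contents of that file are not shown in this thread, and `pin_image` is a hypothetical helper.

```shell
#!/bin/sh
# pin_image FILE IMAGE TAG: print FILE with "FROM IMAGE" or
# "FROM IMAGE:<old-tag>" rewritten to "FROM IMAGE:TAG".
# The trailing ([[:space:]]|$) group keeps longer image names
# that merely start with IMAGE from matching.
pin_image() {
    sed -E "s|^(FROM[[:space:]]+$2)(:[^[:space:]]+)?([[:space:]]|\$)|\1:$3\3|" "$1"
}

# Example (not run automatically): review the output before overwriting.
#   pin_image miniomc.Dockerfile minio/mc RELEASE.2023-10-14T01-57-03Z
```

Printing to stdout instead of editing in place makes it easy to check the change before replacing the Dockerfile.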

Then, I tried again:

At this point, I still see the same behavior. I get S3 returned 0 errors in the console. I see POST requests to the dataserver. But when I log in to MinIO through localhost:9001, I don't see any files in either of the two buckets.

Am I doing something wrong here?

aitianshi commented 1 year ago

@uniuuu, apart from the preprint issue we found, I still see the issue @bradleyharden describes for other documents unless I change the MinIO IP. Something seems to be missing from the default values when running the whole Zotero stack locally.

uniuuu commented 1 year ago

To eliminate any possibility that the internal network or Dockerfile changes have any effect, I tried to reproduce the problem using the localhost setup on a local laptop connected to the open internet.

* Copied `.env_example` to `.env` and `docker-compose-dev.yml` to `docker-compose.yml`

Hi @bradleyharden @aitianshi You're good to go with the IP address; there's no need to use localhost unless your client and server are on the same single machine without a VM. If you still have the error after the step described below, we'll troubleshoot it with the IP address installation first. Please use the prod YAML file and not dev, as per the guide in the README. There is no build step for the server either; instead it will pull prod images from Docker Hub. From the VM/Localhost installation section:
$ cp docker-compose-prod.yml docker-compose.yml

* Cloned a fresh copy of the repo

* Built the client with `DOCKER_BUILDKIT=1 sudo docker build --progress=plain --file client.Dockerfile --output build .`

Another step you may have missed is switching to the production branch, as per the guide.

$ git clone --recursive https://github.com/uniuuu/zotprime.git
$ cd zotprime
$ git checkout production
bradleyharden commented 1 year ago

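@uniuuu,

* Yes, I am already using the `production` branch, which points to `v2.7.0`.

* I would prefer, if possible, to build all the images myself. That's why I chose `docker-compose-dev.yml`. Is that not a valid choice? If `-prod` works but `-dev` does not, isn't that a bug?

* It is easier for me to reproduce your work if I use the `localhost` configuration, because I can do that on a single laptop connected to the open internet. My VM is on an internal network and all traffic is re-signed with our root certificate. If I want to test with the VM, I need to modify the images to include our root certificate.

Either way, I will try using the -prod file.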

aitianshi commented 1 year ago

@bradleyharden If you can import some files from the outside to your internal network, you can try the solution there: https://github.com/uniuuu/zotprime/issues/12. I had the same issue but I could build everything from a public network then import all images to my private network, suggested by @uniuuu .

bradleyharden commented 1 year ago

@aitianshi, yes, I saw that suggestion in your issue. It could work for me, but it's also pretty easy for me to modify the Dockerfiles and rebuild. I would prefer that, if possible, so that I have a record and a way to reproduce things without manual intervention.

uniuuu commented 1 year ago

@uniuuu,

* Yes, I am already using the `production` branch, which points to `v2.7.0`.

* I would prefer, if possible, to build all the images myself. That's why I chose `docker-compose-dev.yml`. Is that not a valid choice? If `-prod` works but `-dev` does not, isn't that a bug?

* It is easier for me to reproduce your work if I use the `localhost` configuration, because I can do that on a single laptop connected to the open internet. My VM is on an internal network and all traffic is re-signed with our root certificate. If I want to test with the VM, I need to modify the images to include our root certificate.

Either way, I will try using the -prod file.

Hi @bradleyharden

The issue you pointed out, that dev doesn't work on your side, is acceptable, since it's meant for development purposes and not production-level deployment.

It's understood that you want to build the images yourself. But images you build aren't guaranteed to be consistent unless everything works on your side and there are no issues fetching dependencies. So you have to keep control over your build and ensure it wasn't affected by your environment setup or limits. In any case, it's best practice to use prebuilt and verified images. So I have made changes to the production branch to avoid the confusion of having the dev docker-compose file there.

I am open to troubleshooting the image build failure. But for this you have to provide details of exactly what you modify in the images. Please also provide the build output, saved to a file, produced by the following command: BUILDKIT_PROGRESS=plain docker compose build

I have also added a microk8s guide (microk8s-installation), and I encourage switching from docker to k8s for a better server handling experience.

bradleyharden commented 1 year ago

@uniuuu, I won't be able to try anything until later today, but I wanted to clarify one thing now.

In my last attempt, using the localhost configuration, I didn't make any changes to the Dockerfiles, docker-compose.yml or .env file. The only exception was to pin the version of the minio/mc image to RELEASE.2023-10-14T01-57-03Z, because the latest version no longer comes with curl installed.

I will try with the pre-built images, but I think I've done that before and seen the same errors. I will try again to be certain.

uniuuu commented 1 year ago

Hi @bradleyharden @aitianshi I couldn't reproduce the issue on my side. Could you please tell me which Linux distribution you run the server on, and which Docker version?

bradleyharden commented 1 year ago

@uniuuu, sorry, I'm just very busy at the moment. I still want to figure this out. I will try to get back to you soon. Thanks for your help.

bradleyharden commented 1 year ago

@uniuuu, I did as you asked and tried again using the pre-built containers. I just cloned a fresh copy, switched to the most recent production branch (v2.7.2 now), built the Zotero client, ran docker compose up, and then ran ./bin/init.sh. I get the same behavior as before.

To reiterate, I'm running on an Ubuntu 22.04 laptop connected to the open internet. I'm not behind an internal network. I'm also running in the localhost configuration.

Here is the output of docker info:

Client: Docker Engine - Community
 Version:    24.0.7
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.21.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 10
  Running: 10
  Paused: 0
  Stopped: 0
 Images: 10
 Server Version: 24.0.7
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 61f9fd88f79f081d64d6fa3bb1a0dc71ec870523
 runc version: v1.1.9-0-gccaecfc
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.2.0-34-generic
 Operating System: Ubuntu 22.04.2 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.29GiB
 Name: hardebj1-ll3
 ID: f1dea25b-c36a-46f2-8cae-6581dd32a3d4
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
bradleyharden commented 1 year ago

I can also confirm that the suggestion by @aitianshi to change the minio IP in .env does change the behavior. With 10.5.5.7, I can now see files stored in MinIO, but once again, it is only 2 files, even though I would expect more.

uniuuu commented 1 year ago

Hi @bradleyharden

Thank you for the information provided. The issue has been fixed in commit https://github.com/uniuuu/zotprime/commit/aef5cdaa939bf0fee048a33bab38d3d33fa3828d Please try re-cloning the production branch and test to verify that it no longer recurs.

bradleyharden commented 1 year ago

@uniuuu, exposing port 9000 of the MinIO container is something I had tried on my own once before. That change removes the S3 returned 0 errors, but it replaces them with errors of the form:

[JavaScript Error: "HTTP POST http://localhost:8080/users/1/items/3UYBIXN9/file failed with status code 500:

An error occurred"]

I just confirmed that I still see this behavior. I updated my repo to v2.7.3, did a docker system prune and rebuilt everything from scratch. I get the same 500 response I had gotten before. And once again, I see exactly 2 files in the zotero bucket from the MinIO console.

bradleyharden commented 1 year ago

I will copy some lines from the logs that I think might be relevant:

zotprime-zotprime-dataserver-1     | [Sat Nov 11 12:44:16.642020 2023] [php:notice] [pid 738] [client 10.5.5.1:42568] Deadlock found when trying to get lock; try restarting transaction\n\nShard: 0\n\nQuery:\nINSERT INTO storageFiles (hash, filename, size, zip) VALUES (?,?,?,?)\n\nParams:\nArray\n(\n    [0] => 5b859a4546d1c4c432a19362c3e91495\n    [1] => 3UYBIXN9.zip\n    [2] => 1008678\n    [3] => 1\n)\n\n\n in /var/www/zotero/include/DB.inc.php:1303 (POST /users/1/items/3UYBIXN9/file) (05effef2c8)
zotprime-zotprime-dataserver-1     | [Sat Nov 11 12:44:54.364149 2023] [php:warn] [pid 735] [client 10.5.5.1:54624] PHP Warning:  iconv(): Wrong encoding, conversion from "UTF-8" to "ASCII//IGNORE" is not allowed in /var/www/zotero/model/Item.inc.php on line 3159
zotprime-zotprime-dataserver-1     | [Sat Nov 11 12:44:54.365647 2023] [php:notice] [pid 735] [client 10.5.5.1:54624] You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'AS new\n\t\t\t\tON DUPLICATE KEY UPDATE storageFileID=new.storageFileID, mtime=new...' at line 1\n\nShard: 1\n\nQuery:\nINSERT INTO storageFileItems (storageFileID, itemID, mtime, size) VALUES (?,?,?,?) AS new\n\t\t\t\tON DUPLICATE KEY UPDATE storageFileID=new.storageFileID, mtime=new.mtime, size=new.size\n\nParams:\nArray\n(\n)\n\n\n\n\nShard: 1\n\nQuery:\nINSERT INTO storageFileItems (storageFileID, itemID, mtime, size) VALUES (?,?,?,?) AS new\n\t\t\t\tON DUPLICATE KEY UPDATE storageFileID=new.storageFileID, mtime=new.mtime, size=new.size\n\nParams:\nArray\n(\n    [0] => 9\n    [1] => 77\n    [2] => 1699724623000\n    [3] => 1008678\n)\n\n\n in /var/www/zotero/include/DB.inc.php:1303 (POST /users/1/items/3UYBIXN9/file) (a64a43cae6)
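
To see how often each database error occurs, the dataserver log can be summarised with a small filter. This is a rough sketch: `classify_db_errors` is a hypothetical helper, and the service name `zotprime-dataserver` in the usage comment is only inferred from the container name in the log prefix. As far as I know, the `VALUES (...) AS new ... ON DUPLICATE KEY UPDATE col=new.col` row-alias form was introduced in MySQL 8.0.19 and MariaDB rejects it, which would match the syntax error above. Treat that as a hypothesis to check against the database image in use.

```shell
#!/bin/sh
# classify_db_errors: read log lines on stdin and print one summary line
# with a count per known error signature.
classify_db_errors() {
    awk '
        /Deadlock found when trying to get lock/ { deadlock++ }
        /You have an error in your SQL syntax/   { syntax++ }
        END { printf "deadlock=%d syntax=%d\n", deadlock, syntax }
    '
}

# Example (not run automatically; the service name is an assumption):
#   docker compose logs zotprime-dataserver | classify_db_errors
```

A non-zero `syntax` count on every upload attempt would suggest a server/database version mismatch rather than a networking problem.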
bradleyharden commented 1 year ago

As a separate question, I noticed something else recently. The stream server repeatedly prints this line:

zotprime-zotprime-streamserver-1   | [11/Nov/2023:12:56:05 -0500] [0 connections, 0 subscriptions]

When the Zotero client was open, I used to see non-zero values for both connections and subscriptions. But lately, those values are always zero, whether the client is open or not. Is that the expected behavior? Is it relevant to this problem at all?

uniuuu commented 1 year ago

Hi @bradleyharden

zotprime-zotprime-streamserver-1   | [11/Nov/2023:12:56:05 -0500] [0 connections, 0 subscriptions]

Your IP may have changed, so the client can no longer connect to the stream server. Try rebuilding the client and ensure the IP address is set to the VM's IP address. See my comments below.

  • It is easier for me to reproduce your work if I use the localhost configuration, because I can do that on a single laptop connected to the open internet. My VM is on an internal network and all traffic is re-signed with our root certificate. If I want to test with the VM, I need to modify the images to include our root certificate.

If you use a VM, you cannot use the localhost setup. This is probably the root cause of the issue, as this is a confusing part of README.md. The localhost setup means the client and server sit on one single host, because localhost resolves to the loopback IP address 127.0.0.1. If you have a VM, there are already two hosts: one is the VM, with its own IP address, and the second is your laptop, with its own IP address. If you run the client from the laptop, localhost points to the laptop's loopback address. But your server is in the VM, where localhost points to the VM's loopback address. So the client cannot communicate with the server in the VM if you use the localhost setup.

For the VM setup you must use the IP address of the VM. If your VM's network is bridged, that is the VM's own IP address. If the VM's network is behind NAT, it is the IP set as the VM's gateway, and you have to set up port forwarding for the ZotPrime ports.

If this is correct and it was the issue, may I ask what improvements you think could be added to the README to make the choice of setup clearer? In the worst case, if this keeps causing confusion, the best option is to consider removing the localhost setup from the description. I doubt the localhost setup is common now, because it's more for testing, for a quick run, or for when you don't use a VM on the laptop. Here we have a server part and a client part, and the server should have its own host in production. The localhost setup is inherited from the original ZotPrime repository. I don't know the intention behind it, beyond my assumption that it was quick to make at the time and a VM setup would have been additional effort. Anyway, I saw in some posts that people running the original ZotPrime did the VM setup themselves back then by modifying the configuration.
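
The distinction can be sketched as a quick check on the host value used when building the client. `is_loopback` and `setup_kind` are hypothetical helpers, not part of ZotPrime, and the check only looks at the string itself (it does not resolve hostnames).

```shell
#!/bin/sh
# is_loopback HOST: succeed if HOST names the loopback interface, which is
# only meaningful when client and server run on the same machine.
is_loopback() {
    case $1 in
        localhost|localhost:*|127.*|::1|'[::1]'|'[::1]':*) return 0 ;;
        *) return 1 ;;
    esac
}

# setup_kind HOST: print which setup scenario the host value fits.
setup_kind() {
    if is_loopback "$1"; then
        echo "localhost"
    else
        echo "vm"
    fi
}

# Example (not run automatically):
#   setup_kind 192.168.1.20
```

If the server runs in a VM and `setup_kind` reports `localhost` for the value baked into the client, the client will be talking to the laptop's own loopback address rather than to the VM.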

My VM is on an internal network and all traffic is re-signed with our root certificate. If I want to test with the VM, I need to modify the images to include our root certificate.

Hi @bradleyharden Please try switching to an IP address instead of a hostname (or URL). The hostname doesn't work because DNS resolvers do not resolve it inside the Docker container. At this point, using a hostname is not tested and not supported. Moreover, if hostname support is needed for the docker compose setup, the best approach would be to add a reverse proxy container that can handle both URLs and hostnames, with SSL termination as a bonus.

For URLs I would recommend using on-premise Kubernetes (Microk8s) instead of docker compose. I'll write a guide for it and let you know when it's added.

If I understood you correctly, what you are trying to achieve is adding HTTPS (which will require SSL/TLS termination on all ports used). I confirm the current ZotPrime docker compose setup doesn't support it. Modifying the images will probably lead to issues, as it has never been tested. The best approach for HTTPS support is to add a proxy service, which lets you add your certificate and do SSL/TLS termination. I don't find docker compose to be the best solution for production; Docker Swarm and Kubernetes are proven production-grade solutions for that. If the Microk8s setup is something most people would find hard, then switching from docker compose to Docker Swarm with an added proxy service (Caddy/Envoy/Traefik) can be considered.

RUN mkdir -p /usr/local/share/ca-certificates/ \
 && wget https://internal-url/cert.cer -O /usr/local/share/ca-certificates/cert.crt \
 && cat /usr/local/share/ca-certificates/cert.crt >> /etc/ssl/certs/ca-certificates.crt

Installing a CA for each OS is not yet tested or supported, but I'll look into what the best approach to supporting it would be.

[JavaScript Error: "HTTP POST http://localhost:8080/users/1/items/3UYBIXN9/file failed with status code 500:

An error occurred"]
zotprime-zotprime-dataserver-1     | [Sat Nov 11 12:44:16.642020 2023] [php:notice] [pid 738] [client 10.5.5.1:42568] Deadlock found when trying to get lock; try restarting transaction\n\nShard: 0\n\nQuery:\nINSERT INTO storageFiles (hash, filename, size, zip) VALUES (?,?,?,?)\n\nParams:\nArray\n(\n    [0] => 5b859a4546d1c4c432a19362c3e91495\n    [1] => 3UYBIXN9.zip\n    [2] => 1008678\n    [3] => 1\n)\n\n\n in /var/www/zotero/include/DB.inc.php:1303 (POST /users/1/items/3UYBIXN9/file) (05effef2c8)
zotprime-zotprime-dataserver-1     | [Sat Nov 11 12:44:54.364149 2023] [php:warn] [pid 735] [client 10.5.5.1:54624] PHP Warning:  iconv(): Wrong encoding, conversion from "UTF-8" to "ASCII//IGNORE" is not allowed in /var/www/zotero/model/Item.inc.php on line 3159
zotprime-zotprime-dataserver-1     | [Sat Nov 11 12:44:54.365647 2023] [php:notice] [pid 735] [client 10.5.5.1:54624] You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'AS new\n\t\t\t\tON DUPLICATE KEY UPDATE storageFileID=new.storageFileID, mtime=new...' at line 1\n\nShard: 1\n\nQuery:\nINSERT INTO storageFileItems (storageFileID, itemID, mtime, size) VALUES (?,?,?,?) AS new\n\t\t\t\tON DUPLICATE KEY UPDATE storageFileID=new.storageFileID, mtime=new.mtime, size=new.size\n\nParams:\nArray\n(\n)\n\n\n\n\nShard: 1\n\nQuery:\nINSERT INTO storageFileItems (storageFileID, itemID, mtime, size) VALUES (?,?,?,?) AS new\n\t\t\t\tON DUPLICATE KEY UPDATE storageFileID=new.storageFileID, mtime=new.mtime, size=new.size\n\nParams:\nArray\n(\n    [0] => 9\n    [1] => 77\n    [2] => 1699724623000\n    [3] => 1008678\n)\n\n\n in /var/www/zotero/include/DB.inc.php:1303 (POST /users/1/items/3UYBIXN9/file) (a64a43cae6)

When you reinstall ZotPrime, ensure all the volumes are deleted. Run the following command, which removes all containers and deletes all associated volumes: docker compose down -v

In case the issue is caused by a file's format or content, please also share a sample file.
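
A clean reinstall could be scripted roughly as follows. This is a sketch under assumptions: `docker compose up` and `./bin/init.sh` are taken from the steps reported earlier in this thread, the `-d` flag is added here so that the init script can run afterwards, and `run` / `clean_reinstall` are hypothetical helper names. Setting `DRY_RUN=1` only prints the commands.

```shell
#!/bin/sh
# run CMD...: print the command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# clean_reinstall: remove containers and volumes, start the stack again,
# then run the init script. Stops at the first failing step.
clean_reinstall() {
    run docker compose down -v &&
    run docker compose up -d &&
    run ./bin/init.sh
}

# Example (not run automatically):
#   DRY_RUN=1; clean_reinstall
```

Note that `down -v` deletes the database and MinIO volumes, so anything already synced to the server is lost.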

The essential point here is this: before you try to implement it in your target production environment, first set it up in a testing environment without making any changes, and don't use a restricted network or one with policies that require TLS. Go with the VM setup, and build the client with the VM's IP address (VM with bridged network).

Attached is a 3-minute video showing the steps: deleting the client's data, deleting the server, creating a new server, initialization, and connecting the client.

https://github.com/uniuuu/zotprime/assets/26146469/f47fb5b2-ce36-42c2-afd7-6db928b86aaa

uniuuu commented 1 year ago

Hi @bradleyharden Good day. Please let me know if it helped to resolve the issue. Since there has been no update, I'll close this issue and convert it to a discussion. Feel free to post your reply in the discussion or reopen this issue if you find that the reported error is still not solved.