zotero / dataserver

Zotero Data Server

self hosting documentation #105

Open Hugo-Trentesaux opened 4 years ago

Hugo-Trentesaux commented 4 years ago

I found no issue dealing with "self hosting" and I think this process should be documented.

According to this post on the zotero forum, "Zotero makes the code available but doesn't provide any support for self-hosted servers."

This is something I'd like to dig into, so consider this a "tracking issue", not a "feature request".

douglasg14b commented 4 years ago

Figured much out on this end? I'd love to self-host my Zotero servers so I can sync large documents to them. I am not going to pay $10/m because I have a couple dozen large docs...

I also have multiple devices (Work & Personal) that I want to sync between, so I have one document repository for them.

Hugo-Trentesaux commented 4 years ago

I did not look into it. For me it's not a money problem but rather a philosophical one. I won't engage in a project which does not fully comply with the free software idea. It's not the same if the Zotero project wants to let people set up their own servers (1) or if they prefer to keep them within their ecosystem (2). I will only consider a paid plan if option 2 is valid.

dstillman commented 4 years ago

I'm not sure what you think the "free software idea" is, but the idea that there's some universally accepted definition of it that we're in violation of is absurd. Nearly everything we produce is open source, including the dataserver, which is AGPL-licensed. You are perfectly free to run your own version, and various labs and companies do so. We just don't currently have the time or resources to provide any support for doing so.

This isn't a documentation issue; it's a technical and resource one. Creating a version that's easy to self-host would be a completely different project from running a hosted service for millions of users on a sprawling AWS infrastructure with countless services, databases, database shards, caches, search clusters, Lambdas, etc. All the processes for deploying, debugging, upgrading, and monitoring it, not to mention managing compatibility across different versions of Zotero clients, would be completely different as well. We still do hope to make a more easily self-hostable version at some point, but for now our priority is maintaining the service for Zotero users.

Hugo-Trentesaux commented 4 years ago

I understand the point. Everything falls back to a resource allocation problem. That's why I wish funding could be directed towards certain features. And that's why this issue can remain open until someone allocates his time to fix it.

dstillman commented 4 years ago

Just to be totally clear, though, this isn't something that just needs time allocated once and then would be fixed. It would need to be a permanent, ongoing effort to keep a self-hostable version current. We try to document necessary changes in commit messages, with scripts for DB upgrade steps, notes about new dependencies, etc., but any sort of update to a self-hosted version would just require a totally different process from what we do internally.

It'd be relatively feasible to build a container-based version of the server environment — we've experimented with that in the past — but once there are users and existing data, keeping it working and current and compatible with Zotero clients is a much more difficult proposition.

edgimar commented 3 years ago

It seems that what Zotero needs is a stable, well-documented API for the data-server. Honestly, in order to do good development, this API is probably already documented somewhere. Unless I'm mistaken, however, this API doesn't appear to be publicly available (the APIs published here are only for read-only access to data on the server). Correction: the API does support read/write operations.

With the full API publicly available, then anyone can choose to implement a data-server however they wish, so long as it complies with the API.

dstillman commented 3 years ago

@edgimar:

> the APIs published here are only for read-only access to data on the server

Given that there are sections for “Write Requests”, “File Uploads”, and “Syncing”, I’m not sure what makes you think that.
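For readers following along, a write request against that API can be sketched roughly as below. This is only a minimal sketch: it builds the request without sending it, and the base URL `https://zotero.example.org`, the user ID, and the key are hypothetical placeholders. The point is that the base URL is a parameter, so the same code could target any server that implements the documented API.

```python
import json
import uuid
import urllib.request


def build_create_items_request(base_url, user_id, api_key, items):
    """Build (but do not send) a Zotero Web API v3 request that creates items."""
    url = f"{base_url.rstrip('/')}/users/{user_id}/items"
    body = json.dumps(items).encode("utf-8")
    headers = {
        "Zotero-API-Version": "3",
        "Zotero-API-Key": api_key,
        # Random 32-character token so a retried request is not applied twice.
        "Zotero-Write-Token": uuid.uuid4().hex,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_create_items_request(
    "https://zotero.example.org",  # hypothetical self-hosted server (assumption)
    12345,                         # placeholder user ID
    "PLACEHOLDER_KEY",             # placeholder API key
    [{"itemType": "book", "title": "Example"}],
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; that step is left out here because it needs a real server and key.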

edgimar commented 3 years ago

@dstillman, good question! 🙂 I didn't look at it closely enough - sorry for the confusion. It might be helpful to at least add a readme to this repo indicating that the server implements that API (and linking to it).

AntonOfTheWoods commented 3 years ago

> It'd be relatively feasible to build a container-based version of the server environment - we've experimented with that in the past - but once there are users and existing data, keeping it working and current and compatible with Zotero clients is a much more difficult proposition.

This is the key realisation people need to make. Unless you have highly competent internal SRE/DevOps resources, running a proper container-based infrastructure can be highly challenging. Any script-kiddie can spin up a container on Docker, but running and seamlessly upgrading complicated systems with 10s or even 100s of services on container orchestrators that are serving thousands/millions of users in a reliable manner is why good SREs get as much as good devs at Google & Co.

nickian commented 2 years ago

Just discovered this project and was wondering about self-hosting sync as well. Disappointed that I would have to rely on third-party servers. I will stick with Joplin. You should look at how they sync, where you can choose from several synchronization targets.

wanstr commented 1 year ago

I'm new to Zotero so any help is greatly appreciated. Now the issue I'm facing:

  1. I have a huge amount of references I would like to add
  2. I also want to attach the pdf files
  3. I prefer not to store the files with Zotero

So it seems with 1 and 2 I should use the web API, but then with 3 my options are limited to using my own WebDAV. The web API doesn't seem to be able to upload to WebDAV. Solution?

dstillman commented 1 year ago

@wanstr: Please post all questions to the Zotero Forums.

fbievan commented 1 year ago

> Just to be totally clear, though, this isn't something that just needs time allocated once and then would be fixed. It would need to be a permanent, ongoing effort to keep a self-hostable version current. We try to document necessary changes in commit messages, with scripts for DB upgrade steps, notes about new dependencies, etc., but any sort of update to a self-hosted version would just require a totally different process from what we do internally.
>
> It'd be relatively feasible to build a container-based version of the server environment - we've experimented with that in the past - but once there are users and existing data, keeping it working and current and compatible with Zotero clients is a much more difficult proposition.

I think the best solution here is to leave people to figure out how to host the 'dataserver' themselves (it's open source). Plus, uniuuu seems to keep an up-to-date repo here (with the hope of merging it into the main zotprime).

The first step is to document how to set your own sync server (in the app) and what needs to be changed there (but this isn't the repo for that).

webmind commented 10 months ago

Curious what the state is. Working with organizations with sensitive content and not always an internet connection, Zotero looks interesting, but having a self-hostable server would be nice. Sysadmin/devops skills aren't much of an issue, but even they need documentation.

fbievan commented 10 months ago

> Curious what the state is. Working with organizations with sensitive content and not always an internet connection, Zotero looks interesting, but having a self-hostable server would be nice. Sysadmin/devops skills aren't much of an issue, but even they need documentation.

Honestly, the main problem for me has not been getting a server running (I mentioned that repo above, that uses docker).

But for me, it is getting the program to use the server. While you can compile the program yourself and use it that way, I find that to be burdensome for deployment (for my personal use, at least, as I won't get updates, not in repos).
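To illustrate the "compile it yourself" step, here is a rough sketch of patching server URLs in a client config file before building. Everything specific in it is an assumption, not something confirmed in this thread: the key names `API_URL` and `STREAMING_URL`, the shape of the sample config text, and the `zotero.example.org` addresses are all placeholders, so check the client source you are actually building.

```python
import re


def override_config_urls(config_text, overrides):
    """Return config_text with the quoted string value of each given key replaced."""
    for key, new_value in overrides.items():
        # Match `KEY: '<anything>'` or `KEY: "<anything>"`, keeping the quote style.
        pattern = re.compile(r"(\b" + re.escape(key) + r"\s*:\s*)(['\"]).*?\2")
        config_text, count = pattern.subn(
            lambda m: f"{m.group(1)}{m.group(2)}{new_value}{m.group(2)}",
            config_text,
        )
        if count == 0:
            raise KeyError(f"{key} not found in config text")
    return config_text


# Assumed shape of the client config; key names are not confirmed by this thread.
sample = """var ZOTERO_CONFIG = {
    API_URL: 'https://api.zotero.org/',
    STREAMING_URL: 'wss://stream.zotero.org/',
};"""

patched = override_config_urls(sample, {
    "API_URL": "https://zotero.example.org/",            # hypothetical server
    "STREAMING_URL": "wss://zotero.example.org/stream/",  # hypothetical server
})
print(patched)
```

A script like this would have to be rerun on every client update, which is the deployment burden described above.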

flefevre commented 3 months ago

Has anyone tested the following project?

https://github.com/linuxserver/docker-zotero

It seems to be designed for it. But I didn't find any mention of how to change the Zotero client.

frigvid commented 3 months ago

Unless I am mistaken, docker-zotero is for hosting the web client of Zotero. It is not relevant to self-hosting the dataserver, methinks, since it presumably has the same issue as the desktop client, where you need to compile it yourself to point it to the right server. ZotPrime V2 is probably the closest we have right now.