blevesearch / bleve

A modern text/numeric/geo-spatial/vector indexing library for go
Apache License 2.0

ZFS Support #1589

Open dl-lim opened 3 years ago

dl-lim commented 3 years ago

https://github.com/go-gitea/gitea/issues/15450

Hi, I was brought here by an error on my system. This issue may be larger than this repo alone, so I would appreciate your help in pinpointing which software layer is throwing the error.

mschoch commented 3 years ago

I was not aware of any limitation preventing the use of Bleve with ZFS.

So, first, I think go-gitea still uses v1 of Bleve, with its older upsidedown index format. This uses BoltDB storage, which uses OS locking primitives to ensure exclusive access by a single writer. I searched, but did not find other reports of problems with BoltDB on ZFS.

Second, in the linked ticket, you also mention issues with running ES on ZFS, however I could not find confirmation of known problems with ES or Lucene on ZFS either. In fact I found blog posts with tips on how to tune it, suggesting that it does work.

@alderson59 if you can provide more info about these issues, let me know. Otherwise, we probably just have to set something up and try it, but it's not a very high priority for us right now.

dl-lim commented 3 years ago

@mschoch Many thanks for taking a look at this.

For some context on the configuration: I have my Ubuntu VM set up to connect to ZFS storage via virtio-9p. I've had problems with several repositories using ES when it is installed on the external ZFS storage. I normally work around this by keeping the ES directories within the VM.

So, with go-gitea, I adopted the same temporary workaround, which works but is not ideal since VM storage is highly limited.

I cannot comment on go-gitea's version of Bleve, but I shall keep monitoring their updates.

I do next to nothing on my OS's configuration files, except for mounting the drives, so everything should be pretty default. Most of the configuration is done on the individual docker containers anyway, since that's where it's really needed.

I'm more than happy to provide more info, though I'm not sure what is needed. Hardware? OS version? Let me know, and I'll put it out there for posterity :)

codewinch commented 2 years ago

@alderson59 do you have multiple readers/writers (multiple containers) to your ZFS mount at the same time? It sounds like that could be the case from your other Gitea ticket, and that could definitely lead to corruption. Even if it's not multiple containers, are you 100% certain that another process on your host isn't messing with your filesystem while the docker container is also trying to do something with it?

What sync mode are you using on ZFS? ZFS is not fully POSIX compliant (see the warning at the top of https://zfsonlinux.org/manpages/0.7.13/man8/zfs.8.html ), so locks may not occur the way you expect.

sync = standard | always | disabled

Controls the behavior of synchronous requests (e.g. fsync, O_DSYNC).

standard is the POSIX specified behavior of ensuring all synchronous requests are written to stable storage and all devices are flushed to ensure data is not cached by device controllers (this is the default).

always causes every file system transaction to be written and flushed before its system call returns. This has a large performance penalty.

disabled disables synchronous requests. File system transactions are only committed to stable storage periodically. This option will give the highest performance. However, it is very dangerous as ZFS would be ignoring the synchronous transaction demands of applications such as databases or NFS. Administrators should only use this option when the risks are understood.

If you are experiencing the same issue on ES, then the issue is probably not with Bleve or ES, and you might want to close these tickets, since it's probably the container configuration or ZFS configuration, which might be more sensitive to locks.