openzfs / openzfs-docs

OpenZFS Documentation
https://openzfs.github.io/openzfs-docs/

Lustre with ZFS #403

Closed Sak0808 closed 1 year ago

Sak0808 commented 1 year ago

Hi

We are using ZFS as the backend file system for Lustre, and we rely on zpools to handle storage node failure. But when one node fails, every process running on the client stops and the system hangs. For reference, here is our setup and the steps we follow.

We have shared storage in which all SSDs are exported over the network. Storage IP: 192.168.10.141

Mounting MDT:
zpool create -f tank1 /dev/nvme0n1 /dev/nvme1n1
zfs create -V 500gb tank1/vol
mkfs.lustre --reformat --mdt --mgs --backfstype=zfs --fsname=lustre --mgsnode=192.168.10.141 --index=0 tank1/zd0
mount -t lustre tank1/zd0 /mnt/mdt

Mounting OST:
zpool create -f tank2 /dev/nvme2n1 /dev/nvme3n1
zfs create -V 2600gb tank2/vol
mkfs.lustre --reformat --ost --backfstype=zfs --fsname=lustre --mgsnode=192.168.10.141 --index=0 tank2/zd16
mount -t lustre tank2/zd16 /mnt/ost/

Mounting client:
mount -t lustre 192.168.10.141:/lustre /mnt

gmelikov commented 1 year ago

Next time, please use a more appropriate channel for questions: https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html or https://github.com/openzfs/zfs/discussions

As for your question: ZFS itself won't give you any node failover functionality. Failover has to come from a separate high-availability layer. Please see https://wiki.lustre.org/Creating_Pacemaker_Resources_for_Lustre_Storage_Services and https://github.com/ewwhite/zfs-ha/wiki
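
To make the point concrete, here is a rough dry-run sketch of the steps a standby server would have to perform by hand to take over the OST. It only prints the commands and executes nothing. It assumes things not stated in the setup above: that a second server can see the same NVMe devices, and that the failed node has already been fenced so it cannot touch the pool. The pool, target, and mount point names are taken from the OST steps in the question; the `failover_plan` function name is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the commands a standby server would run to take over
# a Lustre target whose ZFS pool lives on shared storage.
# Nothing is executed here; review each command before running it by hand.

POOL="tank2"            # pool name from the OST steps in the question
TARGET="tank2/zd16"     # Lustre target from the OST steps in the question
MOUNTPOINT="/mnt/ost"   # mount point from the OST steps in the question

failover_plan() {
    # 1. Import the pool on the standby node. -f is needed when the failed
    #    node could not export the pool cleanly. ASSUMPTION: the failed node
    #    is fenced, otherwise two hosts may write to the same pool.
    echo "zpool import -f -o cachefile=none $POOL"
    # 2. Start the Lustre service by mounting the target on the standby node.
    echo "mount -t lustre $TARGET $MOUNTPOINT"
}

failover_plan
```

A cluster manager such as Pacemaker automates these steps together with the fencing. Clients will most likely only reconnect to the standby server if the target also lists its NID, for example through the `--servicenode` option of `mkfs.lustre`, which the commands in the question do not set.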

I'll close this issue, but feel free to continue the discussion in the appropriate channels.