ubuntu / zsys

ZSys daemon and client for zfs systems
GNU General Public License v3.0

Option to migrate USERDATA from pool #132

Closed mhalano closed 4 years ago

mhalano commented 4 years ago

**Describe the bug**
I have a peculiar (but not that peculiar) situation for which I couldn't find help with zsys. I have a small SSD and a big HDD. The idea is to move all my user data from rpool (created during installation on the SSD) to another pool (created after installation on the HDD).

**To Reproduce**
Steps to reproduce the behavior:

  1. Install Ubuntu with ZFS
  2. Create a pool for the HDD
  3. Try to simply set the new pool's mountpoint to /home (see the sketch below)
  4. Run into conflicts between the USERDATA layout and the HDD pool's location
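
For illustration, the naive attempt from step 3 might look like this (hpool and /dev/sdb are hypothetical names):

```sh
# Hypothetical: hpool is the new pool created on the HDD.
zpool create hpool /dev/sdb
# Mounting the pool at /home collides with the datasets zsys already
# mounts there from rpool/USERDATA/*.
zfs set mountpoint=/home hpool
```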

**Expected behavior**
There should be a way to migrate USERDATA from one pool to another.

For Ubuntu users, please run and copy the following:

  1. `ubuntu-bug zsys --save=/tmp/report`
  2. Paste the content of /tmp/report below:

The report: https://pastebin.com/nCszaNUa https://pastebin.com/QTsnuhDM




**Installed versions:**

```
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04 LTS"
zsysctl 0.4.5
zsysd   0.4.5
```

didrocks commented 4 years ago

Hey, I’m unsure this is something zsys should grow an API for. However, ZSys supports USERDATA being on a different pool, so as long as you do the migration yourself and you import the secondary pool, you are fine.

I suggest:

* create your pool2/USERDATA with the same dataset name as on the first pool

* ensure you have the same properties and **user properties** set on that dataset

* do a zfs send/recv between the 2 datasets; then, once ready, remove the original dataset

If this is fine, I'll close the bug.
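
For concreteness, a minimal sketch of that migration, assuming the secondary pool is named hpool and the existing user dataset is rpool/USERDATA/user1_abcd (both hypothetical names):

```sh
# Container dataset mirroring rpool/USERDATA (never mounted itself).
zfs create -o canmount=off -o mountpoint=/ hpool/USERDATA

# Snapshot the source, then replicate it; -R preserves snapshots and
# local properties, including zsys's user properties.
zfs snapshot -r rpool/USERDATA/user1_abcd@migrate
zfs send -R rpool/USERDATA/user1_abcd@migrate | zfs recv hpool/USERDATA/user1_abcd

# Check the zsys user properties made it across.
zfs get -r all hpool/USERDATA/user1_abcd | grep com.ubuntu.zsys

# Once everything checks out, drop the original.
zfs destroy -r rpool/USERDATA/user1_abcd
```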

mhalano commented 4 years ago

That's not so easy. The idea here is to have an option in `zsysctl userdata create` to use a pool other than rpool. I'm migrating my data as suggested, but if I create a new dataset for another user, rpool is used instead of my dedicated userdata pool.

didrocks commented 4 years ago

Absolutely not! We took that use case into account :) If you delete rpool/USERDATA, then the new users will be created under secondarypool/USERDATA!
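
A quick sketch of that behaviour, assuming the migration above went to a hypothetical hpool and user2 is a new user:

```sh
# With rpool/USERDATA gone, hpool/USERDATA is the only USERDATA
# container left, so zsys creates new user datasets there.
zfs destroy -r rpool/USERDATA
zsysctl userdata create user2 /home/user2
zfs list -r hpool/USERDATA   # the new user2_* dataset should show up here
```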

mhalano commented 4 years ago

Done. Didier, you rock :) Thank you for your help.

jknockaert commented 4 years ago

Just elaborating on this specific setup (so with three pools: bpool, rpool and say hpool for the USERDATA dataset). @didrocks Is there a predetermined order in which the datasets are mounted? Obviously the bpool datasets come first. But what's next? I'm asking this as I want to use native encryption. If I know which encrypted dataset is mounted first (in a deterministic way) I can store the encryption key of the other pool (rpool or hpool) in a file in the first mounted dataset. (Obviously I do not want to store encryption keys in the unencrypted bpool.) That way I would have to manually enter only one encryption key upon boot...

mhalano commented 4 years ago

That's a great question. I would like to encrypt my SSD (/) and my HDD (/home) with ZFS encryption in an easy way. For now I'm not using encryption because I couldn't find a solution to this problem.


didrocks commented 4 years ago

@didrocks Is there a predetermined order in which the datasets are mounted? Obviously the bpool datasets come first.

There is no pre-defined order, as we let systemd do the actual mount. Here is what happens:

* initramfs imports both bpool and rpool.

* The rpool "/" dataset is mounted and then pivot-root is called; this is thus when the system starts booting.

* The systemd zfs generator creates a .mount file for every mountpoint (/boot, all of /var, etc.) and systemd mounts them in parallel.
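
To see this on a running system, you can inspect the generated units and the cache files the generator reads (the unit name below is a hypothetical example):

```sh
# zfs-mount-generator turns each line of /etc/zfs/zfs-list.cache/<pool>
# into a systemd .mount unit at boot.
systemctl cat -- home-user1.mount    # hypothetical unit for /home/user1
cat /etc/zfs/zfs-list.cache/rpool    # the generator's input for rpool
```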

jknockaert commented 4 years ago

There is no pre-defined order, as we let systemd do the actual mount.

OK, so it basically means that when using encrypted datasets you will have to enter the key manually at least once per encrypted pool (with the pool root as encryption root), as you cannot rely on any encrypted dataset already being mounted. Or would it be possible to use the rpool "/" dataset (which is mounted when booting starts) to store encryption keys for any other pool with encrypted datasets?

Thanks for the clarification provided.

jknockaert commented 4 years ago

@didrocks I'm trying to fully understand the boot mechanism and how it interacts with zfs native encryption.

* initramfs imports both bpool and rpool.

* The rpool "/" dataset is mounted and then pivot-root is called; this is thus when the system starts booting.

* The systemd zfs generator creates a .mount file for every mountpoint (/boot, all of /var, etc.) and systemd mounts them in parallel.

So my understanding is that in the second step (mounting the rpool "/" dataset) the dataset with property com.ubuntu.zsys:bootfs=yes is mounted. In case that dataset is encrypted, a prompt is triggered to enter the encryption key. After entering the key, the encryption key is loaded, and when the rpool root is the encryptionroot (as is the usual case) all encrypted datasets on rpool using the same encryptionroot can be mounted without further user interaction. Mounting any other datasets (with a different encryptionroot) will happen elsewhere and will be triggered by systemd. That may work or fail, depending on how systemd handles this. Is my understanding so far correct?

As a result it seems advisable to store any other encryption keys in the rpool "/" dataset (and set keylocation to a file there for these other encryptionroots), as that dataset is guaranteed to be mounted at boot, and you won't depend on any other mechanism to properly handle the mounting of encrypted datasets (with keylocation set to prompt) for which the encryptionroot's key is not yet loaded. Is that correct?

didrocks commented 4 years ago

So my understanding is that in the second step (mounting the rpool "/" dataset) the dataset with property com.ubuntu.zsys:bootfs=yes is mounted.

Yes (all datasets with that property are mounted).

After entering the key, the encryption key is loaded, and when the rpool root is the encryptionroot (as is the usual case) all encrypted datasets on rpool using the same encryptionroot can be mounted without further user interaction. Mounting any other datasets (with a different encryptionroot) will happen elsewhere and will be triggered by systemd. That may work or fail, depending on how systemd handles this. Is my understanding so far correct?

Exactly.

As a result it seems advisable to store any other encryption keys in the rpool "/" dataset (and set keylocation to a file there for these other encryptionroots), as that dataset is guaranteed to be mounted at boot, and you won't depend on any other mechanism to properly handle the mounting of encrypted datasets (with keylocation set to prompt) for which the encryptionroot's key is not yet loaded. Is that correct?

Yes, that seems to me the most sensible setup (short of using a LUKS keystore) while staying very close to the upstream philosophy.
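
A minimal sketch of that setup, assuming the secondary pool is named hpool and the key file lives under /root on the encrypted rpool (names and paths are assumptions):

```sh
# Generate a 32-byte raw key on the encrypted rpool root filesystem.
dd if=/dev/urandom of=/root/.hpool.key bs=32 count=1
chmod 400 /root/.hpool.key

# Create the encrypted dataset with a file-based key: once rpool "/"
# (and thus /root) is mounted, no prompt is needed for this pool.
zfs create -o encryption=aes-256-gcm \
           -o keyformat=raw \
           -o keylocation=file:///root/.hpool.key \
           hpool/USERDATA

# At boot, after rpool "/" is up:
zfs load-key hpool/USERDATA && zfs mount -a
```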

gutleib commented 3 years ago

I suggest:

* create your pool2/USERDATA with the same dataset name as on the first pool

* ensure you have the same properties and **user properties** set on that dataset

* do a zfs send/recv between the 2 datasets; then, once ready, remove the original dataset

Hi! Did exactly that, but, as reported further up, systemd fails to mount the other datasets. Worse, Ubuntu silently creates the user folder, but owned by root. Since this is the only place this use case was discussed, I thought maybe you could point me to where to file a bug report... It's systemd, but the parts related to ZFS.
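
For what it's worth, this is where I'd start digging (unit and pool names are hypothetical):

```sh
journalctl -b -u 'home-*.mount'     # why systemd failed the /home mounts
cat /etc/zfs/zfs-list.cache/hpool   # is the generator's input up to date?
zfs get -r canmount,mountpoint,encryption hpool/USERDATA
```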

And regarding

Yes, that seems to me the most sensible setup (short of using a LUKS keystore) while staying very close to the upstream philosophy.

I strongly disagree, because without support for multiple keys on a single encrypted ZFS dataset this would make said hpool useless without rpool (as with RAID0).