Closed rkarlsba closed 6 years ago
Assume we know nothing about LVM and care intensely about any performance change - slowdowns, extra memory usage etc. - and then persuade us this change is beneficial.
I assume you, as in the developers and users of Raspbian, actually know what LVM is. It's been around for two decades or so and is widely used on all sorts of Linux machines.
These requests are read by many third parties. If you can't be bothered to write something...
Wikipedia does a reasonably good job of explaining the basics, so I won't waste space here quoting them: https://en.wikipedia.org/wiki/Logical_Volume_Manager_(Linux)
The short version is that LVM is a block layer that sits between the filesystems and physical hardware and allows arbitrary block mappings and various complex transformations to be applied to the data as it's transferred to the device, with the most notable features being:
The biggest potential benefit I can see to having it enabled by default on the Pi is that it would allow much easier conversion from a regular Raspbian image to an alternative boot setup using a USB device for the root filesystem (having LVM involved would mean this could be done online with zero downtime using the pvmove command to migrate the root volume to the flash drive).
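For anyone who hasn't done this before, here is a minimal sketch of what that online migration could look like. The volume group name (vg0) and both device paths are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set, so treat it as an illustration rather than a tested procedure.

```shell
#!/bin/sh
# Sketch: move an LVM volume group from the SD card to a USB drive while running.
# vg0, /dev/mmcblk0p2 and /dev/sda1 are hypothetical names, not taken from Raspbian.

DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_vg() {
    vg="$1"; old_pv="$2"; new_pv="$3"
    run pvcreate "$new_pv" &&          # initialise the USB partition as a physical volume
    run vgextend "$vg" "$new_pv" &&    # add it to the existing volume group
    run pvmove "$old_pv" "$new_pv" &&  # move all allocated extents off the SD card
    run vgreduce "$vg" "$old_pv"       # remove the SD card partition from the group
}

migrate_vg vg0 /dev/mmcblk0p2 /dev/sda1
```

Note that this only covers the root volume; the boot partition read by the Pi firmware is outside LVM and is not touched by any of these commands.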
As far as overhead goes, the dm-linear target (the backend used by LVM for regular logical volumes that just sit on a single device without any special stuff like thin provisioning) has very low overhead in most cases, though I can't comment on its exact performance implications on a Pi.
Personally however, I'm actually against adding it to the default images because:
Instead, I think a better option is to make it easier to create Raspbian images with arbitrary storage configurations, which would in theory allow people who actually need LVM (or BTRFS, or F2FS, or bcache, or some other configuration) to more easily create such images themselves.
Thank you @Ferroin for a very complete, clear and persuasive post.
Well, it doesn't matter much, since it can be hidden easily just like today's root expansion. But again, it can help users with a little know-how to do their things more easily.
Again, if the user doesn't know about filesystems, {s,}he won't mind if LVM's there
Can you point to one single incident where LVM failed on simple storage (that is, not RAID)? I've been using it for over a decade and haven't seen that yet.
But then, to create more images will probably be quite good. If someone is mad enough to use BTRFS, go on.
Oh - and btw, bcache, like flashcache and dm-cache, isn't a filesystem, it's a caching layer, and mostly superseded by lvmcache, which obviously involves lvm ;)
roy
Well, it doesn't matter much, since it can be hidden easily just like today's root expansion. But again, it can help users with a little know-how to do their things more easily.
No, you really can't 'hide' LVM. It's either there or it isn't. The root expansion thing is a one-shot that runs the first time the image boots and then deletes itself, so it's not any kind of persistent thing like LVM is. And, in fact, LVM would make that more complex (right now, it just resizes the partition and the filesystem, with LVM it would have to resize the partition, the physical volume, the logical volume, and the filesystem).
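To make the extra steps concrete, here is a rough sketch of a four-stage resize with an explicit failure path for each stage. The function name, the device paths and the grow_partition helper are all hypothetical; grow_partition merely stands in for the partition-table step the existing first-boot script already performs. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: root expansion when LVM sits between the partition and the filesystem.
# Every name and path here is an assumption made for illustration.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical stand-in for the existing partition-table resize step.
grow_partition() {
    echo "grow_partition: would grow partition $2 on $1 here" >&2
}

expand_root_lvm() {
    disk="$1"; partnum="$2"; pv="$3"; lv="$4"
    run grow_partition "$disk" "$partnum" || { echo "partition resize failed" >&2; return 1; }
    run pvresize "$pv"                    || { echo "pvresize failed" >&2; return 2; }
    run lvextend -l +100%FREE "$lv"       || { echo "lvextend failed" >&2; return 3; }
    run resize2fs "$lv"                   || { echo "resize2fs failed" >&2; return 4; }
}

expand_root_lvm /dev/mmcblk0 2 /dev/mmcblk0p2 /dev/vg0/root
```

The distinct return codes are only there to show that each of the two added stages (2 and 3) needs its own handling; a real script would also have to decide what state to leave the system in after a partial failure.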
Again, if the user doesn't know about filesystems, {s,}he won't mind if LVM's there
Until it breaks, at which point it's harder to fix.
Can you point to one single incident where LVM failed on simple storage (that is, not RAID)? I've been using it for over a decade and haven't seen that yet.
If you lose your volume group metadata or it becomes corrupted (which is absolutely possible with a single device), LVM breaks. It does at least support having multiple copies (though it defaults to one per PV), so it's a bit better in that respect than the old DOS partition tables we're stuck using for the SD card. Overall, this isn't very likely on systems using SSDs or hard drives because they're generally very reliable (though I have seen bad sectors in the first metadata block in a PV cause LVM to freak out), but there have been issues in the past with certain combinations of SD cards and power supplies with the Pi causing random data corruption on the SD card.
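As a hedged illustration of the mitigation mentioned above, the sketch below shows how a second on-disk metadata copy and an external metadata backup might be set up. The function names, volume group name and file paths are hypothetical, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: guarding against lost or corrupted volume group metadata.
# vg0, /dev/mmcblk0p2 and /boot/vg0-metadata.conf are made-up example values.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a PV that keeps two metadata copies instead of the default one.
create_pv_two_copies() {
    run pvcreate --pvmetadatacopies 2 "$1"
}

# Write the volume group's metadata to a file stored outside the volume group.
backup_vg_metadata() {
    run vgcfgbackup -f "$2" "$1"
}

# Restore the metadata from such a file.
restore_vg_metadata() {
    run vgcfgrestore -f "$2" "$1"
}

create_pv_two_copies /dev/mmcblk0p2
backup_vg_metadata vg0 /boot/vg0-metadata.conf
restore_vg_metadata vg0 /boot/vg0-metadata.conf
```

Neither measure helps if the card corrupts the data blocks themselves; they only cover the metadata failure described above.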
But then, to create more images will probably be quite good. If someone is mad enough to use BTRFS, go on.
I wouldn't say 'mad' so much as 'doesn't care about performance'. BTRFS runs perfectly fine on a Pi, it's just slow (although it's not too bad if you're not writing much data to it, and most of the problem is how slow storage on the Pi is to begin with).
Oh - and btw, bcache, like flashcache and dm-cache, isn't a filesystem, it's a caching layer, and mostly superseded by lvmcache, which obviously involves lvm ;)
I didn't mean using bcache as a filesystem, I just meant having it in the storage layer. FWIW though, bcache almost is a filesystem internally, and there has been talk in the past about possibly adding a VFS interface to it.
Anyway - I don't know why some people hate LVM - it just works and has done so for years, and it really helps a lot for those of us who want to separate root and data without fiddling around with partition tables designed three decades ago. LVM adds flexibility, not complexity. Try to make a newbie resize a partition table without destroying his or her data.
I'm not saying I hate it. In fact, I use it on almost all of my systems, largely because of the flexibility it offers (I use BTRFS on top of it, so together I can literally reprovision the entire system online with zero downtime). The small number of systems I don't use it on are all cases where it just doesn't make sense to use LVM because I'm more likely to need to rebuild the system from scratch than change partition sizes (altogether, it's a half dozen minimalistic VM's, two VPS nodes, and a handful of Pi's that are all being used for IoT applications)
What I am saying is that I don't think it's the best idea to just have it by default as part of the system image.
As far as it supposedly not adding complexity, that's dependent on where you look. It does greatly simplify the process of reprovisioning a system, but only in very specific cases (namely, you aren't changing the size of any of the physical volumes at all, and you don't need to handle certain perfectly reasonable operations like reducing the size of a thin storage pool). It does however add complexity in a number of ways:
sfdisk to adjust the partition table, and then a call to resize2fs to resize the filesystem. With LVM, that would need to also include a pvresize command and an lvresize command. On top of that, the script has to handle the possibility of LVM not being involved (adding further complexity).
While I think LVM is an awesome tool, which I have been using for years on servers, I don't think it's good for the Pi. Just my 5c.
I think what we would need to consider, notwithstanding any technical pros and cons as already discussed, prior to adding this is:
I'm surprised the hordes of people clamouring for LVM support haven't pooled their resources and created a build with LVM enabled.
Well, I'm not a developer, I'm just suggesting things to try to help things get better.
Regarding the resize script, I'm not worried as much about the overhead (though it would make the resize take longer), I'm worried about there being two more places things can go wrong that need to have errors sanely handled. And yes, I know it's 'reliable', but that doesn't mean you shouldn't have proper error handling.
As far as just general performance overhead, it's small enough for simple linear mappings on at least x86 that you need to use the low-level kernel tracing to measure it. IIRC, it ends up being a couple function calls, a table lookup (to figure out the exact mapping for the required blocks) and some basic math. The memory overhead is likely to be the bigger issue (even linear mappings add measurable memory overhead), though that's harder to quantify exactly (because it's a lot more dependent on the exact layout of things on-disk).
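The "table lookup and some basic math" can be illustrated with a toy model. The sketch below takes a table in the same shape that dmsetup table prints for linear targets (start, length, "linear", device, offset, all in 512-byte sectors) and maps a logical sector to a device and physical sector. The table contents are invented, and the simple scan here is only a model of the idea, not how the kernel implements its lookup.

```shell
#!/bin/sh
# Toy model of a dm-linear remap: find the table entry covering the sector,
# then subtract the entry's start and add its offset on the backing device.

map_sector() {
    table="$1"; sector="$2"
    while read -r start length target device offset; do
        [ "$target" = "linear" ] || continue
        if [ "$sector" -ge "$start" ] && [ "$sector" -lt $((start + length)) ]; then
            echo "$device $((sector - start + offset))"
            return 0
        fi
    done <<EOF
$table
EOF
    return 1   # sector lies outside every mapped range
}

# Invented two-segment table: one segment on the SD card, one on a USB drive.
table='0 8192 linear 179:2 2048
8192 4096 linear 8:1 2048'

map_sector "$table" 100     # prints: 179:2 2148
map_sector "$table" 9000    # prints: 8:1 2856
```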
pvresize /dev/blah
lvresize -l +100%FREE /dev/blah
Those two will be finished in times measured in milliseconds, two digits at most.
As for overhead, it didn't have much impact back when we had pentium 3 processors clocked at 500MHz. Yes, you can measure it, but then, you can measure a lot with a microscope.
I wonder why you are so afraid of using modern things like LVM. It's not a new filesystem, like btrfs, which is rather unstable in certain circumstances - it's a volume manager, and it's dead stable.
Have you considered a different distribution, one not aimed at education (hence the emphasis on simplicity, or at least minimising avoidable complexity)? Ubuntu Mate, openSUSE, etc. One of them must have LVM support.
To avoid anybody wasting time measuring performance impact and all that, for the sake of argument, let's grant that it's likely to be negligible and that there are some advanced use cases where LVM is more convenient. At this point, I still have no intention of switching the Raspbian builds to LVM.
There is absolutely no benefit to the target user base and great inconvenience to everybody who has gotten used to MBR. This would affect a ton of tutorials, books and forum posts. The reasons to switch have to be clear and to the benefit of the majority of the target user base. Even small changes result in a lot of questions on the forum and accusations that things are changing for the sake of changing.
Instead, I would propose you write a script that takes an existing image and creates a copy based on LVM. If it turns out that that script becomes widely used, then we can revisit it.
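A very rough sketch of the core of such a script might look like the following. It assumes both images are already attached as loop devices with their partitions visible, and every name in it (the function, the loop devices, the volume group pivg, the mount points) is hypothetical. It deliberately leaves out the parts that are hardest to get right, such as updating fstab and the kernel command line and providing an initramfs that can activate the volume group at boot. Commands are printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: copy an existing root filesystem into a logical volume in a new image.
# All device names, the VG name and the mount points are illustrative assumptions.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

copy_root_to_lvm() {
    src_part="$1"   # existing root partition of the source image
    dst_part="$2"   # empty partition in the new image
    vg="$3"         # name for the new volume group
    src_mnt=/mnt/lvm-src
    dst_mnt=/mnt/lvm-dst

    run pvcreate "$dst_part" &&
    run vgcreate "$vg" "$dst_part" &&
    run lvcreate -l 100%FREE -n root "$vg" &&
    run mkfs.ext4 "/dev/$vg/root" &&
    run mkdir -p "$src_mnt" "$dst_mnt" &&
    run mount -o ro "$src_part" "$src_mnt" &&
    run mount "/dev/$vg/root" "$dst_mnt" &&
    run rsync -aAXH "$src_mnt/" "$dst_mnt/" &&   # preserve ACLs, xattrs and hard links
    run umount "$src_mnt" &&
    run umount "$dst_mnt"
}

copy_root_to_lvm /dev/loop0p2 /dev/loop1p2 pivg
```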
@rkarlsba Have you been intentionally ignoring the actual content of my comments?
pvresize /dev/blah
lvresize -l +100%FREE /dev/blah
Those two will be finished in times measured in milliseconds, two digits at most.
I already said I wasn't worried about the overhead of the command itself running:
Regarding the resize script, I'm not worried as much about the overhead (though it would make the resize take longer), I'm worried about there being two more places things can go wrong that need to have errors sanely handled. And yes, I know it's 'reliable', but that doesn't mean you shouldn't have proper error handling.
And then this:
As for overhead, it didn't have much impact back when we had pentium 3 processors clocked at 500MHz. Yes, you can measure it, but then, you can measure a lot with a microscope.
Just reiterated what I said here about performance impact:
As far as just general performance overhead, it's small enough for simple linear mappings on at least x86 that you need to use the low-level kernel tracing to measure it. IIRC, it ends up being a couple function calls, a table lookup (to figure out the exact mapping for the required blocks) and some basic math.
And completely ignored my (perfectly valid) comment later in the same paragraph about runtime memory usage (and yes, I know it's at most double digit kB, but that is still significant when you have only one GB of RAM).
I wonder why you are so afraid of using modern things like LVM. It's not a new filesystem, like btrfs, which is rather unstable in certain circumstances - it's a volume manager, and it's dead stable.
Which directly contradicts a comment I made much earlier:
I'm not saying I hate it. In fact, I use it on almost all of my systems, largely because of the flexibility it offers (I use BTRFS on top of it, so together I can literally reprovision the entire system online with zero downtime). The small number of systems I don't use it on are all cases where it just doesn't make sense to use LVM because I'm more likely to need to rebuild the system from scratch than change partition sizes (altogether, it's a half dozen minimalistic VM's, two VPS nodes, and a handful of Pi's that are all being used for IoT applications).
Now, I will admit I may not have made my argument quite as plain as I could have, but most of it is pretty much the same as the argument @XECDesign just made for not switching things. It doesn't benefit the intended user base for Raspbian while having a significant and likely negative impact on them.
"It doesn't benefit the intended user base for Raspbian while having a significant and likely negative impact on them." It benefits the ones that know Linux, and the ones to come to learn more, and it does not have a "significant and likely negative impact" on the other users. It's rock stable, as you may know if you're using it.
Just read through all of this issue again. I think it is agreed that it is
What hasn't been shown is the amount of impact and consequent work required to update all the documentation, and as the person responsible for that, with limited time, this request borders on change for change's sake. The work required to implement it pales into insignificance compared to that required elsewhere.
So, for the moment, it seems unlikely we will be going down this route. So closing.
If Raspbian is meant to be a stepping stone towards the real world of Linux (and high-grade Unix) in true business production and commercial must-not-fail, cannot-fail environments - i.e. the real world... then surely Raspbian has to catch up and deploy with LVM2 as standard. It's just one layer of abstraction. Pretty much any real install of Linux uses a decent volume manager of some sort.
@sdo101 You come from an enterprise and/or Fedora background, don't you? I hate to burst your bubble, but there's a whole lot of use cases that don't need or even want volume management. Not counting Android (yes, it is Linux, it's just a different userspace than you're used to), you've still got IoT devices (it's literally just overhead there, and any overhead is significant when you're running on a dinky little Cortex-M4 powered by a small battery), more conventional purpose-built devices (streaming devices, digital signage, etc., all cases where you have no expectation of ever reprovisioning), containers (they shouldn't be touching block storage directly at all, let alone doing volume management), properly handled virtual machines (you should be doing the volume management on the host system, not in the guest, and definitely not both places), industrial control systems (again, tight embedded, any overhead is bad, plus one extra potential point of failure), systems for classic technophobe users (who will get exactly zero practical benefit from it, just like they get exactly zero practical benefit from Windows Storage Spaces), and a whole slew of other cases beyond that.
Realistically, there are far more Linux systems out there that do not use volume management than systems that do.
You should reconsider using LVM by default. LVM is the better solution because it supports rollback through snapshots. Rollback is particularly useful when experimenting. If LVM is not the default, users will learn about an inferior solution to volume management.
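For readers who haven't used this feature, here is a minimal sketch of the snapshot and rollback workflow being referred to. The function names, the logical volume vg0/root, the snapshot name and the 1G size are all made-up example values, the volume group is assumed to have free extents available, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: take a snapshot before experimenting, then roll back or discard it.
# vg0/root, before-experiment and 1G are hypothetical example values.

DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# snapshot_create ORIGIN_LV SNAPSHOT_NAME SIZE
snapshot_create() {
    run lvcreate --snapshot --size "$3" --name "$2" "$1"
}

# Roll the origin back to the snapshot's state by merging the snapshot into it.
snapshot_rollback() {
    run lvconvert --merge "$1"
}

# Keep the current state and drop the snapshot.
snapshot_discard() {
    run lvremove -y "$1"
}

snapshot_create vg0/root before-experiment 1G
snapshot_rollback vg0/before-experiment
```

When the origin is a mounted root filesystem, the merge cannot start while it is in use and is deferred until the volume is next activated, which in practice means a reboot.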
You should reconsider using LVM by default. LVM is the better solution because it supports rollback through snapshots. Rollback is particularly useful when experimenting. If LVM is not the default, users will learn about an inferior solution to volume management.
I fully agree. Using filesystems directly on partitions is old-school these days. LVM is well documented and has minimal overhead, so low it's hardly possible to measure. It is, as mentioned, a well-known solution that many distros rely on as the default (such as RHEL and CentOS). Understanding LVM will only help beginners, as it offers a lot of possibilities not available if using filesystems directly on partitions or disks.
Hi all
It would be nice to have the standard install on some future version (tomorrow? ;) ) on LVM by default instead of partitions. It would allow the user/admin more flexibility for the system, for obvious reasons. The overhead added is negligible anyway. Red Hat/CentOS has been using LVM by default for a decade or so, and it works well with a Pi. Please consider making this change.
roy