Closed sscargal closed 1 year ago
That's a good idea, and should be relatively simple to implement. 👍
Coming from an operations perspective, can we also include shrinking pmem files? I can imagine it's harder, as we'd need to move the live data back within the smaller target size, like a defrag.
Another feature/option might be to allow adding (and removing) a pool file to/from an existing poolset.
(Edit: dynamic adding is supported, based on the man page.)
Shrink might be better as its own FEAT, as I agree the ability to autogrow is a nice feature.
Is there also a way to get the "current utilisation" of a pool from a function call within libpmemobj while running?
We might implement pool shrinking after we implement pmem/pmdk#4187. Right now it's just not realistic. And yes, you should be able to add parts to an existing poolset.
How would you define "current utilization"? % of occupied space?
In general yes. However with fragmentation it might be misleading too, so it can only be taken as an indicator.
E.g., if someone looks at their pool and it reports 10% free of a 100GB pool, they might expect 10GB of contiguous free space. Then a 5GB allocation fails because the largest contiguous space is only 4GB. Or they are confused about why an autogrow was triggered when it seemed like there was sufficient space.
Another stat might be "largest contiguous space"... it might be hard or expensive to track though?
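To make the gap between "total free" and "largest contiguous free" concrete, here is a small shell sketch. The function name free_stats and the extent sizes are hypothetical and only chosen to match the numbers in the example above; it does not query a real pool.

```shell
# Hypothetical helper: given the sizes (in GB) of the free extents in a pool,
# print the total free space followed by the largest contiguous extent.
free_stats() {
    total=0
    largest=0
    for extent in "$@"; do
        total=$((total + extent))
        if [ "$extent" -gt "$largest" ]; then
            largest=$extent
        fi
    done
    echo "$total $largest"
}

# 10GB free in a 100GB pool, split into fragments of 4, 3, 2 and 1 GB.
free_stats 4 3 2 1
```

This prints "10 4": the pool reports 10GB free, yet a 5GB allocation cannot be satisfied because no single extent is larger than 4GB.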
Ideally we could get the info from "pmempool info" through the API from within an app with the pool open. I understand pmempool info only works on offline pools at present. However, in future an app may want to provide an SNMP interface and/or send traps based on info from those stats.
Alternatively, could pmempool info (and even check with no repair) be adapted to run in read-only mode against an open pool? That way monitoring tools (Nagios, SCOM, SolarWinds etc) could write an agent for any pool.
I've been very reluctant to introduce any generic interfaces around space utilization for the reasons you list. Right now there's an API to retrieve the total size of allocated objects, but that might not be very useful.
Tracking largest free contiguous space - we could probably implement a rough estimate, but tracking this accurately would negatively impact scalability and overall performance. One statistic I can reasonably and efficiently expose is number of free/used/run chunks in the pool. This can be used to approximately deduce utilization for the pool. We could also track allocated/freed objects at the allocation class level.
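As a rough illustration of how chunk counts could be turned into a utilization figure, here is a shell sketch. The function name chunk_utilization and the counts are hypothetical, and no such statistic is read from a real pool here. It assumes used chunks are fully occupied while run chunks are only partially occupied, so it reports a lower and an upper bound rather than a single number.

```shell
# Hypothetical helper: estimate pool utilization (in percent) from chunk counts.
# Arguments: number of free, used and run chunks.
# Assumption: used chunks are fully occupied and run chunks are partially
# occupied, so the output is "<lower bound> <upper bound>".
chunk_utilization() {
    free=$1
    used=$2
    run=$3
    total=$((free + used + run))
    if [ "$total" -eq 0 ]; then
        echo "0 0"
        return
    fi
    echo "$(( used * 100 / total )) $(( (used + run) * 100 / total ))"
}

# 50 free, 30 used and 20 run chunks.
chunk_utilization 50 30 20
```

With these made-up counts the estimate is somewhere between 30% and 50%, which is why such a figure can only serve as an indicator.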
Within this feature enhancement, we should support the AUTO value for size within the poolset file when using FSDAX. Currently, directories support uses the
From poolset(5):
The size argument of a part in a directory poolset becomes the size of the address space reservation required for the pool. In other words, the size argument is the maximum theoretical size of the mapping. This value can be freely increased between instances of the application, but decreasing it below the real required space will result in an error when attempting to open the pool.
AUTO works for devdax only.
Pools created on Device DAX have additional options and restrictions:
The size may be set to “AUTO”, in which case the size of the device will be automatically resolved at pool creation time.
In other words, the following myautogrowpool.set configuration fails:
PMEMPOOLSET
OPTION SINGLEHDR
AUTO /pmemfs0/
# pmempool create --layout="mylayout" obj myautogrowpool.set
error: 'myautogrowpool.set' -- directory based pools are not supported for poolsets with headers (without SINGLEHDR option)
error: creating pool file failed
But this works:
PMEMPOOLSET
OPTION SINGLEHDR
10GiB /pmemfs0/
# pmempool create --layout="mylayout" obj myautogrowpool.set
#
# ls -lh /pmemfs0/*.pmem
-rw-rw-r--. 1 root root 8.0M Aug 5 05:08 /pmemfs0/000000.pmem
So the proposed -M, --maxsize <size> option should default to AUTO if no value is provided and we can determine the available capacity within the FSDAX filesystem at the time of pool creation. If the file system fills up, we'll return ENOSPC when trying to grow.
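Resolving AUTO for a directory would require knowing how much space the filesystem has available at pool creation time. A minimal sketch of that lookup from the shell, using POSIX df, is shown below. The function name avail_bytes is hypothetical, and this is not how pmempool resolves sizes today.

```shell
# Hypothetical helper: print the space available (in bytes) to unprivileged
# users on the filesystem that contains the given directory.
avail_bytes() {
    # df -P guarantees one output line per filesystem; -k reports 1024-byte units.
    kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    echo $((kib * 1024))
}

# Example: query the filesystem holding the current directory.
avail_bytes .
```

The value is only a snapshot. Other writers can consume the space afterwards, so growing the pool can still fail with ENOSPC as described above.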
@pbalcer How is this feature supposed to work?
As far as I know, autogrow is only supported for poolset files. Should we automatically create a poolset in the given directory?
If yes, how will I open this pool? Should I use the directory path as my pool, or the poolset file created by pmempool (and how will a user find it)?
Not in the directory, but where the user specified. I think it should look like this:
pmempool create [<options>] [<type>] [<bsize>] <file>
Available options:
-a, --autogrow [directories ...]
Create an auto growing poolset
where the file is the poolset, so:
./pmempool create obj --autogrow /mnt/pmem --size=1GiB --maxsize=10GiB poolset.file
pmempool doesn't create poolset files by design. Autogrowing poolsets are no different than other types of poolsets, so I don't see why we should implement this particular feature, but leave other types out.
So the fundamental question is - what problem are we trying to solve here? The number of characters you have to type to create autogrowing poolset by hand would be similar to the number of characters needed to use this new feature, so this can't be just that...
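For comparison, creating the directory-based poolset file by hand can be sketched as a short shell function. The function name make_dir_poolset is hypothetical; the file contents mirror the working 10GiB example earlier in this thread.

```shell
# Hypothetical helper: write a single-directory, auto-growing poolset file.
# Arguments: poolset file path, maximum size (address space reservation),
# and the directory that will hold the pool parts.
make_dir_poolset() {
    printf 'PMEMPOOLSET\nOPTION SINGLEHDR\n%s %s\n' "$2" "$3" > "$1"
}

# Reproduce the working example from this thread.
make_dir_poolset myautogrowpool.set 10GiB /pmemfs0/
cat myautogrowpool.set
```

The resulting file can then be passed to pmempool create --layout="mylayout" obj myautogrowpool.set, as in the earlier working example.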
I think the problem is discoverability of this feature. Most people probably skip over the directories section of the poolset man page. Maybe we should add a command to create a poolset?
This improvement is not considered vital at the moment. So, we do not have the resources to fulfil your request. Sorry.
FEAT: 'pmempool create' should support creating auto-growing poolsets
Rationale
This feature request is different to the pmempool resize feature pmem/pmdk#4170 in that this one allows the user to create a poolset from the start that auto grows on demand to the limit of available space within the filesystem. The pmempool resize feature allows users to grow an existing single pool.
At creation time, it is not always known how large a pool should be. The amount of data plus space to grow is a good starting point. Poolsets support the ability to dynamically grow (in directory mode) by adding small 128MB pools to an existing poolset. This requires manual administration to create the initial DIRECTORY based poolset. Currently, pmempool create does not support this.
Description
It would be nice if pmempool create allowed the user to specify a base directory, initial size, optional max size, and growth chunk size.
From poolset(5) - http://pmem.io/pmdk/manpages/linux/master/poolset/poolset.5
API Changes
pmempool needs to support files and directories as input arguments. Currently it assumes just a file.
Internally, no API changes should be needed. The feature is already integrated into PMDK; we just need a user interface to set it up. The only caveat may be if we want to support multiple directories or file systems to store the poolset parts in case one fills up.
Implementation details
The proposal would provide the following user command options and extend the use of existing ones (--size and --maxsize)
If we wanted to support concatenating or striping auto growing pools across multiple file systems, we should also allow this syntax: