Open TedBrookings opened 5 years ago
I asked your question to PAPI and here is the response:
This detail is not something that should be counted on in a containerized environment. That said: the /dev/disk/by-id/* system is simply a convenient alias. The underlying block storage doesn't change (eg, /dev/disk/by-id/google-local-disk is a symlink to a block device, in this case, /dev/sdb). So they should be able to continue monitoring if they want, it will just be harder to recover the mapping.
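To make the "convenient alias" point concrete, here is a minimal sketch of resolving such an alias to the block device it points at. It only helps where the by-id path is actually visible (per the original report, /dev/disk may be absent inside the container), and the function name resolve_alias is made up for illustration.

```shell
#!/usr/bin/env bash
# Resolve a device alias (a symlink) to the path it ultimately points to.
# Prints nothing and returns non-zero if the alias does not exist.
# NOTE: resolve_alias is a hypothetical helper, not part of any tool
# mentioned in this thread.
resolve_alias() {
    local alias_path=$1
    [ -e "$alias_path" ] || return 1
    readlink -f "$alias_path"
}

# Example (only meaningful where the alias is visible):
#   resolve_alias /dev/disk/by-id/google-local-disk
```

If the function returns non-zero, the alias is not visible from where the script runs, which is the situation described in this issue.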
I've run into the same issue while trying to measure disk I/O from monitoring_image. Have you found a solution?
Not really, but I have a work-around (if /sys/block/sdb/ is a directory and /dev/sdb is not listed as mounted in mtab, use /sys/block/sdb/):
function findBlockDevice() {
    MOUNT_POINT=$1
    FILESYSTEM=$(grep -E "$MOUNT_POINT\s" /proc/self/mounts \
                 | awk '{print $1}')
    DEVICE_NAME=$(basename "$FILESYSTEM")
    FS_IN_BLOCK=$(find -L /sys/block/ -mindepth 2 -maxdepth 2 -type d \
                  -name "$DEVICE_NAME")
    if [ -n "$FS_IN_BLOCK" ]; then
        # found path to the filesystem in the block devices. get the
        # block device as the parent dir
        dirname "$FS_IN_BLOCK"
    elif [ -d "/sys/block/$DEVICE_NAME" ]; then
        # the device is itself a block device
        echo "/sys/block/$DEVICE_NAME"
    else
        # couldn't find, possibly mounted by mapper.
        # look for block device that is just the name of the symlinked
        # original file. if not found, echo empty string (no device found)
        BLOCK_DEVICE=$(ls -l "$FILESYSTEM" 2>/dev/null \
                       | cut -d'>' -f2 \
                       | xargs basename 2>/dev/null \
                       || echo)
        if [[ -z "$BLOCK_DEVICE" ]]; then
            1>&2 echo "Unable to find block device for filesystem $FILESYSTEM."
            if [[ -d /sys/block/sdb ]] && ! grep -qE "^/dev/sdb" /etc/mtab; then
                1>&2 echo "Guessing present but unused sdb is the correct block device."
                echo "/sys/block/sdb"
            else
                1>&2 echo "Disk IO will not be monitored."
            fi
        else
            # symlink target found: report it in the same form as the
            # other branches (the snippet as posted never printed it)
            echo "/sys/block/$BLOCK_DEVICE"
        fi
    fi
}
I am not sure if this is a google VM problem, a docker problem, or a problem with how cromwell specifies volumes to docker; but I took their response to be "we don't care and won't fix it". Fortunately for me the work-around nearly always works for cromwell jobs.
@TedBrookings thanks, I was going to just use sdb/c/... (in the disk order for the task). When does it not work?
I'm using this in a fully automated setting, as part of this script: https://github.com/broadinstitute/dsp-scripts/blob/master/cromwell/methods/cromwell_monitoring_script.sh
So for me it won't work if the combination of google VM / docker / cromwell results in the disk not being mounted on /dev/sdb for any reason. This could happen if the user requests disks to be mounted in a specific place (https://cromwell.readthedocs.io/en/stable/RuntimeAttributes/), requests more than one disk but would prefer the second disk be monitored, or if cromwell starts using /dev/sdb for some other resource and the disks get pushed to /dev/sdc
I guess it would be more precise to say, "I'm not aware of this happening, but lots of people use this script and I have no idea if any of them would complain to me if it didn't work".
Interesting! I may have found another - deterministic - way, based on how it's done in gopsutil:
Find the st_dev device attribute for the mount point in the /proc/self/mountinfo file, which is "the most authoritative source to check your mounts" [1], and is always present in modern Linux kernels [2].
Per [3], st_dev:
Identifies the device containing the file. The st_ino and st_dev, taken together, uniquely identify the file. The st_dev value is not necessarily consistent across reboots or system crashes, however.
The format of mountinfo, according to [2]:
3.5 /proc/<pid>/mountinfo - Information about mounts
--------------------------------------------------------
This file contains lines of the form:
36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
(1)(2)(3)   (4)   (5)      (6)      (7)   (8) (9)   (10)         (11)
(1) mount ID: unique identifier of the mount (may be reused after umount)
(2) parent ID: ID of parent (or of self for the top of the mount tree)
(3) major:minor: value of st_dev for files on filesystem
(4) root: root of the mount within the filesystem
(5) mount point: mount point relative to the process's root
(6) mount options: per mount options
(7) optional fields: zero or more fields of the form "tag[:value]"
(8) separator: marks the end of the optional fields
(9) filesystem type: name of filesystem of the form "type[.subtype]"
(10) mount source: filesystem specific information or "none"
(11) super options: per super block options
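One practical consequence of this format: because field (7) can occur zero or more times, the filesystem type and mount source cannot be read from a fixed column. A minimal sketch that locates them relative to the "-" separator instead (the function name parse_mountinfo_line and the output layout are my own, chosen only for illustration):

```shell
#!/usr/bin/env bash
# Split one mountinfo line into the fields of interest.
# Optional fields (7) vary in number, so the filesystem type (9) and
# mount source (10) are found relative to the "-" separator (8),
# not by a fixed index.
# NOTE: parse_mountinfo_line is a hypothetical helper for illustration.
parse_mountinfo_line() {
    printf '%s\n' "$1" | awk '{
        sep = 0
        for (i = 7; i <= NF; i++) {
            if ($i == "-") { sep = i; break }
        }
        if (sep == 0) exit 1   # malformed line: no separator found
        printf "st_dev=%s mount_point=%s fstype=%s source=%s\n",
               $3, $5, $(sep + 1), $(sep + 2)
    }'
}

# Example:
#   parse_mountinfo_line "$(grep ' /cromwell_root ' /proc/self/mountinfo)"
```

Splitting on whitespace is safe here because the kernel writes spaces inside mount paths as octal escapes (\040).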
So for example, inside my task:
grep cromwell_root /proc/self/mountinfo
904 885 8:16 / /cromwell_root rw,relatime master:325 - ext4 /dev/disk/by-id/google-local-disk rw
8:16 here is st_dev, per
(3) major:minor: value of st_dev for files on filesystem
Now we look up major:minor in /proc/diskstats [4]:
The /proc/diskstats file displays the I/O statistics
of block devices. Each line contains the following 14
fields:
1 - major number
2 - minor number
3 - device name
So e.g.
awk '$1 == 8 && $2 == 16 {print $3}' /proc/diskstats
sdb
The same approach works for non-/cromwell_root mounts as well.
Obviously, we can fully automate this lookup in a script.
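A minimal sketch of what that automation could look like, chaining the two lookups described above. The function name find_device_name and the optional file arguments (handy for testing against saved copies of the proc files) are my own additions, not taken from gopsutil or the monitoring script.

```shell
#!/usr/bin/env bash
# Print the kernel device name (e.g. "sdb") backing a mount point.
# Usage: find_device_name MOUNT_POINT [MOUNTINFO_FILE] [DISKSTATS_FILE]
# NOTE: find_device_name and its optional file arguments are
# hypothetical, introduced here only for illustration.
find_device_name() {
    local mount_point=$1
    local mountinfo=${2:-/proc/self/mountinfo}
    local diskstats=${3:-/proc/diskstats}
    local st_dev

    # mountinfo: field 5 is the mount point, field 3 is major:minor.
    # If several entries share the mount point, the last one listed wins
    # (an assumption: later mounts shadow earlier ones).
    st_dev=$(awk -v mp="$mount_point" '$5 == mp {dev = $3} END {print dev}' \
             "$mountinfo")
    if [ -z "$st_dev" ]; then
        echo "No mount found at $mount_point" >&2
        return 1
    fi

    # diskstats: fields 1-3 are major number, minor number, device name.
    awk -v major="${st_dev%%:*}" -v minor="${st_dev##*:}" \
        '$1 == major && $2 == minor {print $3; found = 1}
         END {exit !found}' "$diskstats"
}

# Example:
#   find_device_name /cromwell_root    # printed "sdb" in the task above
```

The function returns non-zero when the mount has no entry in /proc/diskstats, which is what I would expect for mounts with no backing block device (overlay, tmpfs), so a caller can fall back to "Disk IO will not be monitored" in that case.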
[1] https://serverfault.com/a/581180/296112
[2] https://www.kernel.org/doc/Documentation/filesystems/proc.txt
[3] https://www.gnu.org/software/libc/manual/html_node/Attribute-Meanings.html
[4] https://www.kernel.org/doc/Documentation/ABI/testing/procfs-diskstats
Backend: I'm testing out PAPI v2 by running on the cromwell 34 and 36 methods servers.
Problem: inside the docker it looks like /cromwell_root is mounted on /dev/disk/by-id/google-local-disk (checking df -h, /proc/mounts, or /etc/mtab) but that device does not exist (in fact there is no /dev/disk directory).
Background: This task requests a persistent HDD and runs inside a docker. This problem does not exist on cromwell 30 (with jes backend). /cromwell_root is almost certainly actually mounted at /dev/sdb (that device exists, does not appear to be used anywhere, has the appropriate size as checked in /sys/block/sdb/size, and is typically what's listed as the filesystem in cromwell 30).
I know it's weird to even care about that, so to explain, my cromwell monitoring script looks at the block device corresponding to /cromwell_root in order to measure disk IO, which can potentially be a source of problems for some of the SV algorithms we're trying to debug/string together.
the .wdl file
Snips of relevant output from cromwell 36 (edited for brevity):
Whereas on cromwell 30