r-lib / ps

R package to query, list, manipulate system processes
https://ps.r-lib.org/

ps_fs_info() slow on an old RHEL7 machine #175

Closed wlandau closed 1 week ago

wlandau commented 1 week ago

On an old RHEL 7 machine, I am noticing that ps_fs_info() is slow. Do you know of a workaround that would make it nearly instantaneous on machines like this?

> system.time(info <- ps_fs_info(fs::file_create(tempfile())))
   user  system elapsed 
  0.061   0.008  30.269 
> info
                              path mount_point                     name type
1 /tmp/Rtmpnr3NbO/file7fcc3e829973        /tmp /dev/mapper/rootvg-tmplv  xfs
  block_size transfer_block_size total_data_blocks free_blocks
1       4096                4096          26211840    26185335
  free_blocks_non_superuser total_nodes free_nodes
1                  26185335    52428800   52426684
                              id owner  type_code subtype_code MANDLOCK NOATIME
1 02, fd, 00, 00, 00, 00, 00, 00    NA 1481003842           NA    FALSE   FALSE
  NODEV NODIRATIME NOEXEC NOSUID RDONLY RELATIME SYNCHRONOUS NOSYMFOLLOW
1 FALSE      FALSE  FALSE  FALSE  FALSE     TRUE       FALSE       FALSE

> system.time(info <- ps_fs_info("DESCRIPTION"))
   user  system elapsed 
  0.003   0.001  72.807 
> info
         path mount_point                 name type block_size
1 DESCRIPTION     /lrlhps nfs-ssd-1fs:/ifs/lrl  nfs     524288
  transfer_block_size total_data_blocks free_blocks free_blocks_non_superuser
1              524288        9279933319   766052930                 395218095
   total_nodes  free_nodes                             id owner type_code
1 3.330043e+12 4.35226e+11 00, 00, 00, 00, 00, 00, 00, 00    NA     26985
  subtype_code MANDLOCK NOATIME NODEV NODIRATIME NOEXEC NOSUID RDONLY RELATIME
1           NA    FALSE   FALSE FALSE      FALSE  FALSE   TRUE  FALSE     TRUE
  SYNCHRONOUS NOSYMFOLLOW
1       FALSE       FALSE

gaborcsardi commented 1 week ago

The problem is probably NFS. How fast is this?

ps::ps_disk_partitions()

wlandau commented 1 week ago

A couple milliseconds on NFS:

> microbenchmark::microbenchmark(ps::ps_disk_partitions())
Unit: milliseconds
                     expr      min       lq     mean   median       uq      max
 ps::ps_disk_partitions() 1.857288 1.910587 2.009274 1.938773 1.977369 3.923788
 neval
   100

gaborcsardi commented 1 week ago

How about ps::ps_disk_usage()?

wlandau commented 1 week ago

Same thing: ps::ps_disk_usage() takes just under 2 ms on my company's NFS RHEL 7 setup.

gaborcsardi commented 1 week ago

After #176, you can call ps_fs_mount_point() to get the mount point of any existing file or directory, and then match it against the mountpoint column of ps_disk_partitions() to get the file system type. This should be much faster.

I can speed up ps_fs_info() a bit, but it is never going to be as fast as ps_fs_mount_point(). It is always going to be slow on (some?) NFS systems, I am afraid.
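
A minimal sketch of the lookup described above, not the code from #176 or from targets. The helper name fs_type_by_mount() is hypothetical, and the small partitions table is a stand-in built from the two mount points shown earlier in this thread. Passing all = TRUE to ps_disk_partitions() is an assumption, made so that network file systems such as NFS are not filtered out.

```r
# Hypothetical helper: look up the file system type for each mount point.
# `partitions` is expected to have `mountpoint` and `fstype` columns,
# like the data frame returned by ps::ps_disk_partitions().
fs_type_by_mount <- function(mount_points, partitions) {
  # match() returns the first matching row, or NA if there is none.
  partitions$fstype[match(mount_points, partitions$mountpoint)]
}

# Stand-in table using the two mount points reported in this thread.
partitions <- data.frame(
  mountpoint = c("/tmp", "/lrlhps"),
  fstype = c("xfs", "nfs"),
  stringsAsFactors = FALSE
)
fs_type_by_mount(c("/lrlhps", "/tmp"), partitions)

# Real lookup, run only if the installed ps has ps_fs_mount_point().
if (requireNamespace("ps", quietly = TRUE) &&
    exists("ps_fs_mount_point", envir = asNamespace("ps"), inherits = FALSE)) {
  paths <- c(tempdir(), getwd())
  mounts <- ps::ps_fs_mount_point(paths)
  # Assumption: all = TRUE so that NFS mounts are included in the table.
  print(fs_type_by_mount(mounts, ps::ps_disk_partitions(all = TRUE)))
}
```

If several partitions share a mount point, this sketch returns the type of the first one listed, and a mount point missing from the table yields NA.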

wlandau commented 1 week ago

Thanks, @gaborcsardi!

gaborcsardi commented 1 week ago

@wlandau If you can confirm that this works for you, I'll make a ps release.

wlandau commented 1 week ago

Both approaches we discussed work beautifully. https://github.com/ropensci/targets/pull/1326 uses ps_fs_mount_point() + ps_disk_partitions(), and it works reliably and quickly.