Closed jwodder closed 3 months ago
@yarikoptic Please answer my questions above.
In general I wouldn't mind you choosing the way, but let me decide on the first way:
If implementing as dandisets-healthstatus subcommands:
- What subcommands? Should there just be one subcommand that does all the benchmarking at once (mount mounts, run & time tests)?
yes, let's call it run_benchmarks_across_mounts (or choose a better one)
Do we need (as suggested in the original issue) a run_benchmarks command that just runs & times the tests?
yes, I think so. It should take a path to operate on; this way we could test against custom mounted filesystems etc.
Should there be dedicated subcommands for mounting each of the three mount types and unmounting once the user hits Ctrl-C?
I guess it might come in handy for troubleshooting etc., but I didn't envision it... So something like shell_under_mount?
Perhaps one subcommand that mounts a single mount type specified on the command line, runs & times the tests, and then unmounts?
could be (run_benchmarks_on_mount), or could be just an option --mounts TYPE1,TYPE2,... for run_benchmarks_across_mounts, which could be used to limit the run to one or more mount types
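A minimal sketch of the command layout discussed above, using the standard library's `argparse`. The subcommand names and the `--mounts TYPE1,TYPE2,...` option come from this thread; the mount type identifiers in `KNOWN_MOUNTS` and the positional arguments are assumptions made only for illustration.

```python
import argparse

# Hypothetical mount type names; the thread only fixes the option shape
# (--mounts TYPE1,TYPE2,...), not these identifiers.
KNOWN_MOUNTS = ("fusefs", "davfs2", "webdavfs")


def parse_mounts(value: str) -> list[str]:
    """Parse a comma-separated list of mount types, rejecting unknown ones."""
    mounts = [m.strip() for m in value.split(",") if m.strip()]
    unknown = [m for m in mounts if m not in KNOWN_MOUNTS]
    if not mounts or unknown:
        raise argparse.ArgumentTypeError(
            f"expected one or more of {', '.join(KNOWN_MOUNTS)}; got {value!r}"
        )
    return mounts


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dandisets-healthstatus")
    sub = parser.add_subparsers(dest="command", required=True)

    # Mounts each selected type in turn, runs & times the tests, unmounts.
    across = sub.add_parser("run_benchmarks_across_mounts")
    across.add_argument("--mounts", type=parse_mounts, default=list(KNOWN_MOUNTS))
    across.add_argument("assets", nargs="+")

    # Only runs & times the tests against an already-mounted path.
    single = sub.add_parser("run_benchmarks")
    single.add_argument("path")
    return parser
```

With this layout, `--mounts davfs2` would restrict a run to a single mount type, which covers the "one subcommand per mount type" case without adding a separate command.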
@yarikoptic I still can't decide whether this should be subcommands of dandisets-healthstatus, a separate script that depends on dandisets-healthstatus, or a separate script that copies code from dandisets-healthstatus. If I go with one of the latter two options, would you still want the script to have all of the subcommands that you described in your last comment? Would you still want some subcommands added to dandisets-healthstatus?
Implementing as dandisets-healthstatus subcommands:
Cons:
- Causes dandisets-healthstatus to depend (in the non-packaging sense, at least) on dandi-webdav
  - Should dandi-webdav be added as a packaging dependency of dandisets-healthstatus? I don't think its API can be considered sufficiently stable (cf. dandi/dandi-webdav#4).
  - Should the user of dandisets-healthstatus be required to manually install dandi-webdav separately and pass its path and/or other details to dandisets-healthstatus?
- [...] dandisets-healthstatus with anything other than the fastest option, so we'd be adding code just to remove much of it later.
Implementing as an independent script:
- [...] dandidav without needing to declare it as a dependency, and the odds of the dandidav API getting out of sync with the invocations in the benchmarking script are lowered
- dandisets-healthstatus isn't burdened with any solely-experimental code
- [...] dandisets-healthstatus [...] dandidav package be moved into a subdirectory? Should the benchmark script be a package as well?
- [...] dandisets-healthstatus as a (packaging) dependency: Any future change to dandisets-healthstatus will likely break the script. One option to address this would be to include a Git commit hash in the benchmarking script's requirements specifier for dandisets-healthstatus, but then the benchmarking script won't get any benefits that may come from future updates to dandisets-healthstatus.
- [...] dandisets-healthstatus: Code duplication
Other concerns:
- [...] run.sh in the dandisets-healthstatus repository), but is that really much better than hardcoding them into the script?
Cons:
Causes dandisets-healthstatus to depend (in the non-packaging sense, at least) on dandi-webdav
- Should dandi-webdav be added as a packaging dependency of dandisets-healthstatus?
only for some extra_depends, e.g. [benchmark-backends] or alike, not generally since not needed.
I don't think its API can be considered sufficiently stable (cf. Support configuration via the command line #4).
- Should the user of dandisets-healthstatus be required to manually install dandi-webdav separately and pass its path and/or other details to dandisets-healthstatus?
I think that is ok for now, but could as well just be listed in [benchmark-backends] extra-depends.
overall -- I do not see major cons stated for this one.
As for an independent script, I would imagine (if coded to expose some general interface) the pros would be: it could be used by others to test some other operations, not necessarily healthchecks but maybe some "real" analysis functions on data from the archive, e.g. run-benchmarks-sweep -d /tmp/fuse-mount-here my_analysis_script /tmp/fuse-mount-here/000003/myfile.nwb and see which backend would be the best fit.
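The core of such a general interface could be as small as timing an arbitrary command. The sketch below is only an illustration of that idea; `time_command` is a hypothetical helper, not part of dandisets-healthstatus.

```python
import subprocess
import time


def time_command(argv: list[str]) -> float:
    """Run argv to completion and return the elapsed wall-clock seconds.

    Raises subprocess.CalledProcessError if the command exits non-zero,
    so a failed run is never reported as a timing.
    """
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start
```

A sweep would then call `time_command(["my_analysis_script", "/tmp/fuse-mount-here/000003/myfile.nwb"])` once per mounted backend and compare the results.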
Some details — like the location of the davfs2 & webdavfs binaries, the command to run to unmount webdavfs, and the directories at which to mount things — could be hardcoded into the script, but that's bad practice.
can't we just rely on them being in the PATH, and then use which to get the absolute path for sudo (if that is needed or a concern)?
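In Python, the equivalent of `which` is `shutil.which`. A minimal sketch, assuming the binaries are named `webdavfs` etc. on the PATH of the user running the benchmarks (`resolve_binary` is a hypothetical helper name):

```python
import shutil


def resolve_binary(name: str) -> str:
    """Return the path of `name` as found on PATH, or fail loudly."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"{name} not found in PATH")
    return path
```

Passing the resolved path to `sudo` avoids depending on how sudo itself looks up commands, which may differ from the invoking user's PATH.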
@yarikoptic
can't we just rely on them to be in the PATH, and then which to get absolute one for sudo (if that is needed/concern)?
- You stated earlier that you had installed webdavfs on smaug at /opt/webdavfs/webdavfs. Are you going to add the binary to some directory in the default PATH, or will users of the script on smaug have to adjust their PATH before running the benchmarks?
- You also stated that unmounting of webdavfs should be done via sudo /usr/local/sbin/unmount-tmp-fuse. My assumption was that you only granted my smaug account sudo permissions for that one script and that I can't do sudo umount <whatever>, so either the benchmarking script has to be hardcoded to use unmount-tmp-fuse or it needs a CLI or config option to tell it whether to use umount or unmount-tmp-fuse.
- You stated earlier that you had installed webdavfs on smaug at /opt/webdavfs/webdavfs. Are you going to add the binary to some directory in the default PATH, or will users of the script on smaug have to adjust their PATH before running the benchmarks?
added symlink to it now under /usr/local/bin
- You also stated that unmounting of webdavfs should be done via sudo /usr/local/sbin/unmount-tmp-fuse. My assumption was that you only granted my smaug account sudo permissions for that one script and that I can't do sudo umount <whatever>, so either the benchmarking script has to be hardcoded to use unmount-tmp-fuse or it needs a CLI or config option to tell it whether to use umount or unmount-tmp-fuse.
yeah -- it seems to me we might unfortunately need a custom unmount operation per setup/filesystem type.
@yarikoptic I've decided to implement this as dandisets-healthstatus subcommands. Please move this issue to the dandi/dandisets-healthstatus repository.
@yarikoptic How exactly should mounting & unmounting with webdavfs work? Based on its README, the recommended way to use webdavfs is to install it at /sbin/mount.webdavfs and run sudo mount -t webdavfs $URL $MOUNT_POINT, but webdavfs on smaug is currently installed at /usr/local/bin/webdavfs and run directly. (This discrepancy may be why running umount after stopping the program is currently necessary.)
webdavfs mounted for me without sudo via ./webdavfs http://localhost:8080 /tmp/dandiarchive-fuse, which was great. It is only for unmounting that I found no way to do it without sudo, hence the helper.
Just for consistency, I symlinked it as /usr/local/sbin/mount.webdavfs so you could use it with mount, but you would need sudo for that, whereas it works fine without sudo if just used as a binary:
$> mount -t webdav http://localhost:8080 /tmp/dandiarchive-fuse
mount: /tmp/dandiarchive-fuse: must be superuser to use mount.
$> webdavfs http://localhost:8080 /tmp/dandiarchive-fuse
http://localhost:8080: no PUT Range support, mounting read-only
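@yarikoptic You didn't answer my question: Which commands should I use for mounting & unmounting webdavfs?
- I would argue that invoking the webdavfs binary directly is the wrong thing to do.
- I'm quite certain that you can give me permission to run sudo solely for a specific command with specific arguments.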
- I would argue that invoking the webdavfs binary directly is the wrong thing to do.
why? FWIW it is exactly the same binary used.
- I'm quite certain that you can give me permission to run sudo solely for a specific command with specific arguments.
It would need to be the mount command I guess... and umount. Anything else?
I know that you are sane and I can trust you, so can do it but it would still raise some level of paranoia in me regardless ;-)
I know that you are sane and I can trust you, so can do it but it would still raise some level of paranoia in me regardless ;-)
I even immediately came up with a recipe for disaster:
[...] mount that partition and run the bad load.... didn't try. But I wonder if something like that could happen from a FUSE filesystem, i.e. could there be root suid'ed content?
@yarikoptic When I said "you can give me permission to run sudo solely for a specific command with specific arguments," I meant that you can give me permission to run, say, sudo mount -t webdavfs http://127.0.0.1:8080 /tmp/dandisets-fuse but not permission to run sudo mount <anything else>.
via wrapper scripts I guess -- yeah, we could do that similarly to that unmount command, no problem.
@yarikoptic No, not via wrapper scripts (that would just obscure what's going on to readers of the dandisets-healthstatus code). I mean that you can add the following to the sudoers file (SYNTAX NOT CONFIRMED; CONFIRM BEFORE USING):
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/bin/mount -t webdavfs http\://127.0.0.1\:8080 /tmp/dandisets-fuse
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/bin/umount /tmp/dandisets-fuse
and then I, via dandisets-healthstatus, can run the exact commands given there via sudo, but I won't be able to run anything else via sudo.
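On the Python side, running exactly those two commands could look like the sketch below. It assumes the sudoers entries above are in place and match these argument lists word for word; `mounted` and its `runner` parameter are hypothetical names used for illustration, not existing dandisets-healthstatus code.

```python
import subprocess
from contextlib import contextmanager

# Must match the sudoers entries exactly; any extra or reordered argument
# would make sudo refuse the command or prompt for a password.
MOUNT_CMD = [
    "sudo", "/usr/bin/mount", "-t", "webdavfs",
    "http://127.0.0.1:8080", "/tmp/dandisets-fuse",
]
UMOUNT_CMD = ["sudo", "/usr/bin/umount", "/tmp/dandisets-fuse"]


@contextmanager
def mounted(mount_cmd, umount_cmd, runner=subprocess.run):
    """Mount on entry and always unmount on exit, even if the body fails."""
    runner(mount_cmd, check=True)
    try:
        yield
    finally:
        runner(umount_cmd, check=True)
```

Usage would be `with mounted(MOUNT_CMD, UMOUNT_CMD): ...`, with the timed tests run inside the block. The `runner` parameter exists only so the sequencing can be exercised without root access.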
cool, I didn't know I can specify full command invocations in sudoers.
added now those two to try out.
@yarikoptic
When running the subcommand for timing each test under each mount type, how should the asset(s) to test be passed on the command line? They can't be passed as file paths, as the fusefs mount and the WebDAV mounts use different path structures ({dandiset_id}/{asset_path} vs. {dandiset_id}/draft/{asset_path}). One idea would be for the command-line syntax to be <dandiset-id> <asset-path1> <asset-path2> ..., but this wouldn't let you test assets from multiple Dandisets at once.
How should timing results be reported? Should they just be given in log messages after each test, or should there be a summary in some format after everything is run?
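One possible way to allow assets from several Dandisets is a single `<dandiset-id>/<asset-path>` argument per asset, split on the first `/`, from which the per-mount path is then built. The sketch below assumes only the two path layouts quoted above; the function names and the `with_draft` flag are hypothetical.

```python
from pathlib import PurePosixPath


def parse_asset_spec(spec: str) -> tuple[str, str]:
    """Split '<dandiset-id>/<asset-path>' on the first slash."""
    dandiset_id, sep, asset_path = spec.partition("/")
    if not dandiset_id or not sep or not asset_path:
        raise ValueError(f"expected <dandiset-id>/<asset-path>, got {spec!r}")
    return dandiset_id, asset_path


def mounted_asset_path(
    mount_root: str, dandiset_id: str, asset_path: str, *, with_draft: bool
) -> PurePosixPath:
    """Build the asset's path under a mount.

    with_draft=False gives {dandiset_id}/{asset_path};
    with_draft=True gives {dandiset_id}/draft/{asset_path}.
    """
    base = PurePosixPath(mount_root) / dandiset_id
    if with_draft:
        base = base / "draft"
    return base / asset_path
```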
[...] <dandiset-id>/<asset-path> -- split on first / and have the pair.
@yarikoptic
but we should return list of records (or a dict)
The benchmarking is invoked as a CLI command, not a Python function. Commands can't return lists.
Then "visualization" summary based on those to tell the winner(s) among FUSE solutions.
What sort of visualization?
but we should return list of records (or a dict)
The benchmarking is invoked as a CLI command, not a Python function. Commands can't return lists.
I meant internally.
Then "visualization" summary based on those to tell the winner(s) among FUSE solutions.
What sort of visualization?
I mean a text summary displayed by that CLI command at the end. Overall, on the above two points: just follow the classical MVC design pattern, with a model (structure of results), a view (CLI summary), and a controller (benchmarking code). This way we can later more easily change the rendering or add another usage/visualization (e.g. store results and summarize over different runs).
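A rough sketch of that separation, with the model as a list of records and the view as a plain-text summary. All names here (`BenchmarkResult`, `fastest_by_test`, `render_summary`) are hypothetical, and the layout of the summary is just one possibility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """Model: one timing record produced by the benchmarking code."""
    mount: str
    test: str
    asset: str
    seconds: float


def fastest_by_test(results: list[BenchmarkResult]) -> dict[str, BenchmarkResult]:
    """For each test name, pick the record with the smallest elapsed time."""
    best: dict[str, BenchmarkResult] = {}
    for r in results:
        if r.test not in best or r.seconds < best[r.test].seconds:
            best[r.test] = r
    return best


def render_summary(results: list[BenchmarkResult]) -> str:
    """View: a text table of all timings followed by the winner per test."""
    ordered = sorted(results, key=lambda r: (r.test, r.seconds))
    lines = [f"{r.test:<20} {r.mount:<10} {r.seconds:>8.2f}s" for r in ordered]
    for test, r in sorted(fastest_by_test(results).items()):
        lines.append(f"fastest for {test}: {r.mount}")
    return "\n".join(lines)
```

Because the view only consumes a list of records, the same records could later be stored and summarized over several runs without touching the benchmarking code.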
@yarikoptic If a test fails, should it be included in the visualization? What if a test is killed due to exceeding the one-hour timeout?
hm... I think any failure should be treated as an error in the case of benchmarking, and we would need to resolve it first.
@yarikoptic What about timeouts?
error out if timeout happens I think
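The policy agreed here (any failure or timeout aborts the benchmark instead of being recorded) could be sketched as follows. `BenchmarkError` and `run_or_fail` are hypothetical names; the one-hour default reflects the timeout mentioned above.

```python
import subprocess

TIMEOUT_SECONDS = 3600  # the one-hour limit mentioned in this thread


class BenchmarkError(Exception):
    """Raised when a test fails or times out; such runs produce no timing."""


def run_or_fail(argv: list[str], timeout: float = TIMEOUT_SECONDS) -> None:
    """Run one test command, converting failure or timeout into an error."""
    try:
        subprocess.run(argv, check=True, timeout=timeout)
    except subprocess.TimeoutExpired as e:
        raise BenchmarkError(f"timed out after {timeout} s: {argv}") from e
    except subprocess.CalledProcessError as e:
        raise BenchmarkError(f"exited with status {e.returncode}: {argv}") from e
```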
@yarikoptic Is davfs2 currently set up so that I can do sudo mount -t davfs2 http://localhost:8080 /tmp/dandiset-fuse, or is the proper command something else?
we have 1.6.1-1 installed, upstream has 1.7.0. I filed request for update: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1060078 but might just NMU it later although I do not expect any performance changes there judging from changelog
@yarikoptic Please append the following lines to /etc/davfs2/davfs2.conf so that mount doesn't prompt for a username & password:
[/tmp/dandisets-fuse]
ask_auth 0
@yarikoptic Also, did you install webdavfs as mount.webdavfs anywhere? When I tried running sudo mount -t webdavfs http://127.0.0.1:8080 /tmp/dandisets-fuse on smaug, I got:
mount: /tmp/dandisets-fuse: unknown filesystem type 'webdavfs'.
dmesg(1) may have more information after failed mount system call.
EDIT: According to the mount(8) manpage, -t binaries are only looked up in /sbin, but you've installed it at /usr/local/sbin/mount.webdavfs.
@yarikoptic Please append the following lines to /etc/davfs2/davfs2.conf so that mount doesn't prompt for a username & password:
[/tmp/dandisets-fuse]
ask_auth 0
uncommented existing one and changed to 0, but didn't add that path limiter... try
@yarikoptic Also, did you install webdavfs as mount.webdavfs anywhere? When I tried running sudo mount -t webdavfs http://127.0.0.1:8080 /tmp/dandisets-fuse on smaug, I got:
mount: /tmp/dandisets-fuse: unknown filesystem type 'webdavfs'.
dmesg(1) may have more information after failed mount system call.
it is there
smaug:/mnt/btrfs/scrap
$> ls -l /usr/local/sbin/
total 20
-rwx------ 1 root staff 1647 Jul 2 2015 btrfsQuota*
-rwxr-xr-- 1 root adm 86 Jan 31 2015 flush-caches*
-rwx------ 1 root root 88 Dec 12 2018 flush_caches_kyle*
lrwxrwxrwx 1 yoh staff 23 Jan 17 13:16 mount.webdavfs -> /usr/local/bin/webdavfs*
-rwsr-xr-x 1 root root 43 Jan 5 12:11 unmount-tmp-fuse*
-rwxr-xr-x 1 root root 3636 Dec 18 2014 zfs-monitor.pl*
$> ls -l /usr/local/bin/webdavfs
lrwxrwxrwx 1 yoh staff 22 Jan 10 09:15 /usr/local/bin/webdavfs -> /opt/webdavfs/webdavfs*
$> ls -l /opt/webdavfs/webdavfs
-rwxr-xr-x 1 yoh yoh 8021561 Jan 5 12:02 /opt/webdavfs/webdavfs*
and indeed odd since shell does find it
smaug:/mnt/btrfs/scrap
$> sudo mount -t webdavfs http://127.0.0.1:8080 /tmp/dandisets-fuse2
mount: /tmp/dandisets-fuse2: unknown filesystem type 'webdavfs'.
dmesg(1) may have more information after failed mount system call.
$> sudo which mount.webdavfs
/usr/local/sbin/mount.webdavfs
dunno... try to figure it out, if not -- there is mount.webdavfs http://127.0.0.1:8080 /tmp/dandisets-fuse2
EDIT: According to the mount(8) manpage, -t binaries are only looked up in /sbin, but you've installed it at /usr/local/sbin/mount.webdavfs.
oh... ok... hate to do that for local installs, but will do for uniformity (anyways will need to package the damn thing if it ends up to be the winner ;-) )
@yarikoptic I can get mount -t webdavfs ... to run successfully now, but I get a permissions error when trying to look inside /tmp/dandisets-fuse (and even when just doing ls -l /tmp). If you pass -o allow_other to the mount command, are you able to traverse the mount directory without being root? If so, please add that option to the allowed sudo command.
we have now
smaug# cat /etc/sudoers.d/fuse
# For benchmarking FUSE
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/local/sbin/unmount-tmp-fuse
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/bin/mount -t webdavfs http\://127.0.0.1\:8080 /tmp/dandisets-fuse
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/bin/mount -t davfs http\://127.0.0.1\:8080 /tmp/dandisets-fuse
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/bin/mount -t webdavfs -o allow_other http\://127.0.0.1\:8080 /tmp/dandisets-fuse
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/bin/mount -t davfs -o allow_other http\://127.0.0.1\:8080 /tmp/dandisets-fuse
jwodder ALL=(ALL:ALL) NOPASSWD: /usr/bin/umount /tmp/dandisets-fuse
FWIW -- tried webdavfs on drogon but fail to get content of .zattrs:
dandi@drogon:/mnt/backup/dandi$ webdavfs -o ro -D http://dandi.centerforopenneuroscience.org dandidav-webdavfs
...
dandi@drogon:/mnt/backup/dandi/dandidav-webdavfs/zarrs$ cat 001/e3b/001e3b6d-26fb-463f-af28-520a25680ab4/326273bcc8730474323a66ea4e3daa49-113328--97037755426.zarr/.zattrs
cat: 001/e3b/001e3b6d-26fb-463f-af28-520a25680ab4/326273bcc8730474323a66ea4e3daa49-113328--97037755426.zarr/.zattrs: Input/output error
...
dandi@drogon:/mnt/backup/dandi/dandidav-webdavfs/dandisets/000108/draft$ head -n 2 dataset_description.json
head: error reading 'dataset_description.json': Input/output error
so not sure if that works at all now :-/
the same for davfs2... maybe neither of those supports redirects? since it seems to be ok for dandiset.yaml
dandi@drogon:/mnt/backup/dandi$ cat /mnt/backup/dandi/dandidav-davfs2/dandisets/000108/draft/samples.tsv
cat: /mnt/backup/dandi/dandidav-davfs2/dandisets/000108/draft/samples.tsv: Input/output error
dandi@drogon:/mnt/backup/dandi$ head /mnt/backup/dandi/dandidav-davfs2/dandisets/000108/draft/dandiset.yaml
id: DANDI:000108/draft
doi: 10.80507/dandi.123456/0.123456.1234
url: https://dandiarchive.org/dandiset/000108/draft
name: Light sheet imaging of the human brain
dang..
@yarikoptic davfs2 does support redirects, but it has to be enabled by adding follow_redirect 1 to /etc/davfs2/davfs2.conf. With this, I can access files under /zarrs/ but still not under /dandisets/ (maybe it doesn't support double-redirects?).
Also, I found this davfs2 issue that may be relevant to what we're doing: Version 1.7.0 much slower than 1.6.1 (a hundred times slower)
@yarikoptic I filed a bug report with davfs2 about lack of double-redirect support, but it doesn't look like the maintainers are actively handling bugs lately.
@yarikoptic webdavfs doesn't support redirects at all; I filed an issue with it requesting support: https://github.com/miquels/webdavfs/issues/30
Per brief discussion during our CON meetup today, just a note here (ping @jwodder) that we need to include in comparison our datalad-fuse solution (described in original post), so we make an informed decision on what backend to use for the healthstatus (currently datalad-fuse is used).
(Copied from https://github.com/dandi/dandi-infrastructure/pull/164#issuecomment-1875681628 et sequentes)
A script should be written to run & time the following tests:
- pynwb_open_load_ns from dandisets-healthstatus
- matnwb_nwbRead from dandisets-healthstatus
- dandi ls (to load metadata) on a single local asset
These should be run with DANDI_CACHE=ignore set in order to avoid any possible caching side effects from fscacher.
The tests should be run on assets mounted using each of the following methods:
- datalad-fuse
The assets to test should be one (or more?) sample assets of some "typical" size (a few GBs).
- sub-mouse1-fni16/sub-mouse1-fni16_ses-161228151100.nwb in 000016 is suggested as a possible candidate.
Testing should be run on smaug.
- Webdavfs has been installed on smaug at /opt/webdavfs/webdavfs.
  - [...] umount as root.
  - sudo /usr/local/sbin/unmount-tmp-fuse can be run to forcibly unmount /tmp/dandisets-fuse.
- davfs2 is currently installed on smaug both system wide and (for a more recent version) at /opt/davfs2/DESTDIR/usr/local/sbin/ (?), but @yarikoptic reports issues with getting it to work.
@yarikoptic Question: Should the script be standalone or implemented as one or more subcommands of dandisets-healthstatus?
- If implementing as dandisets-healthstatus subcommands:
  - What subcommands? Should there just be one subcommand that does all the benchmarking at once (mount mounts, run & time tests)? Do we need (as suggested in the original issue) a run_benchmarks command that just runs & times the tests? Should there be dedicated subcommands for mounting each of the three mount types and unmounting once the user hits Ctrl-C? Perhaps one subcommand that mounts a single mount type specified on the command line, runs & times the tests, and then unmounts?
  - If the benchmarking is to be implemented as part of dandisets-healthstatus, this issue should be moved to that repository.
- If implementing as a separate script, the script will need to either use dandisets-healthstatus as a dependency or else copy essential parts of its code.
  - If dandisets-healthstatus is used as a dependency, then since the benchmarking script will be separate from it, this comes with the risk that any future change to dandisets-healthstatus will break the script. One option to address this would be to include a Git commit hash in the benchmarking script's requirements specifier for dandisets-healthstatus, but then the benchmarking script won't get any benefits that may come from future updates to dandisets-healthstatus.
  - If we do this, I assume the script should be saved in this repository?
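For the `DANDI_CACHE=ignore` requirement, a minimal sketch of running a test command with that variable set for the child process only. `run_without_dandi_cache` is a hypothetical helper name; how the actual tests are invoked is not specified in this issue.

```python
import os
import subprocess


def run_without_dandi_cache(argv: list[str]) -> subprocess.CompletedProcess:
    """Run argv with DANDI_CACHE=ignore added to the inherited environment."""
    env = {**os.environ, "DANDI_CACHE": "ignore"}
    return subprocess.run(argv, env=env, check=True, capture_output=True, text=True)
```

For the metadata test this might be called as `run_without_dandi_cache(["dandi", "ls", path])`, where `path` points at the asset under the mount being benchmarked.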