Open CSC-swesters opened 1 year ago
Nice to see that #3029 was fixed!
A few comments, perhaps some of these things could be split into separate issues. Feel free to do so, if it's appropriate!
It looks like https://github.com/OSC/ondemand/issues/3666 is a duplicate of this issue, in a sense.
The fix in PR #3654 would only fix the FIFO type, but not e.g. sockets or block & character devices. I would propose to improve the newly merged code, so that it checks whether a file is one of the following:
If the file is none of the above, it will not be eligible for download.
Even if unprivileged users cannot create block or character devices, it would be nice to have a complete solution which allows "known good" types of files, instead of prohibiting only a few select "known bad" file types (currently pipes).
What are your thoughts on this approach?
Sockets are still acting buggy. I can reproduce with a socket in OOD 3.1.7, using nc(1)
:
nc -U /path/to/mysocket -u -l
And try opening the file in OOD. It will at least fail loudly (pop-up + banner), but #3666 seems to aim to prevent these errors before a user attempts them. An interesting thing that happened is that OOD thought I wanted to navigate into a directory called mysocket
, which isn't correct behavior of course. If I select the socket and press the "Download" button, it will silently fail as described in #3029.
I'm thinking that the performance impact on the shared file systems especially would be nice to keep minimal. At least one should try to avoid doing more than one stat
system call (N.B. higher-level Ruby functions might do them in the background) per file. If it is necessary to check multiple things, it would be better to have helper methods for interpreting the stat results. I took a look at the PosixFile
code, and noticed that there are multiple deferred calls to, e.g., directory?
, pipe?
, and readable?
. If there are a lot of entries in a directory, and Rails doesn't do something clever in the backend, which I'm not aware of, it can be quite heavy to render the Files view.
I did a small benchmark with 10000 regular files inside a subdirectory and reloaded the Files app page a few times and collected the server load time (according to Firefox's "Waiting" statistic in request timings). With N=7 and discarding the highest and lowest numbers, the average came out to 6.46 seconds (N=5). For comparison, running time ls -l
inside the directory took around 0.77 seconds with hot caches (N=5). Note that the caches were hot for OOD's test case too.
If you wish to replicate the setup, I created it like this:
mkdir testdir-with-10k-files
cd testdir-with-10k-files
# Generate 10000 files with at most 4K random text in them
for i in $(seq 1 10000); do
filename="$(printf "%06d" "$i")-file" ;
dd if=/dev/urandom bs=4k count=1 2>/dev/null | tr -cd A-Za-z0-9 > "./$filename" ;
done
The "No data available in table" text is a bit misleading while waiting for big directory listings. It would be a better user experience to have have a spinner or the text "Loading..." or something, and only when the directory is empty, the text would say "No files in this directory".
I was lazy and put a lot of different thoughts into the same comment, so sorry for that! They all relate to the Files app, and I felt that it was important to touch on the different concerns, and surface a few ideas, in case it would affect the implementations!
I hope this helps :slightly_smiling_face:
No issues in long comments.
It would be a better user experience to have have a spinner or the text "Loading..." or something,
This is something I've thought too, though never made the ticket. I'll do so shortly.
I'm thinking that the performance impact on the shared file systems especially would be nice to keep minimal.
I'll see what I can come up with. Rails has benchmarking built in so that may prove useful here.
Does the change in #3718 solve all these cases? It seems to I think.
I can replicate the performance issues - but still tracking down a mechanism to profile the code paths.
Looking at the performance it appears to 2 things take most of the time - caching (which seems unfortunate and kinda defeats the purpose) and determining the 'human size'.
Looking at the performance it appears to 2 things take most of the time - caching (which seems unfortunate and kinda defeats the purpose) and determining the 'human size'.
Those are great findings, thanks for taking the time to benchmark! If I read the call graph correctly, the ActiveSupport::Cache::Store#fetch
is using 35% of the time to run PosixFile#ls
, and PosixFile#human_size
is taking 42% of the time?
What comes to my mind is whether the browser could be converting the size numbers to human_size instead, via Javascript? That would make the server side do a lot less work by just sending the number of bytes. Converting byte sizes into a table is probably well suited for the front-end code, I would say. It could be made optional (opt-in) if it turns out to add a lot of rendering time for the browser, but I suspect that the net difference would be small.
Does the change in https://github.com/OSC/ondemand/pull/3718 solve all these cases? It seems to I think.
I think it does, for regular files. The file?
function will return true
for symlinks to regular files as well. Are the directory downloads (zipping and downloading) handled differently? I'm not familiar enough with Rails to see if directories need to be included in the downloadable?
check :slightly_smiling_face:
I forgot to ask earlier what file system you used when running the benchmark? Was it run on your local machine's file system, or on a cluster?
I would have assumed that a shared filesystem would have more delays in metadata operations. We're using Lustre at our site, so the effect of unnecessary stat()
calls can be quite visible, in my own experience.
It was a local filesystem, we had a downtime yesterday. I can try again on an NFS, a NetApp and I'm not sure what the other is.
Sounds good! :+1: It's valuable to have both perspectives, i.e., where Rails can be optimized when we assume that the (cluster) file system has stellar performance, as well as how the time is spent when using an actual shared file system. The optimizations to Rails (caching and human_size) might have been lost in the noise if it was run on a slower file system.
I generated that benchmark with this gem: https://github.com/rails/rails-perftest/ Although you cannot use the gem from rubygems, you need to pull down the source code and add it to the Gemfile locally.
This is the test file I used to generate that image above
require 'test_helper'
require 'rails/performance_test_help'
class PosixFileTest < ActionDispatch::PerformanceTest
# Refer to the documentation for all available options #[:wall_time, :memory, :cpu_time, :process_time],
self.profile_options = { runs: 30, metrics: [:wall_time],
output: 'tmp/performance', formats: [:graph_html, :call_stack] }
def setup
@dir = '/tmp/file-profile-test'
Dir.mkdir(@dir)
Dir.chdir(@dir) do
(1...300).each { |num| `touch foo_#{num}.txt` }
end
@mem = ActiveSupport::Cache::MemoryStore.new
Rails.stubs(:cache).returns(@mem)
end
def teardown
FileUtils.rm_rf(@dir)
end
test "listing" do
f = PosixFile.new(@dir)
f.ls
end
end
The Files app works nicely with regular files and directories. However, there are a few other types of files as well, which don't really make sense to present like normal files in OOD.
Referring to the inode(7) manual, there are the following file
st_mode
s in Linux:Symbolic links could be supported fully, by allowing users to dereference the links to view, edit, or download the files. That currently happens implicitly in OOD 3.0.1, but it could be made more explicit perhaps, so that users are aware of the differences.
Other file types (i.e., sockets, block devices, character devices, and FIFOs) probably don't make sense to support. Currently OOD will try treating them like regular files apparently, but this fails. For example, reading from a FIFO by using "View", "Edit", or "Download" functionality throws an error, and the FIFO keeps blocking (as is probably to be expected). It is, however, possible to unlink the FIFO via OOD.
These file types could perhaps still be listed by name, but have their functionality blocked (except "Delete")? Let me know what you think.