Add C API

droundy commented 8 years ago

I think there is a desire to add a C API, and I'd like to unify bigbro with fsatrace (under either one name or the other). How does the bigbro API look?

https://github.com/droundy/bigbro/blob/master/bigbro.h#L3

I could easily see creating a more fine-grained set of outputs (e.g. separating stat into a separate array), and also allowing null pointers for output that is not desired.

I don't know how portable to windows the file descriptor approach is for redirecting stdout and stderr. Also, this API doesn't support setting the environment for the child, so if that is important, we'd need another argument. Finally, returning the child PID seems important in terms of fac's usage, but I'm not sure how that will work on windows. Maybe we create a second helper function kill_children? Sounds racy.

Another question is how to support the "blocking" mode that Neil wants.

jacereda commented 8 years ago

I'm not sure about separating the outputs by type. fsatrace instead returns a sequence of operations that reflects the temporal ordering. It's true that this temporal ordering can be non-deterministic in multi-threaded programs, but I still think it could be useful in some situations.

The way fsatrace keeps the temporal ordering is by appending to a shared-memory buffer using atomic operations. The internal representation is the same used in the output.

In any case, going from the sequence of operations to a bucketed representation is easy and could be implemented on top of the other if you prefer that API.
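A minimal sketch of the append-with-atomics idea and of bucketing on top of the sequence, assuming C11 atomics. The record layout (`<op>|<path>\n`), the buffer size, and the names `trace_buf`, `trace_append` and `trace_count` are hypothetical and are not fsatrace's actual format; a real tracer would place the buffer in shared memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define TRACE_CAP 4096

/* Hypothetical trace buffer; a real tracer would map this as shared memory. */
struct trace_buf {
    atomic_size_t used;       /* bytes reserved so far */
    char data[TRACE_CAP];     /* records of the form "<op>|<path>\n" */
};

/* Reserve space with a single atomic add, then fill the reserved slot.
   Returns 0 on success, -1 if the record does not fit. */
static int trace_append(struct trace_buf *b, char op, const char *path)
{
    assert(b && path);
    size_t len = strlen(path) + 3;                 /* op, '|', path, '\n' */
    size_t off = atomic_fetch_add(&b->used, len);  /* ordering point */
    if (off + len > TRACE_CAP)
        return -1;
    char *p = b->data + off;
    p[0] = op;
    p[1] = '|';
    memcpy(p + 2, path, len - 3);
    p[len - 1] = '\n';
    return 0;
}

/* Bucketing on top of the sequence: count the records carrying a given op. */
static size_t trace_count(struct trace_buf *b, char op)
{
    size_t end = atomic_load(&b->used), n = 0;
    if (end > TRACE_CAP)
        end = TRACE_CAP;
    for (size_t i = 0; i < end; ) {
        const char *nl = memchr(b->data + i, '\n', end - i);
        if (!nl)
            break;                                 /* no complete record left */
        if (b->data[i] == op)
            n++;
        i = (size_t)(nl - b->data) + 1;
    }
    return n;
}
```

The atomic add fixes each record's position in the sequence, so concurrent writers never overlap. The sketch ignores paths that contain a newline and readers that run while a slot is still being filled.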

jacereda commented 8 years ago

@ndmitchell wanted a callback for the blocking mode. Since we are in another process, it would need to signal the invoking process by means of a shared semaphore and block until the callback is handled.

Going to the extreme, every operation could just invoke a callback that way, but I think that's not an option due to the amount of context switches.

jacereda commented 8 years ago

As for stdout/stderr redirection, I would try to keep them separated. I'm using different colours for stdout/stderr in my build system.

droundy commented 8 years ago

The advantage of a set as output rather than a sequence is that it doesn't scale in the same way with the size of the job, e.g. if it opens the same file many times, one doesn't have linearly growing data use. It's probably not important, though.

The other advantage to my mind of set output is that the caller doesn't have to do the fiddly handling of directory renames, e.g. to find out what files were created.

droundy commented 8 years ago

So we could have two file descriptors, one for stdout, and the other for stderr? I don't like keeping them separated because then you lose the ordering between them, but I'm fine with giving that as an option.

The big question in my mind is whether we can make redirection portable to Windows.

jacereda commented 8 years ago

What if the tool always did the redirection and reported as o|<some stdout message> & e|<some stderr message> ?

jacereda commented 8 years ago

Those would probably need to embed the size of the message to make parsing easier and avoid escaping...
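A minimal sketch of such length-prefixed framing, assuming a hypothetical wire format `<tag>|<len>|<payload>` with tag `o` or `e`. The format and the names `frame_encode` and `frame_decode` are illustrative only; nothing in the thread fixes them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write one frame "<tag>|<len>|<payload>" into out.
   Returns the frame size in bytes, or 0 if it does not fit. */
static size_t frame_encode(char tag, const char *msg, size_t len,
                           char *out, size_t cap)
{
    assert(tag == 'o' || tag == 'e');
    int hdr = snprintf(out, cap, "%c|%zu|", tag, len);
    if (hdr < 0 || (size_t)hdr >= cap || (size_t)hdr + len > cap)
        return 0;
    memcpy(out + hdr, msg, len);       /* payload is copied verbatim */
    return (size_t)hdr + len;
}

/* Parse one frame at buf. On success fills tag, msg and len and returns the
   number of bytes consumed; returns 0 on malformed or truncated input. */
static size_t frame_decode(const char *buf, size_t n,
                           char *tag, const char **msg, size_t *len)
{
    if (n < 4 || buf[1] != '|' || (buf[0] != 'o' && buf[0] != 'e'))
        return 0;
    size_t i = 2, v = 0;
    if (buf[i] < '0' || buf[i] > '9')
        return 0;
    while (i < n && buf[i] >= '0' && buf[i] <= '9')
        v = v * 10 + (size_t)(buf[i++] - '0');
    if (i >= n || buf[i] != '|' || n - (i + 1) < v)
        return 0;
    *tag = buf[0];
    *msg = buf + i + 1;
    *len = v;
    return i + 1 + v;
}
```

Because the length is explicit, the payload may contain `|` or newlines without any escaping. The sketch does not guard against absurdly large length fields overflowing `size_t`.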

droundy commented 8 years ago

Always redirecting stdout and stderr to different locations means that it is never possible to get correct synchronized output, so that isn't a good option at all. I wish that tools would decide to only output to either stdout or stderr, but the reality is that they don't, and you can lose a lot of information if you cannot distinguish the order of output.

jacereda commented 8 years ago

Remember we can install hooks to write(), we can preserve the ordering. The process wouldn't be aware it's being redirected, it would just perform normal write()s.

droundy commented 8 years ago

I just read up on redirecting stdout/err on Windows, and it looks like the only difference is that one uses HANDLEs rather than file descriptors (which are ints). So I don't see any reason we can't have a flexible library that supports redirecting on either OS to pipes or files of the caller's choice.
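A minimal sketch of the POSIX half of this, assuming the caller supplies a single descriptor that should receive both streams. The name `run_merged` is hypothetical. On Windows the analogous step would presumably go through the standard-handle fields of the `STARTUPINFO` structure passed to `CreateProcess`, which is not shown here.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv with stdout and stderr both pointing at out_fd, so the relative
   order of the child's output is preserved. Returns the child's exit status,
   or -1 on failure or abnormal termination. */
static int run_merged(char *const argv[], int out_fd)
{
    assert(argv && argv[0]);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: redirect after the fork, so the parent's own stdout and
           stderr are never touched. */
        if (dup2(out_fd, STDOUT_FILENO) < 0 || dup2(out_fd, STDERR_FILENO) < 0)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127);                    /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing two different descriptors instead of one would give the separated variant. If `out_fd` is a pipe, the caller has to drain it while the child runs; waiting first, as this sketch does, only works for output smaller than the pipe buffer.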

droundy commented 8 years ago

Using hooks to redirect the data sounds cumbersome. You'd need to mirror the file descriptor table, and track changes across forks and calls to exec, as well as dup and friends. Given that any process may make one of these system calls directly and you'd miss it with LD_PRELOAD, this seems like a lot of fragility to add, for very little benefit over doing things the "normal" way.

ndmitchell commented 8 years ago

Looking at the bigbro API:

I agree that the order of operations is potentially important. I don't think it's that big a deal to the API though - just have struct entry {int mode; char* data} and have a single entry** pile for all the things that changed.
Knowing the difference between what happens on stdout and stderr is quite important. Why can't you just inherit stdout and stderr? Then people can redirect stderr/out already if they care?
Environment variables are important, but just an extra argument, nothing severe.
Blocking is very useful for me, but I guess a separate API, which takes a function which gets given an entry instead of a list of entry?
I suggest adding a flags argument, and initially supporting flags to turn on/off each entry, and a flag to nub the results, so you only get the first entry per file.
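A minimal sketch of the entry-plus-flags shape suggested above. The `ENTRY_*` kinds, the `TRACE_NUB` flag and `filter_entries` are hypothetical names chosen for illustration; only the `mode`/`data` pair comes from the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry kinds, usable as flags to switch each kind on or off. */
enum { ENTRY_READ = 1 << 0, ENTRY_WRITE = 1 << 1, ENTRY_RENAME = 1 << 2 };
#define TRACE_NUB (1 << 8)     /* keep only the first entry per path */

struct entry {
    int mode;       /* one of the ENTRY_* kinds */
    char *data;     /* path the operation touched */
};

/* Filter entries in place according to flags, preserving their order.
   Returns the number of entries kept. */
static size_t filter_entries(struct entry *e, size_t n, int flags)
{
    assert(e || n == 0);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!(e[i].mode & flags))
            continue;                          /* this kind is switched off */
        int dup = 0;
        if (flags & TRACE_NUB)
            for (size_t j = 0; j < kept && !dup; j++)
                dup = strcmp(e[j].data, e[i].data) == 0;
        if (!dup)
            e[kept++] = e[i];
    }
    return kept;
}
```

The nub step is quadratic, which is fine for a sketch; a library would more likely use a hash set. Here "first entry per file" is read as first per path regardless of kind, which is one possible interpretation.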

droundy commented 8 years ago

I'm curious as to why order is important. Is it because you need to examine order for some reason, or to compensate for renames and deletions? If it's the latter, then I would prefer to embed the code that handles those issues so it doesn't need to be duplicated by every caller. E.g. there is no reason to report writes to files that are later deleted, and no need to report renames of created files or directories at all. Just report the net effect on the filesystem. If we're going to return a sequence rather than sets, I'd like to see a real use case.

The issue with always inheriting stderr and stdout is that it effectively makes the code nonreentrant if the caller chooses to redirect stdout. True, we can let the caller create a lock to deal with that, but why do so when we can just redirect stderr and stdout after the fork? I never proposed to unconditionally redirect either.

I agree that adding an env argument is a good idea.

And yes, blocking absolutely needs to be a separate API, since there seems likely to be a severe performance penalty. Rather than giving a function argument, I'd consider returning early and having a resume function. But either way would work.
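To make the "net effect" reporting proposed above concrete, here is a minimal sketch that reduces an ordered trace to the set of files created and still present at the end. The event kinds and the name `net_created` are hypothetical, and renames and deletions of pre-existing files are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum op { OP_CREATE, OP_DELETE };
struct event { enum op op; const char *path; };

/* Collect into out the paths created during the trace that still exist at
   its end: a create followed by a delete of the same path cancels out.
   Returns the number of paths stored (at most cap). */
static size_t net_created(const struct event *ev, size_t n,
                          const char **out, size_t cap)
{
    assert(ev || n == 0);
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = 0;
        while (j < k && strcmp(out[j], ev[i].path) != 0)
            j++;                               /* j == k means "not in set" */
        if (ev[i].op == OP_CREATE) {
            if (j == k && k < cap)
                out[k++] = ev[i].path;
        } else if (j < k) {
            out[j] = out[--k];                 /* drop; set order is not kept */
        }
    }
    return k;
}
```

The result is a set rather than a sequence, so a temporary file that is created and removed never reaches the caller.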

ndmitchell commented 8 years ago

I actually don't have any uses where order is important, beyond the callback, where order is certainly important but obvious anyway. However, it seems like a file system tracing thing might reasonably want that information. Certainly if I was debugging what a program did that information would be handy. I guess the difference is that I'd rather a complete trace, rather than the net effect - since from one you can compute the other, but not vice versa.

If you go for a resume call then you also need a cancel call, and any variables on the stack have to be copied into a separate buffer. Neither is fatal, but both seem like more work in the C side. However, my code is completely continuation passing, so in that respect capturing a continuation to continue would suit me a lot better. They are certainly equivalent, but the @droundy formulation can be converted to the @ndmitchell formulation very cheaply, but the other way round requires an extra callee thread, so I guess resume makes more sense.
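A minimal sketch of why the resume formulation converts cheaply into the callback formulation: the callback version is just a loop around resume, with no extra thread. Everything here is hypothetical, including `trace_resume`; the `tracer` struct replays a fixed list of events in place of a real traced child.

```c
#include <assert.h>
#include <stddef.h>

struct event { int mode; const char *path; };

/* Stand-in for a tracer: replays a fixed list instead of a real child. */
struct tracer { const struct event *events; size_t count, next; };

/* Hypothetical resume-style call: let the child run until the next event of
   interest. Returns 1 with *out filled in, or 0 once the child has exited. */
static int trace_resume(struct tracer *t, struct event *out)
{
    if (t->next >= t->count)
        return 0;
    *out = t->events[t->next++];
    return 1;
}

typedef void (*event_cb)(const struct event *ev, void *ctx);

/* Callback formulation built on top of resume. Returns the event count. */
static size_t trace_with_callback(struct tracer *t, event_cb cb, void *ctx)
{
    assert(t && cb);
    struct event ev;
    size_t n = 0;
    while (trace_resume(t, &ev)) {
        cb(&ev, ctx);      /* the child stays blocked until this returns */
        n++;
    }
    return n;
}

/* Example callback: count write events into an int supplied via ctx. */
static void count_writes(const struct event *ev, void *ctx)
{
    if (ev->mode == 'w')
        ++*(int *)ctx;
}
```

Going the other way, from a callback API to a resume API, needs somewhere to park the callback while the caller regains control, which is where the extra thread mentioned above comes from.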

droundy commented 8 years ago

I agree that strace is a wonderful debugging tool, and I truly pity platforms that don't have it, but I don't think this API is designed as a debugging API. As a library, I think the primary concern should be the ease of correct use by its intended audience. Specifically, in the debugging case you probably want to use an executable tool to trace, and probably want the trace sent directly to stderr or stdout, where it can be correlated with the debug printfs you are already making.

The resume call wouldn't require a cancel call, because the function would return when it encounters a system call of interest. I have no particular interest in the blocking API, except in making it feasible and minimally intrusive to the library. It will require an extra thread in the library. The resume option would make it possible for a caller to use the API without locking, which seems like good API design to me: put the necessary tricky stuff into the library rather than into every caller.

droundy commented 8 years ago

It has now occurred to me that we have an additional challenge with the blocking API, which is that it can be blocked on multiple system calls simultaneously. Perhaps the API should just serialize them so that one system call is handled by the caller at a time. But it does add a little more excitement that I hadn't anticipated.

jacereda commented 8 years ago

I don't understand why we need a thread to implement the blocking stuff. Isn't a callback enough? The installed callback would attempt to generate the missing file and signal the traced process via a semaphore when done, at which point it would resume the interrupted file operation. Am I missing something?

ndmitchell commented 8 years ago

I think a callback is simpler. It's a C API - I expect it to segfault if I upset it, so I don't see any problem with multiple simultaneous callbacks. I had envisaged pretty much what @jacereda seems to have been thinking of.

Regarding strace, if you wanted to write something like that on top of fsatrace (and I really believe you do!) then you'd just use the callback API, and thus get events in order. I still suspect that having a single buffer is easier than having one buffer per type, as it allows us to add new "codes" without changing the API (e.g. readdir in #12). Given a single buffer, and given that they have to be in some order, surely the order they were generated makes most sense? The information will not be harmful, and occasionally useful.

droundy commented 8 years ago

Okay, callback is fine for the blocking API.

I agree that future-proofing the API is wise in general. However, the codes do need to be semantic rather than having a 1-1 relationship with system calls if the API is to be useful, and in most cases I expect that if something was omitted, then our correct response is to add that to the existing codes. To use your example, if we omitted readdir, then adding it as an additional code is problematic, because users of the library may have assumed that any read should count as a read. If they assume that, then there is a bug in the existing code which isn't fixed by adding an additional code. This assumption might even seem logical, since users might know that you can open a directory with open(2), and might reasonably assume that any open for reading the filesystem counts as a read. Similarly, if we failed to count execve as a read, users might be disappointed, if they assumed that all file dependencies are accounted for by the "read" output. Adding it as a separate code wouldn't fix the existing bug. In short, I think there is a good case for a need to change the API if we discover new file system events that require tracing. Of course, if you wanted to enable tracing of other events, e.g. network events, then that would be a different story entirely, but I don't see that as a direction that interests me.

Generally, my bias is in favor of making the API easy and safe to use, at the cost of extensibility, rather than the other way around. We can always introduce a new function later to add new behavior.

As I see it, there are only very few kinds of FS events to be tackled:

Writes:
- Write to file
- Creation of file
- Deletion of file
- Creation of directory
- Deletion of directory
- Modification of file or directory metadata
- Creation of symlink
- Modification of symlink
- Deletion of symlink

Reads:
- Read of symlink contents (happens when following a symlink)
- Read from file
- Read of file metadata (size, etc)
- Read from directory
- Read of directory metadata (size, etc)

Renames:
- File rename (can be viewed as deletion/creation)
- Directory rename (can be viewed as a lot of creations/deletions)

Unusual:
- Reflink (can be viewed as a read and a file creation)
- Creation of hard link (looks like a read and file creation)

Each one of these operations has to show up somewhere in our set of categories. If we don't have a special category for e.g. reflink, then we need to put it in our existing categories (once the system call shows up in a linux kernel we care about).

My preference is to define our categories in terms of causality. If an operation causes a future read from a file to change its output, then it must be a write to that file. Conversely, if a write to a file can cause an operation to change, that operation must be a read of that file. Directories are funny, in that writes to any file residing in a directory cause a read from the directory to change, but it would seem foolish to list each file operation as a write to its parent directory. This semantic distinction is why bigbro has just three categories. I wouldn't object to creating subcategories, but it is important to me (for fac) that these causality relationships be respected, which means that unless there is a shocking new development in file systems, any newly traced filesystem operations must fall in one of the existing categories. You could perhaps argue that extended attributes are an exception. I suppose we could allow introduction of new subcategories in a backwards-compatible manner, so maybe this rant is irrelevant.
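A minimal sketch of the causality rule as a lookup from operation to category mask. The `fs_op` enumeration, the two-bit `CAT_*` mask and `op_categories` are hypothetical and are not bigbro's actual categories; treating a rename purely as a write (deletion plus creation) is an assumption taken from the list of events above.

```c
#include <assert.h>

/* Hypothetical operation codes, following the list of FS events above. */
enum fs_op {
    OP_WRITE_FILE, OP_CREATE_FILE, OP_DELETE_FILE, OP_MKDIR, OP_RMDIR,
    OP_SET_METADATA, OP_SYMLINK, OP_READLINK, OP_READ_FILE, OP_STAT,
    OP_READ_DIR, OP_RENAME, OP_REFLINK, OP_HARDLINK, OP_COUNT
};

enum { CAT_READ = 1, CAT_WRITE = 2 };

/* Map an operation to the causality categories it belongs to:
   CAT_WRITE if it can change the result of a later read,
   CAT_READ if an earlier write can change its result. */
static int op_categories(enum fs_op op)
{
    assert(op >= 0 && op < OP_COUNT);
    switch (op) {
    case OP_READLINK: case OP_READ_FILE: case OP_STAT: case OP_READ_DIR:
        return CAT_READ;
    case OP_WRITE_FILE: case OP_CREATE_FILE: case OP_DELETE_FILE:
    case OP_MKDIR: case OP_RMDIR: case OP_SET_METADATA: case OP_SYMLINK:
    case OP_RENAME:                /* viewed as deletion plus creation */
        return CAT_WRITE;
    case OP_REFLINK: case OP_HARDLINK:
        return CAT_READ | CAT_WRITE;   /* read of source, creation of target */
    default:
        return 0;
    }
}
```

Under this scheme a newly traced operation is added as a new `fs_op` value but still has to map onto the existing categories, which is the backwards-compatible subcategory idea mentioned above.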

My biggest issue (other than ease of use) with the order of output being chronological is that it places a constraint on implementation, which seems unwise. Of course, there is also the issue that it requires rewriting existing code, which lazy me doesn't want to do.

jacereda commented 8 years ago

So, I guess we have now a rough idea of what we need. The bit that scares me is the OSX implementation. For instance, besides the problem of not being able to hook system binaries due to SIP, I tried yesterday to detect directory reads and failed miserably.

I can intercept opendir(), but looks like ls is using the deprecated getdirentries() + some non-interceptable open_nocancel(). If we can't find a way to hook that we should probably start considering alternatives.

AFAIK, ptrace/dtrace are out of question for system binaries due to SIP.

With that in mind, I think the most robust approach would be a FUSE-based solution. This would be a good start: http://loggedfs.cvs.sourceforge.net/viewvc/loggedfs/loggedfs/src/loggedfs.cpp?revision=1.14&view=markup

@ndmitchell was against it because it would require installing additional software and in some scenarios that might be difficult, but the current DYLD_INSERT_LIBRARIES has too many drawbacks.

Perhaps I should invest some time prototyping a FUSE-based solution to try to measure how much overhead to expect from that...

droundy commented 8 years ago

Is it really true that dtrace isn't feasible on Mac? I had been told that it could be made to work, a year back... but that may have been before SIP. The Apple page I read just now makes it sound like it only prevents writing. I can see how that would prevent hooking, which is what malware wants to do, but I don't see why it would prevent tracing. :(

Fuse is definitely an inferior solution, although it may be necessary.

jacereda commented 8 years ago

The way I got dtruss to work was to copy the binaries out of system directories and run it as root. Other than that, you can disable SIP. Neither option seems very attractive.

droundy commented 8 years ago

Could dtrace just detect when the application enters the system binaries, and then we determine from that entry point what is going to happen? I speak as a dtrace ignoramus.

ndmitchell commented 8 years ago

For order of events, it's certainly going to be observable which order they are returned to the C user, so it makes sense (to me at least) to define that order. And the only order which "makes sense" is the order of time.

For Fuse (or indeed anything) the question is whether it will work out of the box, and what configuration it requires. If it works out of the box with no configuration as a single binary/dll, that's awesome. If that's not feasible or requires a tweak, that cuts down on the number of users (every additional step does), but it's not fatal.

droundy commented 8 years ago

After looking into SIP and dtrace a bit, I wonder if making a copy of the system directories and using chroot might be a better choice than FUSE. You'd only need one copy, and could presumably save it from one invocation to another. It's an ugly hack, but it sounds like Apple is doing its best to prevent the kind of behavior we are hoping to engage in.

jacereda commented 8 years ago

I have started the FUSE implementation, looks like the performance will be acceptable. Building fsatrace itself takes 0.66 seconds untraced and 0.71 when traced.

This is the way I think it should work:

The FS daemon is launched mounting on top of the source directory.
Build commands are invoked normally, all file operations will go to this 'overlay' filesystem.
The FS daemon exposes resulting operations from a certain PID and all its children at a special path, say, .ops<PID>.
The memory associated with those .ops files will be kept in a circular buffer, so, only the operations for the N most recent top-level processes are available at a certain point.

What previously required invoking the fsatrace program would now just launch the process normally and read back its operations from its .ops file.

Do you think we'll need to track accesses out of the sources directory?

Does this sound reasonable?

jacereda commented 8 years ago

@droundy the problem is that I'm afraid Apple isn't the only system that will implement policies like those at some point, so I think going for a general solution (and FUSE is) should be better in the long run. I certainly prefer a FUSE-based solution to having two copies of all the system binaries floating around. Besides, at some point they might decide they also want to enable SIP for, say, /Applications/Xcode.app and then we'd also need to replicate that.

droundy commented 8 years ago

There are a couple of problems with FUSE, although it is what tup uses, so obviously it is possible to use it.

The first is permissions. Mounting a FUSE filesystem requires special permissions. Typically on Linux this involves the user being in a fuse group, which is checked by an suid root binary. This gives several challenging failure modes. Obviously it means your user needs to be in that group. Maybe on Mac OS that is always guaranteed? Secondly, suid root binaries need to be permitted. For me this was problematic because I use NFS with rootsquash, which meant that any directory that was not world readable could not be used with tup. And the error messages were of course far from clear.

It is definitely nicer to be able to track all accesses, but that is not a show stopper, since users can run clean if they update their compiler or install a library.

The final question is whether it is possible to even define a "top level process ID" as you propose, let alone discover which one corresponds to a given process. I suppose you can define a top level process as a child of your main process?

jacereda commented 8 years ago

I think the FUSE approach is starting to look quite promising.

On Mac OS the experience is smoother. No 'fuse' group. No suid executables. I'll try to test on linux at some point, maybe there're ways to relax those requirements.

Finding the top-level pid would be something like:

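static int
calc_toplevel_pid(int pid) {
    /* Walk up the process tree until we reach a direct child of s_root. */
    int ppid = calc_ppid(pid);
    if (ppid == s_root)
        return pid;
    if (ppid <= 1)
        return 1;   /* reached init: pid is not a descendant of s_root */
    return calc_toplevel_pid(ppid);   /* recurse on the parent, not on pid */
}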
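A self-contained variant of the same walk, written iteratively and run against a made-up process table so the behaviour can be checked in isolation. The table `s_procs`, the PIDs and this `calc_ppid` are stand-ins for a real parent-PID lookup, which the thread does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical process table standing in for a real parent-PID lookup. */
struct proc { int pid, ppid; };
static const struct proc s_procs[] = {
    {100, 1},       /* build tool (the root) */
    {200, 100},     /* job A */
    {201, 200},     /* compiler spawned by job A */
    {300, 100},     /* job B */
    {400, 1},       /* unrelated process */
};
static const int s_root = 100;

/* Look up the parent of pid; returns 0 for an unknown pid. */
static int calc_ppid(int pid)
{
    for (size_t i = 0; i < sizeof s_procs / sizeof s_procs[0]; i++)
        if (s_procs[i].pid == pid)
            return s_procs[i].ppid;
    return 0;
}

/* Return the ancestor of pid (or pid itself) that is a direct child of
   s_root, or 1 if pid does not descend from s_root. */
static int calc_toplevel_pid(int pid)
{
    assert(pid > 0);
    for (;;) {
        int ppid = calc_ppid(pid);
        if (ppid == s_root)
            return pid;
        if (ppid <= 1)
            return 1;
        pid = ppid;         /* step up one level */
    }
}
```

With this table, the compiler process 201 resolves to job 200, while the unrelated process 400 resolves to 1.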

So, the problem is now telling the FS who is the root process (the build tool). It could be the process who invoked the FS daemon, whatever you tell it via command line, or whatever you write to a special file entry (say, /.root).

I'm trying to decide whether to let it run as a daemon continuously or invoke/kill it explicitly when the build starts/stops. What do you think?

Also, no code is shared with fsatrace/bigbro, so I'm trying to figure out a good name for the new project. Any suggestion?

droundy commented 8 years ago

I think you would want to invoke and kill the fuse when the build starts or stops. This would require a new API, since most builders will want to run multiple jobs simultaneously. Tup's approach is to put the fuse mount in a special subdirectory, so there is one mount per job that is running (thus avoiding the need to track top-level PIDs), but that breaks quite a number of build tools, so it's not optimal. Tup gets around that by using chroot if tup itself is suid root, but of course that is another security rabbit hole.

I would certainly not recommend using fuse on linux. I'm pretty certain that there is no way around the security issues.

I'm not sure whether we'll end up with one cross-platform library or not, but adding yet another project for another platform seems a bit silly. I would just add the code to fsatrace if I were you. Shared code isn't particularly important, way less important than a shared API. I've been looking at adding some fsatrace code (with appropriate copyright headers) into bigbro, to start the port to windows. So far it can run a process sans tracing, which isn't much, but is something.

jacereda commented 8 years ago

The problem is that embedding it in fsatrace is just too much work. It would require implementing the C API and I don't see a clear benefit. The FUSE approach could work without any API at all, since you only need read()/write() to communicate with it.
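As a rough sketch of what "only read()/write()" could look like from the build tool's side: the function below reads a per-process operation log through plain POSIX calls. The .ops/<pid> layout is an assumption based on the traced-fs test commands in this thread, and read_ops is a made-up name, not part of any existing API.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the operation log for `pid` from under `mount` into buf.
   The ".ops/<pid>" layout is an assumption, see the note above.
   Returns the number of bytes read, or -1 on error. */
static ssize_t read_ops(const char *mount, long pid, char *buf, size_t cap) {
    assert(mount && buf);
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/.ops/%ld", mount, pid);
    if (n < 0 || (size_t)n >= sizeof path || cap == 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    size_t used = 0;
    while (used < cap - 1) {
        ssize_t got = read(fd, buf + used, cap - 1 - used);
        if (got < 0) { close(fd); return -1; }
        if (got == 0) break;           /* end of log */
        used += (size_t)got;
    }
    close(fd);
    buf[used] = '\0';                  /* terminate so it can be printed */
    return (ssize_t)used;
}
```

The point is that the caller needs nothing beyond open/read/close, so no library would have to be linked into the build tool.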

AFAIK it would even work on Windows via https://github.com/dokan-dev/dokany and would be far more robust across platforms.

jacereda commented 8 years ago

I've set up a new repo at https://github.com/jacereda/traced-fs

Should compile on Linux and Mac OS so far.

ndmitchell commented 8 years ago

To somewhat echo @droundy's point, I don't really care what the underlying mechanism is, but I do want something that presents the same interface on all platforms (so I can use it abstractly while developing the upstream code on only one platform). My experience with fsatrace on Linux is that it requires privileges that mean it can't be tested on Travis, which for me means I can't effectively test it. My guess is that on Windows such mechanisms are pretty scary, because usually when such Linux things are shoehorned in they tend to be, but I trust other people to make the call here.

jacereda commented 8 years ago

Looks like fuse filesystems can be tested on Travis:

https://github.com/mpl/camlistore/blob/master/.travis.yml

As for Windows, I'm trying to set up a VM to figure out how it goes there.

droundy commented 8 years ago

A quick look through the dokany documentation suggests that you can't mount a dokany file system at arbitrary mount points, like you can with fuse, which seems likely to be highly problematic. But maybe there is a way around that.

jacereda commented 8 years ago

I don't think it would be problematic in my scenarios. To trace an operation, traced-fs would mount a T: drive that mirrors C: and the build system would need to switch to that unit prior to launching the commands. It could certainly be a bit more painful if the build requires files from different drives.
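To make the drive switch concrete, here is a small sketch of the path rewriting a build system would have to do before launching a command. The function name is made up, and the drive letters are just the ones from the example above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Rewrite e.g. "C:\src\x.c" to "T:\src\x.c".
   Returns 0 on success, -1 if path is not on drive `from`
   or the result does not fit in `out`. */
static int remap_drive(const char *path, char from, char to,
                       char *out, size_t cap) {
    assert(path && out);
    if (!isalpha((unsigned char)path[0]) || path[1] != ':')
        return -1;                               /* not drive-qualified */
    if (toupper((unsigned char)path[0]) != toupper((unsigned char)from))
        return -1;                               /* lives on another drive */
    int n = snprintf(out, cap, "%c:%s", to, path + 2);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

A path on another drive is rejected rather than rewritten, which is exactly the case where a build pulling files from several drives would need more than one mirror.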

droundy commented 8 years ago

I see. The only issue I see with that is that it is likely to break debugging tools that record the path to the source code files. You could, of course, keep the T: drive semi-permanently mounted, but that seems like it could be a bit more of a pain.

jacereda commented 8 years ago

Well, maybe having it permanently mounted is desirable, keeping in mind that mounting and unmounting will probably hurt caching.

jacereda commented 8 years ago

@droundy Could you make a quick test with your NFS setup? Something like this would suffice:

make
./fs &
ls -l traced/<absolute-path-to-some-file-in-your-nfs-volume> &
cat traced/.ops/$!

droundy commented 8 years ago

$ cat traced/opt/$!
cat: traced/opt/23228: No such file or directory

On Fri, Jun 3, 2016 at 11:18 AM jacereda notifications@github.com wrote:

@droundy Could you make a quick test with your NFS setup? Something like this would suffice:

make ./fs & ls -l traced/ ls -l traced/.ops & cat traced/.ops/$!

jacereda commented 8 years ago

Sorry, I edited the command sequence afterwards, can you recheck?

droundy commented 8 years ago

Still getting no such file or directory.

jacereda commented 8 years ago

Try this:

killall fs
./fs &
ls -l traced/<absolute-path-to-some-file-in-your-nfs-volume> &
cat traced/.ops/$!

The killall will fail if some process is running inside the traced directory, so make sure you don't have a bash running inside. Note that it's .ops, not opt. Also, make sure the ls is executed in the background (&).

droundy commented 8 years ago

bennet:traced-fs$ killall fs
bennet:traced-fs$ ./fs &
[2] 11067
[1] Done ./fs
bennet:traced-fs$ ls -l traced/home/droundy/.tmp/traced-fs/ &
[3] 11071
bennet:traced-fs$ total 100
-rwxr-xr-x 1 droundy users 24168 Jun 3 16:22 fs
-rw-r--r-- 1 droundy users 23377 Jun 3 16:21 fs.c
-rwxr-xr-x 1 droundy users 44920 Jun 3 16:22 fsd
-rw-r--r-- 1 droundy users 209 Jun 3 16:21 Makefile
drwxr-xr-x 24 root root 4096 May 24 10:35 traced

jacereda commented 8 years ago

Good, seems to work properly. Thanks.

jacereda commented 8 years ago

After fixing a bug in utimens handling, I can trace a stack build. That was failing miserably with fsatrace.

jacereda commented 8 years ago

If this happens at some point I guess it could be an alternative to dokany:

https://wpdev.uservoice.com/forums/266908-command-prompt-console-bash-on-ubuntu-on-windo/suggestions/13522845-add-fuse-filesystem-in-userspace-support-in-wsl

jacereda commented 7 years ago

I've been reconsidering the traced-fs implementation for Windows. I have a prototype using dokany but I think it would just be easier and more stable to write a minifilter driver.

Having to install a driver sucks, but dokany also installs a driver, so it would be at the same "suckiness" level.

jacereda / fsatrace

Add C API #14

I agree that the order of operations is potentially important. I don't think it's that big a deal to the API though - just have struct entry {int mode; char* data;} and have a single entry** pile for all the things that changed.
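A minimal sketch of that shape, assuming a NULL-terminated array of entry pointers kept in the order the operations happened. The mode names are placeholders, not taken from fsatrace or bigbro. Counting per mode shows that a bucketed view can be derived from the single ordered pile when someone wants it.

```c
#include <assert.h>
#include <stddef.h>

/* Operation kinds; placeholder names for illustration only. */
enum entry_mode { ENTRY_READ, ENTRY_WRITE, ENTRY_STAT, ENTRY_MODE_COUNT };

struct entry {
    int mode;    /* one of enum entry_mode */
    char *data;  /* path the operation touched */
};

/* Count entries per mode. `entries` is a NULL-terminated array of
   pointers, in the order the operations happened. */
static void count_by_mode(struct entry **entries,
                          size_t counts[ENTRY_MODE_COUNT]) {
    assert(entries && counts);
    for (size_t m = 0; m < ENTRY_MODE_COUNT; m++)
        counts[m] = 0;
    for (struct entry **e = entries; *e != NULL; e++)
        if ((*e)->mode >= 0 && (*e)->mode < ENTRY_MODE_COUNT)
            counts[(*e)->mode]++;
}
```

The ordered array stays the primary representation; anything grouped by type is computed from it and loses nothing.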

Knowing the difference between what happens on stdout and stderr is quite important. Why can't you just inherit stdout and stderr? Then people can redirect stderr/out already if they care?

Environment variables are important, but just an extra argument, nothing severe.

Blocking is very useful for me, but I guess that's a separate API, one which takes a function that is given a single entry at a time instead of returning a list of entries?
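Roughly what I have in mind, as a sketch only. All names here are hypothetical, and letting the callback's return value refuse an operation is my assumption about what blocking would be used for. deliver_entries is a stand-in for the tracer: it replays already-recorded entries instead of intercepting a live process.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    int mode;    /* kind of operation; values are up to the API */
    char *data;  /* path the operation touched */
};

/* Called once per operation; return nonzero to allow it, 0 to refuse. */
typedef int (*entry_callback)(const struct entry *e, void *ctx);

/* Stand-in for the tracer: feeds entries to the callback one at a time
   and stops at the first refusal. Returns how many were allowed. */
static size_t deliver_entries(const struct entry *entries, size_t n,
                              entry_callback cb, void *ctx) {
    assert(cb != NULL);
    size_t allowed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!cb(&entries[i], ctx))
            break;              /* caller vetoed this operation */
        allowed++;
    }
    return allowed;
}

/* Example policy: refuse writes (mode 1 here) and count what was seen. */
static int deny_writes(const struct entry *e, void *ctx) {
    size_t *seen = ctx;
    (*seen)++;
    return e->mode != 1;
}
```

In a real tracer the traced process would be suspended while the callback runs, which is where the cross-process signalling cost comes from.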