guard / rb-fsevent

FSEvents API with signals handled (without RubyCocoa)
https://rubygems.org/gems/rb-fsevent
MIT License
245 stars 37 forks source link

migrate to using FFI and consolidate under the Guard/Listen banner #71

Open ttilley opened 7 years ago

ttilley commented 7 years ago

It's somewhat awkward to completely outline a plan for your own library in a comment under somebody else's, but apparently that's how I roll. :man_shrugging:

My stream of consciousness for this is outlined in guard/rb-inotify#66 at https://github.com/guard/rb-inotify/issues/66#issuecomment-303563777

ttilley commented 7 years ago

special ping to @thibaudgg as I would like your feedback to that comment specifically before proceeding.

EDIT: I need your input on this before I can do anything at all. Like I mentioned in the referenced comment, I come and go from this project in an unpredictable manner. This makes it less than wise for me to have the final say in long-term vision. The comment outlines my best advice but you need to have the final say. @ioquatix appears to be overqualified to have an opinion here if you'd rather defer to him instead.

ioquatix commented 7 years ago

I'm also a mac developer with C/C++ experience so let me know if I can help.

Also, I think initially, FFI was only a suggestion, not a hard requirement. In the end - it's really what makes it easier and better - the internal design is up to you.

If it makes sense (and less breaking) to just keep using the binary approach, I'm okay with that.

Is it possible to use FFI to access the appropriate APIs without needing a compiled library?

ttilley commented 7 years ago

rb-fsevent has been plagued on and off with tricky bugs involving the passing of signals to the fsevent_watch subprocess. At one point a change in the behavior of guard itself necessitated rewriting signal handling in rb-fsevent to not break things.

There is also a bug where sometimes, for no sane reason, the fsevent_watch process is hanging around after the parent process is long gone. For people who experience this bug and also never turn off their machines (instead putting them to sleep)... eventually the lower level fsevents service will start rejecting requests to watch the filesystem. And I don't mean just from rb-fsevent. I am unable to reproduce this bug and thus have little motivation to dig deeper.

The communication format between parent and child process isn't nearly as well thought out as I thought it was at first. I had a plan at one point to move to otnetstring (an optimized variant of tagged netstrings that makes much more sense to me and is very easy to work with), but then I got distracted by other life events. Anyways, there is also a bug where if your file path looks sufficiently similar to the syntax I use for passing messages, then those events never make it to the library user.

All of the above bugs go away if I'm not using a subprocess. The biggest concern for me is the horrible horrible things we have to do to work around an HFS+ caching bug in 10.7-10.11 and having that occur in the main ruby process rather than isolated in the subprocess.

It is not, however, possible to use FFI to access the appropriate APIs without needing a compiled library. It is also not possible to perform the necessary workarounds without a compiled library. I don't see why this is an issue because I've been compiling (and signing with my personal Developer ID certificate) fsevent_watch for users ahead of time for the entirety of my contribution to this library and provide rake tasks to recompile if you're running an older system than i'm able to target (the underlying code still supports 10.6 if I was too lazy to clean it out and let me tell you... I'm lazy as fuck when I can get away with it).

ttilley commented 7 years ago

Before offering assistance, you should look at the depth of the cthulu insanity level BS happening here: https://github.com/thibaudgg/rb-fsevent/blob/master/ext/fsevent_watch/FSEventsFix.c ;)

Ignore the initial comment block, it's from an earlier attempt and not in any way accurate. The custom realpath uses CoreFoundation.

...actually, you might want to read that code anyways just out of perverse curiosity.

ioquatix commented 7 years ago

Uh, is this for real?

Like, how does apple even use their own API if it's that broken?

ttilley commented 7 years ago

It's a VERY tricky bug, and sometimes just looking at it the wrong way makes it disappear. Describing it as an HFS+ caching bug is actually a bit simplistic as it's a series of bugs at multiple levels that have to happen at the same time for the issue to occur. I wouldn't go so far as to say it results in filesystem corruption... but it does result in completely faulty metadata in the filesystem so it's pretty close.

The bug does not exist in 10.6, and was fixed in 10.12 after years of back and forth with Apple. It took that long to devise a way to reliably reproduce it and test for it... and the tech was absolutely horrified to discover that the problem existed somewhere on the filesystem of almost every machine they tested.

ttilley commented 7 years ago

lol @thibaudgg... I need more than a thumbs up emoji man. 😆

ttilley commented 7 years ago

Another thing to think about, regarding that wiki page... There are details we were very politely asked not to disclose. There's no gag order or NDA, but you most certainly don't bite the hand that feeds you. I believe that describing it as a metadata issue is probably sufficient... with a footnote that the kernel, realpath(), and various other APIs (both POSIX and Apple) would all disagree about the name of a path and that this is the root of the problem. The kernel would report an event using a path that the userspace was not looking for, so the event wouldn't propagate unless the software was explicitly watching all paths (like Apple's Time Machine backup software, which miraculously continued to work despite the bug, but would result in backups that disagree with the original regarding the case/name of any number of files and directories).

ttilley commented 7 years ago

I'm also thinking --my-filesystem-is-broken is probably sufficient vinegar for enabling those hacks. It stresses the importance of upgrading to 10.12 without being a dick about it.

ioquatix commented 7 years ago

So you are saying that Time Machine is basically broken for 10.6 .. 10.11?

ttilley commented 7 years ago

Calling it broken is a bit of an exaggeration considering that the filesystem is case preserving but case insensitive. No, instead the backup would preserve an incorrect form of the path... and restoring from backup wouldn't cause any problems because, again, the filesystem is case insensitive. Nothing would stop working or work differently outside of filesystem monitoring. It was, essentially, an invisible bug that broke nothing... another reason why nobody ever noticed.

The issue existed between 10.7 and 10.11. 10.6 and 10.12 function as intended.

The bug also doesn't exist in any way whatsoever if you're using the (non-default) case sensitive version of HFS+. This is why I use case sensitive HFS+ even now... (though I have to have Steam, Photoshop, and other braindead software installed to a case insensitive disk image since they break completely if your filesystem is case sensitive... it'll be hilarious to see how they react to APFS, which is ONLY case sensitive right now, with plans to potentially add case insensitivity later)

ioquatix commented 7 years ago

I use case sensitive HFS+ on all my machines :D I refuse to install crap which doesn't work on case sensitive machines :D

thibaudgg commented 7 years ago

lol @thibaudgg... I need more than a thumbs up emoji man. 😆

Yeah, sure, sorry about that, yesterday was a bit packed for me.

I completely agree with your comment about leaving rb-fsevent untouched. My initial idea was also just to increase the gem version and add a simple post_install_message explaining the move to listen-fsevents. As you said a lot of people depends on the current state of the gem, and I see no reason either to change it. That move also gives us (essentially you!) more liberty to take drastic choices around listen-fsevents. I would personally rather have a simpler gem that only works on 10.12, uses FFI, supports only latest Ruby version (if useful for some technical reasons), and drops hackish pieces of baggage (even if you worked hard on them 💪).

If rb-fsevent and listen-fsevents gems co-exist, listen could easily use one or the other based on the macOS version and notify the user about his/her broken file systems. The new "untouched" rb-fsevent version would also inform the user about the same thing and encourage the upgrade to 10.12 and listen-fsevents.

On the technical decisions (using FFI or not, ...) around listen-fsevents it's really up to you, @ttilley, to decide. This gem is definitely yours since version ~0.4. Thank you a lot for all your hard work, it's really really appreciated 💟.

ttilley commented 7 years ago

@thibaudgg ok. I'll start out without hacks, at least... especially since the latest xcode appears to only include the 10.12 SDK anyways!

Aya:~ ttilley$ ls -l /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/
total 4
drwxr-xr-x 4 root wheel 170 May  3 18:36 MacOSX.sdk
lrwxr-xr-x 1 root wheel  10 Apr 14 20:46 MacOSX10.12.sdk -> MacOSX.sdk

...so unless I'm missing something, I'm currently unable to target older versions of MacOS if I go the pre-compiled route for the dylib files (even though the "deployment" option still exists and goes back to 10.6, this is misleading without an older SDK available).

10.12 only it is I guess.

ioquatix commented 7 years ago

@ttilley Replying specifically to your comment about SDK versions.

I ran into the same problem when compiling things using Teapot.

I settled on using the "MacOSX.sdk" rather than a versioned one.

That SDK supports older versions of Mac OS X, you just need to run clang++ -mmacosx-version-min=10.6 ... and it will ensure that you don't use parts of the SDK that are not available to you.

ttilley commented 7 years ago

...which gives me even more reason to maintain rb-fsevent, as at least one of the two should contain warning about filesystem metadata corruption. I'll just have to use TravisCI or something to build those binaries. Right now we silently work around the issue without ever informing the user there might be a problem. That's not at all the right approach, in hindsight, and the decision to do so was made hastily in the excitement of finally finding a solution.

I'll get properly started on friday evening or over the weekend and build from scratch to get rid of the cruft not necessary on 10.12. It'll be liberating not to care about compatibility since 10.12 works beautifully and the API has been stable for more than one major OS release at this point. Improvements have also been made to event coalescing as well so as to avoid overloading an app using file level events (initially, just renaming a file with file level events could fire as many as 10 or more separate events). I have already stated my reasons for wanting to defer to @thibaudgg on this decision and stick by them no matter what he says about my contribution to the project. 😝

I do have a lady friend I'm hoping to have over friday afternoon for my famous homemade calzones with my own personal potato dough recipe, glorious bacon, and a mixture of bell and banana peppers fried in the bacon grease. So absolutely nothing before late evening... I hope. Wish me luck! :wink:

ttilley commented 7 years ago

@ioquatix it'll ensure you don't use parts of the SDK not available to you, but if you pre-compile (which has been my strategy thus far) it still won't run on older releases.

...unless something has changed since I last cared enough to pay attention.

ioquatix commented 7 years ago

I was under the impression if you specify that flag, it will run on older releases of mac os x. But, perhaps there is another flag you need to specify. I'd need to look it up. In any case, we can test this.

ttilley commented 7 years ago

@ioquatix - I'm thankful to have you on the job, and truly look forward to working together. #brofist

ioquatix commented 7 years ago

https://www.youtube.com/watch?v=5pEPpNpbnCI

ioquatix commented 7 years ago

So cheesy 🧀

ttilley commented 7 years ago

-_-

ioquatix commented 7 years ago

I don't mean #brofist is cheesy - the lyrics of the video are cheesy - I'm totally down with #brofists :D

ttilley commented 7 years ago

I've sent some casual thoughts to @ioquatix. This ticket will be updated if they become more than just fun ideas.

ttilley commented 7 years ago

...is there a sane way for FFI to handle C blocks as callbacks rather than function pointers?

thibaudgg commented 7 years ago

@ttilley no idea I never used FFI, sorry that I cannot help you more.

ttilley commented 7 years ago

If you decide to install 10.13 to play with APFS properly... don't. Older machines can't boot from an APFS root even though there's an EFI driver for it embedded into the filesystem (slightly overkill?) and if you repartition to have an HFS+ root and APFS /Users and etc (you can have multiple volumes on the same filesystem) it completely breaks both migration assistant and time machine.

Oh yeah, and 10.13 doesn't wake up from sleep.

thibaudgg commented 7 years ago

@ttilley thanks for the info, I'll wait a little bit longer then. 👍

ttilley commented 7 years ago

I have no idea what I'm doing and providing a C callback via FFI to swift code is painful as fuck. Even if it was ObjC or C code, it's still kinda painful.

SO I'm going to procrastinate further and install the 10.13 beta2 to play with because that's how I roll.

At least I fixed most of the existing rb-fsevent bugs and released v0.10.2. It has most of the benefits of an FFI release, except for ignoring (or marking) events caused by the current process and properly handling when the watched folder has its name (or a parent directory's name) changed. And it seems that the issue with processes lasting longer than they should has gone away after the rebuild... and the change of communication format fixed the edge cases with file names with unexpected characters being ignored (as well as some other issues). Pretty much everything I'm aware of is fixed, and the API has been enriched to allow for additional possibilities previously not possible (like being aware of a file's type ahead of time via metadata without having to check the filesystem).

@ioquatix - with the 10.12 SDK what I thought of previously regarding compatibility is completely false. You CAN build against the 10.12 SDK and have it loosely link or use new APIs. This wasn't the case originally. I don't know exactly when the binary compatibility change was made. It HAD to be after garbage collection support was dropped completely though, not just when it was deprecated. :shrug:

ttilley commented 7 years ago

aside: it might be easier to rewrite and extend the existing fsevent_watch C code to be usable as a library, and thus from FFI, rather than writing anything new.

ms-ati commented 6 years ago

Hi all, thanks!!! for all the fantastic work here.

I found my way over here from https://github.com/codekitchen/dinghy/issues/269, since rb-fsevent is used by dinghy to proxy file event change events into the VirtualBox VM in which Docker is running.

And since we have a candidate fix already for unfs3 to use nanosecond timestamps on 10.13 APFS, I was wondering if you thought that rb-fsevent will also need an update for nanosecond timestamps there?

Thanks again!

kevintom commented 3 years ago

this is a very old thread but regarding the discussion around dropping historic support on older (read: broken) versions of macOS,

it seems like the M1 transition is a very good/easy place to make the jump.

All M1/arm versions of the code require at least version 11, so that seems like a fairly easy safe way to make the drastic changes and not worry about historic compatibility since you will be sure anyone using an M1 won't be an older version.

even if you decide to leave legacy Intel support in, you could make a fairly drastic change to the binary on the M1 version and just lipo the two together such that what gets distributed works consistently on historic Intel systems but the new M1 stuff would not be bound by the same constraints. (see this PR for how I lipo two binaries together https://github.com/thibaudgg/rb-fsevent/pull/88)

thibaudgg commented 3 years ago

I merged https://github.com/thibaudgg/rb-fsevent/pull/88 in (thanks @kevintom!), I'm still open to move this gem under the Guard organisation alongside the listen gem if that helps to centralize community effort as I don't have much time anymore to work on it. /cc @rymai

ioquatix commented 3 years ago

Yes please do.

thibaudgg commented 3 years ago

@ioquatix done, the gem is now hosted under the Guard organisation: https://github.com/guard/rb-fsevent

ioquatix commented 3 years ago

Thanks!