ndmitchell / shake

Shake build system
http://shakebuild.com
Other
772 stars 118 forks source link

intermediate selector between childOf and descendantOf #355

Open ezyang opened 8 years ago

ezyang commented 8 years ago

My project has an &%> (AND pattern) rule. Internally, it seems Shakes decomposes this into several rules, e.g. if I have ["*.o", "*.hi"], I get a rule "Foo.hi Foo.o" as well as a rule "Foo.hi" and "Foo.o" (which are the immediate children.)

OK, so I'm in the rule table view, and I want to look at the direct dependencies of the slowest rule, which happens to be "Foo.hi Foo.o". I say childOf(slowestRule())... and I get a very unhelpful table with "Foo.hi" and "Foo.o". No no no, I don't want that.

But there's no convenient way, in the current selector language, to specify this.

I'm not sure using the selectors to build booleans is the right model. I'd prefer something more functional/relational, where what is returned is the set of rules you've selected.

ndmitchell commented 8 years ago

So the decomposition is hopefully going away at some point: https://github.com/ndmitchell/shake/issues/192#issuecomment-88927977

Can you describe a hypothetical syntax for what you might want to write? The current system is powerful and confusing, something powerful and sensible seems good.

ndmitchell commented 8 years ago

And for having child of child, there's not much scope there in the current implementation, I figure out dependencies and transitive dependencies early on at precomputation time, so doing 2 children away is expensive.

ezyang commented 8 years ago

Well, if decomposition goes away, then my problem will be solved.

As for the language, I won't claim to know what's right for Shake. The current language has a lot of resemblance to Perl, in that it relies on implicit state (the name of the rule, the group, the colors). This means things can be kind of inscrutable... but you can write queries quickly and compactly.

What I meant by a "functional" interface was to say, "The full set of rows is the return of table(); you can filter(p, set), you can map(f), etc." Maybe even make it point-free so you don't have to refer to table() explicitly. Example: instead of, name(/(\.[a-zA-Z0-9_]+)$/, "???"), you would have map(group(/(\.[a-zA-Z0-9_]+)$/, "???"), table()), which means, "For each row in table(), replace its name with the first group of the regular expression \.[a-zA-Z0-9_]+)$/, or ??? if it doesn't match. The expression is literally what you run to then get something to give to the plotter/table writer. Obviously these will be longer.

If you want to support generalized queries, like two children away, then you probably should embed something like Datalog. Then it would be very easy to express a query like, "parent of the parent". It would even be halfway efficient too!

ndmitchell commented 8 years ago

Hmm. Embedding Datalog seems like too much effort, unless there's a pre-rolled Javascript library that does most of it for me. For the group/map/filter thing I wonder if it's just a case of returning a boolean and joining with && instead of returning a Maybe String and composing with something like >>=. If so, perhaps having something isomorphic but more functional/less-state might get me some of the way there. I'm still trying to figure out what the true C J Date style relational-algebra formulation would be - I'll talk to our experts at work.