timbertson / gup

A better make, inspired by djb's redo.
GNU Lesser General Public License v2.1

Named pattern in Gupfile #2

Open np opened 10 years ago

np commented 10 years ago

Concretely, I would like to name parts of the patterns in the Gupfile. These variables should then be accessible from the build script, for instance through environment variables.

Here is a mockup example:

build-html:
  %(base:*).html
build-archive:
  %(dir:*)/%(base:*).%(ext:tar.gz)
  %(dir:*)/%(base:*).%(ext:tar.xz)

If one builds foo/bar.html, in build-html one has access to GUP_base=foo/bar. If one builds foo/bar.tar.gz, in build-archive one has access to GUP_dir=foo, GUP_base=bar, and GUP_ext=tar.gz.
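To make the mockup concrete, here is a rough Python sketch of how a `%(name:glob)` pattern could be turned into a regex with named groups and then into `GUP_`-prefixed variables. Nothing here is existing gup behaviour: `compile_pattern` and `capture_env` are hypothetical helpers, and the sketch assumes `*` matches any run of characters except `/`. Under that assumption `%(base:*).html` matches `bar.html` but not `foo/bar.html`, so how the `GUP_base=foo/bar` case should resolve is left open.

```python
import re

# Hypothetical syntax from the mockup: %(name:glob), where glob may contain "*".
_CAPTURE = re.compile(r"%\((?P<name>\w+):(?P<glob>[^)]*)\)")


def _glob_to_regex(glob):
    # Assumption: "*" matches any run of characters except "/".
    return "[^/]*".join(re.escape(part) for part in glob.split("*"))


def compile_pattern(pattern):
    """Turn a mockup pattern into a compiled regex with named groups."""
    out = []
    pos = 0
    for m in _CAPTURE.finditer(pattern):
        # Literal text (possibly with plain "*") before this capture.
        out.append(_glob_to_regex(pattern[pos:m.start()]))
        # The capture itself becomes a Python named group.
        out.append("(?P<%s>%s)" % (m.group("name"), _glob_to_regex(m.group("glob"))))
        pos = m.end()
    out.append(_glob_to_regex(pattern[pos:]))
    return re.compile("".join(out) + r"\Z")


def capture_env(pattern, target, prefix="GUP_"):
    """Return the variables a build script would see, or None if no match."""
    m = compile_pattern(pattern).match(target)
    if m is None:
        return None
    return {prefix + name: value for name, value in m.groupdict().items()}
```

For example, `capture_env("%(dir:*)/%(base:*).%(ext:tar.gz)", "foo/bar.tar.gz")` yields `GUP_dir=foo`, `GUP_base=bar` and `GUP_ext=tar.gz`, as in the description above.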

What do you think?

timbertson commented 10 years ago

I like the idea, certainly. I'm worried about syntax though. I want to keep the gupfile syntax to a minimum.

I'm wondering if gupfiles will also want regex syntax, in which case we could perhaps combine the two (while giving even more control over what gets matched). Since there's no good reason to start a path with / (they should all be relative), a leading / could mark a regex, and we could use python's named capture groups for this:

build-html:
  /(?P<base>[^/]+)\.html
build-archive:
  /(?P<dir>[^/]+)/(?P<base>[^/.]+)\.(?P<ext>tar\.gz)

It's much less readable, but it is a well-established syntax (and also handles the case of more specific wildcards than * and **). I've already wanted support for matching restricted patterns like step[0-9]+.html.
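As a quick check of what those groups would capture, here is a small sketch using Python's `re` module directly. The leading `/` (the proposed regex marker) is stripped by hand, and anchoring the whole target with `fullmatch` is an assumption, as is the helper name `named_parts`.

```python
import re

# Patterns copied from the example above, minus the leading "/" marker.
BUILD_HTML = re.compile(r"(?P<base>[^/]+)\.html")
BUILD_ARCHIVE = re.compile(r"(?P<dir>[^/]+)/(?P<base>[^/.]+)\.(?P<ext>tar\.gz)")


def named_parts(regex, target):
    """Return the named groups if the whole target matches, else None."""
    m = regex.fullmatch(target)
    return m.groupdict() if m else None


archive = named_parts(BUILD_ARCHIVE, "foo/bar.tar.gz")  # dir=foo, base=bar, ext=tar.gz
html = named_parts(BUILD_HTML, "bar.html")              # base=bar
nested = named_parts(BUILD_HTML, "foo/bar.html")        # None: [^/]+ cannot cross "/"
```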

It's kind of a pain if you just want to capture some part of the path and don't want to convert the rest of your pattern to a regex, but at the same time I don't want to invent non-trivial rules about what needs escaping in the existing (plain) path syntax. WDYT, is forcing regexes reasonable for this use case?

np commented 10 years ago

I think we need two modes: something "stupid simple" for the normal cases, and named captures for the hard cases.

The simplest case would be %(var), or just % for a special default variable. They should capture as much as * does, since that is the sane default behavior.

timbertson commented 10 years ago

But I don't think named capture is sufficient - I think regexp mode is needed as well (e.g. to match step1.html, step2.html and step3.html but not steps.html). So the question is: do we add a third mode to deal with named capture groups, or just roll it into the regexp mode?
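The step example can be illustrated in a few lines of Python. This is only a sketch: `fnmatch` stands in for glob matching and is not how gup matches patterns.

```python
import fnmatch
import re

targets = ["step1.html", "step2.html", "step3.html", "steps.html"]

# A glob cannot express "digits only": step*.html also accepts steps.html.
glob_matched = [t for t in targets if fnmatch.fnmatchcase(t, "step*.html")]

# A regex can restrict the wildcard to one or more digits.
STEP = re.compile(r"step[0-9]+\.html")
regex_matched = [t for t in targets if STEP.fullmatch(t)]
```

Here `glob_matched` contains all four targets, while `regex_matched` leaves out `steps.html`.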

np commented 10 years ago

Yes, what I meant is: I think we need the full regexp mode. However, since it is going to be ugly for the common cases, I suggest adding % and %(var) as part of the default "globbing" mode.

timbertson commented 10 years ago

ugh, damn - sorry about the noise, I got the issue number wrong. The above commit fixes #4, not this issue.

xparq commented 8 years ago

+1 for simplified globbing syntax for the most common cases. But I'm also worried about the extra syntax clutter... Please come up with an ingeniously intuitive, light yet sufficiently general approach which can handle all the cases! ;)

timbertson commented 8 years ago

I'm thinking there should be a per-line pattern mode flag. It doesn't matter if it's a bit wordy, since you shouldn't need to write them that often. Something like:

builder.sh:
  /lit foobar
  /glob foo//
  /re ^foo/bar/baz
  /group %(base:*).html

I also like =lit, =re etc. A leading / is good because it's not going to be present in a filename, but a leading ":" or "=" should be almost as rare (and they both work better as a mnemonic).
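A rough sketch of how such a per-line flag could be dispatched, in Python. Everything here is hypothetical: `compile_line` is an illustrative name, unflagged lines are assumed to default to glob mode, `fnmatch` is only a stand-in for gup's own glob rules (its `*` also matches `/`), and `/group` is left unimplemented.

```python
import fnmatch
import re


def compile_line(line):
    """Return a predicate for one pattern line such as '/re ^foo/bar/baz'."""
    if line.startswith("/"):
        # Split "/mode pattern" into the flag and the rest of the line.
        flag, _, pattern = line.partition(" ")
        mode = flag[1:]
    else:
        # Assumed default: a line without a flag is a glob.
        mode, pattern = "glob", line

    if mode == "lit":
        return lambda target: target == pattern
    if mode == "glob":
        return lambda target: fnmatch.fnmatchcase(target, pattern)
    if mode == "re":
        regex = re.compile(pattern)
        return lambda target: regex.search(target) is not None
    raise ValueError("unknown pattern mode: %r" % mode)
```

For instance, `compile_line("/lit foobar")` accepts only the exact target `foobar`, while `compile_line("/re ^foo/bar/baz")` accepts any target starting with `foo/bar/baz`.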

Thoughts?

xparq commented 8 years ago

I'll need to actually start using gup first to have a feeling for that... :) Till then, please verify that no such light syntax exists that can handle the different cases without an explicit prefix. I'd be a bit reluctant to add those new flag "keywords" to the language.