Support for case transformations in placeholders

tommcdo commented 10 years ago

In some of my projects, I have lowercase filenames mapped to mixed case class names. So, for example, the class Foo_Bar is defined in classes/foo/bar.php.

In addition to the placeholders like %u, it would be nice to have a corresponding placeholder like %U that would transform foo/bar into Foo_Bar.

tpope commented 10 years ago

Let's use this issue as a place to start collecting real world transformation requirements. I don't want to dig the current hole too deep as I suspect there are more variants than the current one character expansion system can handle.

qstrahl commented 10 years ago

I think a good start would be getting all the cases handled in abolish's coercion -- preferably with the relevant characters matching up.

tpope commented 10 years ago

Emphasis on real world.

tommcdo commented 10 years ago

Maybe some kind of modifier to the existing ones would make more sense. The opposite case to my example may be necessary, so Foo/Bar needs to be foo_bar -- in this scenario, %u doesn't seem right, as %u should be the un-case-transformed variety.

I can see 3 case-transformation modifiers, as @qstrahl sort of mentioned: lower case, UPPER CASE, and Mixed Case. I'm having a hard time coming up with 3 modifier characters that seem related and intuitive, but something like %mu could mean Mixed_Case, where %Cd could mean UPPER.CASE. Are letters good modifiers?

tommcdo commented 10 years ago

Head and tail may also be necessary. Real-world example: PHP namespace and classname. So foo/bar/baz may be for code like this:

namespace foo\bar;
class Baz { ... }

(And this also leads to converting / to \. Maybe a generic "change / to {char}" is needed?)

tpope commented 10 years ago

Java: / to . (but would we actually need to do this?)
Clojure: namespaces change / to . and _ to -.
Ruby: MixedCase, ::. / to - is a gem naming convention but I can't think of a reason projectile would care.
Perl: :: but no case transformation, as I recall.
Rails: pluralization, singularization. / to _ for table names. Plus the ruby stuff, but I can't recall needing to stack those with these.
Cucumber: _ to space, capitalize first letter for feature title.
VimL: / to _ (for include guards) and to # (for function names). Camelization would be helpful for GetFooIndent()
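A few of the conventions listed above, sketched in Python purely for illustration. The function names are hypothetical and nothing here is part of the plugin; the point is only that each one is a small string rewrite of the matched path.

```python
def clojure_namespace(path):
    # Clojure: "/" becomes "." and "_" becomes "-".
    return path.replace("/", ".").replace("_", "-")


def ruby_constant(path):
    # Ruby: each path component becomes MixedCase, joined with "::".
    def mixed_case(component):
        return "".join(w[:1].upper() + w[1:] for w in component.split("_"))
    return "::".join(mixed_case(c) for c in path.split("/"))


def viml_function_prefix(path):
    # VimL function names: "/" becomes "#".
    return path.replace("/", "#")


print(clojure_namespace("my_app/core_utils"))  # my-app.core-utils
print(ruby_constant("foo_bar/baz"))            # FooBar::Baz
print(viml_function_prefix("foo/bar"))         # foo#bar
```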

tommcdo commented 10 years ago

I think there should be support for applying an arbitrary number of transformations to a placeholder, and this may even eliminate the need for %u and %d (or they can exist as synonyms).

A categorized summary of transformations:

Case-transformations: upper, lower, mixed, camel
Delimiter substitutions: ., _, ::, \, -, #, delimiter removal
Path components: head (leading directories) and tail (filename)
Inflection: singular, plural

Maybe a pipeline style syntax would be good, after giving each transformation a name: %{s|upper|colons|plural|head}, given foo/bar/baz would yield FOO::BARS.

EDIT: I didn't mean to make an implicit requirement in there. I put plural before head and the result still came out pluralized, but supporting that would over-complicate the code. Linearly applying transformations in the order they're listed would be totally sufficient.

EDIT 2: I think %{upper|colons|plural|head}s looks nicer.
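A minimal Python sketch of the linear pipeline, assuming each name maps to a one-argument function and that `plural` is a naive "append s" stand-in for real inflection. The registry and function names are hypothetical.

```python
def head(value):
    # Leading directories: everything before the last slash.
    return value.rpartition("/")[0]


def plural(value):
    # Placeholder inflection for the sketch only: append "s".
    return value + "s"


TRANSFORMATIONS = {
    "head": head,
    "plural": plural,
    "upper": str.upper,
    "colons": lambda value: value.replace("/", "::"),
}


def expand(spec, value):
    # Apply each named transformation strictly in the order listed.
    for name in spec.split("|"):
        value = TRANSFORMATIONS[name](value)
    return value


print(expand("head|plural|upper|colons", "foo/bar/baz"))  # FOO::BARS
```

With strictly linear application the order matters: `upper|colons|plural|head` on the same input yields an empty string in this sketch, because no `/` is left by the time `head` runs.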

tommcdo commented 10 years ago

Just thought of a potential advantage to this arbitrary pipeline style. Functions for placeholder transformations can be defined in some namespace, e.g. projectile#placeholder#upper, and these functions can be directly called by name without the need for configuration. Then users can easily add other transformation functions and they'll just work naturally. The common ones that we've come up with so far can ship with the core plugin. This could even lead to trimming out some of the ones that don't seem too vital, leaving them to users to define if needed.

What do you think?

qstrahl commented 10 years ago

That looks like a very flexible and helpful solution, and I can't spot any immediate flaws with it. :+1:

tpope commented 10 years ago

I want to allow for a bit more data to pour in before jumping to the implementation phase. The latest suggestion is still a bit hypothetical-obsessed (uppercase/lowercase/camelcase) rather than practical (Cucumber feature case).

I like the curly brace notation. One weakness is that it doesn't allow for parameters, so it doesn't allow for arbitrary transformations like _ to - or / to a character we haven't thought of.

tommcdo commented 10 years ago

I considered that weakness. If it's decided to omit the ability to include parameters, it will almost surely come up as a feature request.

It could be solved by just defining another function. I'd imagine the complexity of supporting parameters outweighs the minor inconvenience of having to write another function (and it probably won't be needed too often).

sigmavirus24 commented 10 years ago

Python support could be performed loosely like Java support. Fortunately for projectile, Java has the convention of one class per file, while Python does not. That alone may make it more of a hassle for you since people will probably perpetually file bug reports complaining.

tpope commented 10 years ago

Had a bit of an aha moment today when I realized we could mitigate the need for parameters by allowing transformations to be composed of two functions. Something like {from-underscore|to-hyphen}. Behind the scenes the from function could swap out the target for some binary character, and the to function could target that binary character. And maybe we could surround the whole thing with an implicit from-slash and to-slash to shorten the most common case.
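The from/to idea can be sketched in Python as follows, assuming a NUL byte as the intermediate "binary character". The table names and the `to-dot` entry are hypothetical; only `from-underscore`, `to-hyphen` and `from-slash` come from the discussion.

```python
SENTINEL = "\x00"  # assumed never to occur in a file path

FROM = {"from-underscore": "_", "from-slash": "/"}
TO = {"to-hyphen": "-", "to-dot": "."}


def apply_pair(from_name, to_name, value):
    # The "from" half marks its character with the sentinel ...
    marked = value.replace(FROM[from_name], SENTINEL)
    # ... and the "to" half rewrites the sentinel to its own character.
    return marked.replace(SENTINEL, TO[to_name])


print(apply_pair("from-underscore", "to-hyphen", "foo_bar/baz_qux"))
# foo-bar/baz-qux
```

Two "from" entries and two "to" entries give four usable combinations from four definitions, which is where the saving over naming every combination comes from.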

tpope commented 10 years ago

And thinking out loud here, we could probably just implement all possibilities as pairs of from/tos, although a few get a bit contrived. Here are a few examples using a from:to,from2:to2 style notation (identical to 'matchpairs'):

Clojure namespaces: {/:.,_:-}
Ruby classes: {/:colons,_:capitalize,start:capitalize}
PHP: {/:head,/:\} and {/:tail,_:capitalize,start:capitalize}

That PHP example has a bit of repetition, which we could get more clever about:

{/:head:\} and {/:tail,words:capitalize}

I think this is probably the direction to go in. Thoughts?
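As a rough illustration, the literal-character subset of this notation could be parsed like so in Python. This sketch handles only single-character pairs such as the Clojure example; named targets like `colons`, `capitalize` or `head` are out of scope, and the function names are hypothetical.

```python
def parse_pairs(spec):
    # "/:.,_:-" -> [("/", "."), ("_", "-")]
    pairs = []
    for item in spec.split(","):
        source, _, target = item.partition(":")
        pairs.append((source, target))
    return pairs


def apply_pairs(spec, value):
    # Apply each from:to replacement in the order written.
    for source, target in parse_pairs(spec):
        value = value.replace(source, target)
    return value


print(apply_pairs("/:.,_:-", "my_app/core_utils"))  # my-app.core-utils
```

Because `,` and `:` are the delimiters here, the sketch cannot express a replacement to or from either of those characters.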

tommcdo commented 10 years ago

It's getting pretty hard to read. Might just be me, but my brain is starting to see pairs of comma-delimited values separated by colons.

Also, specifying the from seems kinda redundant, as we should know the initial form ahead of time. Plus, we basically still have zero-parameter transformations capable of exactly one thing.

Maybe I'm not seeing the big picture. I'll keep thinking about it for a while.

tpope commented 10 years ago

I don't see what's redundant about the from. {from-slash|to-hyphen} and {from-underscore|to-hyphen} are both valid. You could just make two different transformations, but breaking them apart changes the worst case scenario from n * m to n + m.

I think the real pattern I'm latching onto is that pretty much all of the transformations can be thought of as some combination of "split", "transform", and "join". Character replacement is just split on one character and join on another. Head and tail split and join on slashes and transform by dropping some components. Camelization splits on _ (or maybe some other character), transforms by capitalizing each component, and joins on an empty string. The only real exception is inflection, but you can also think about that as having a split that just returns one element.
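A small Python sketch of the split/transform/join framing, with hypothetical names throughout and a naive "append s" standing in for inflection:

```python
def make_transformation(split, transform, join):
    # Compose the three stages into one string -> string function.
    def run(value):
        return join(transform(split(value)))
    return run


def on_slash(value):
    return value.split("/")


def unchanged(parts):
    return parts


# Character replacement: split on one character, join on another.
slash_to_dot = make_transformation(on_slash, unchanged, ".".join)

# Head and tail: split and join on slashes, drop some components.
head = make_transformation(on_slash, lambda parts: parts[:-1], "/".join)
tail = make_transformation(on_slash, lambda parts: parts[-1:], "/".join)

# Camelization: split on "_", capitalize each part, join on "".
camelize = make_transformation(
    lambda value: value.split("_"),
    lambda parts: [p[:1].upper() + p[1:] for p in parts],
    "".join,
)

# Inflection: a "split" that returns a single element.
pluralize = make_transformation(
    lambda value: [value],
    lambda parts: [p + "s" for p in parts],
    "".join,
)

print(head("foo/bar/baz"), tail("foo/bar/baz"))  # foo/bar baz
print(camelize("get_foo_indent"))                # GetFooIndent
```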

My thinking is continuing to evolve, but I do think having a way to reuse the slash modifications on other characters is going to be a piece of the final product.

tommcdo commented 10 years ago

I think I'm starting to agree. My original assumption was that the first input was always going to be a file path component, so the only delimiter to worry about was /. But even under the assumption that we always start with a filename, there are other from- characters to worry about -- for instance, since a lot of JavaScript code is done without a predefined filename-to-identifier mapping, one might decide to make up his own conventions: foo/bar-baz/qux might map to Foo.BarBaz.Qux. (That example might be a little contrived but it's not a far stretch.)

If I can offer a suggestion: to improve readability, maybe we can avoid using delimiter characters as transformation names, and prefer words. So slash instead of /. This will also ease the pain of transforming to or from , and :, which could get quite hard to read (for humans and computers alike).

tpope commented 10 years ago

I'm actually circling back to the earlier proposal, because my whole "split/transform/join" vision has some unanswered questions, and I don't see any reason why we can't gracefully transition to it later if/when those questions are answered.

The new system is documented in the help file, and I plan to axe the old system before 1.0. I think I've covered all the real world transformations except inflections (which best I can tell is a Rails only thing) and _ to space (because I haven't decided what to name it, since space would conventionally be / to space). You'll have to build your own mixed case transformation out of {camelcase|capitalize}.

tommcdo commented 10 years ago

:+1:

tommcdo commented 10 years ago

Just thought of something. {} is a feasible occurrence in code, e.g.

class Foo {}

Should there be a way to prevent this from being expanded?

tpope commented 10 years ago

{open}{close}
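To illustrate why this works, here is a Python sketch of template expansion, assuming `{}` expands to the match, `{open}` and `{close}` expand to literal braces, and anything else is left alone. The function is hypothetical and ignores real transformations.

```python
import re


def expand_template(template, match):
    def replace(found):
        name = found.group(1)
        if name == "open":
            return "{"
        if name == "close":
            return "}"
        if name == "":
            return match
        # Unknown placeholders are left untouched in this sketch.
        return found.group(0)

    # Each placeholder is replaced once; the output is not rescanned.
    return re.sub(r"\{([^{}]*)\}", replace, template)


print(expand_template("class {} {open}{close}", "Foo"))  # class Foo {}
print(expand_template("class Foo {}", "Bar"))            # class Foo Bar
```

The second call shows the problem being asked about: a bare `{}` in the template body is treated as a placeholder.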

tommcdo commented 10 years ago

Guess I should have read the docs.

tpope commented 10 years ago

Added a couple more transformations, and calling this goose cooked. I'd still like to find a way to do arbitrary punctuation. Naming everything doesn't really scale well. On the plus side I still have yet to want to act on anything other than a / or a _, so maybe only the replacement has to be arbitrary.

tommcdo commented 10 years ago

I just ran into another transformation that might be generally useful. Prepend a leading / only when a:input is not empty. This would imply that something like dirname was called first. Here's the use case:

:Econtroller Foo should generate this:

<?php
namespace Controllers;
class Foo extends Controller {
}

:Econtroller Foo/Bar should generate this:

<?php
namespace Controllers\Foo;
class Bar extends Controller {
}

Since Controllers is not part of the wildcard match, it has to be inserted directly into the template, but the template needs to decide whether or not to include a \ for sub-namespaces.

My .projections.json file looks kinda like this:

{
    "app/src/Controllers/*.php": {
        "command": "controller",
        "template": [
            "<?php",
            "namespace Controllers{dirname|leading|backslash};",
            "class {basename} extends Controller {open}",
            "{close}"
        ]
    }
}

I've defined the leading transformation as follows:

function! g:projectile_transformations.leading(input, o)
    return substitute(a:input, '^\ze.\+$', '/', '')
endfunction

Do you think this would be useful for more cases?
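For readers who don't speak VimL regex, the same behaviour in a Python sketch. It assumes `dirname` returns an empty string when there is no directory part; the helper names mirror the template above but are otherwise hypothetical.

```python
def dirname(value):
    # Everything before the last slash, or "" when there is none.
    return value.rpartition("/")[0]


def leading(value):
    # Prepend "/" only when the input is not empty.
    return "/" + value if value else value


def backslash(value):
    return value.replace("/", "\\")


def namespace_line(match):
    return "namespace Controllers" + backslash(leading(dirname(match))) + ";"


print(namespace_line("Foo"))      # namespace Controllers;
print(namespace_line("Foo/Bar"))  # namespace Controllers\Foo;
```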

tpope commented 10 years ago

Can you work around with a separate "app/src/Controllers*.php" projection?

tommcdo commented 10 years ago

I considered it, but it's really quite undesirable to include the leading / in the name. Then I'd also have to think about whether I should invoke it using the leading / or not, depending on whether there's a subdirectory.

Anyway, my g:projectile_transformations.leading extension works just fine for me, but I was wondering if it's general enough to make it into the core plugin. Admittedly it's kind of a weird one; it's only more useful than hard-coding the character directly in your template when there's a chance that the expansion is empty (meaning, as far as I can tell, you must have used dirname first).

tpope commented 10 years ago

I mean a separate projection with just the template.

tommcdo commented 10 years ago

Oh, that separate. I suppose it wouldn't be too bad to use the leading / when I know I'm creating one, but I'd still have to choose to omit it when creating a new class with no subdirectory. Unless I'm still missing something.

tpope commented 10 years ago

This objection is just about a bit of weirdness in the projection definitions, and not any usability weirdness, right? I think "leading" is kind of weird too, so I consider that a draw.

tommcdo commented 10 years ago

It is a usability weirdness, unless, again, I'm still missing something. Basically, in some cases the template needs to put a / and in some cases it doesn't. The former is determined by presence of a subdirectory, and the latter by absence. If I define one projection as "app/src/Controllers*.php", then it will take care of the cases where a / is needed in the template, but it can't be used for the cases where it's not needed. Likewise, the "app/src/Controllers/*.php" projection is good for when a / is not needed, but no good for when it's needed.

I'm imagining that the thing I'm missing is that projectile.vim will match one projection and use a template from another -- is this the case? Otherwise I'm just not seeing it.

tpope commented 10 years ago

By usability I mean things like navigation commands work the same, or that you literally can't create a configuration that supports all your requirements. {dirname} on "/foo" should return "" in our implementation, so as I understand it, an "app/src/Controllers*.php" projection should cover all controller cases, even if that is a bit inconsistent with "app/src/*.php".
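A Python sketch of why the slash-less pattern covers both cases, assuming the single `*` may match across slashes and that `dirname` of "/Foo" is "". The helpers are hypothetical stand-ins, not the plugin's code.

```python
def star_match(pattern, path):
    # Return the text matched by the single "*" in pattern, or None.
    prefix, _, suffix = pattern.partition("*")
    if (len(path) >= len(prefix) + len(suffix)
            and path.startswith(prefix) and path.endswith(suffix)):
        return path[len(prefix):len(path) - len(suffix)]
    return None


def dirname(value):
    return value.rpartition("/")[0]


def basename(value):
    return value.rpartition("/")[2]


def render(path):
    # The match keeps its leading slash, so no "leading" step is needed.
    match = star_match("app/src/Controllers*.php", path)
    return [
        "namespace Controllers" + dirname(match).replace("/", "\\") + ";",
        "class " + basename(match) + " extends Controller {",
    ]


print(render("app/src/Controllers/Foo.php")[0])      # namespace Controllers;
print(render("app/src/Controllers/Foo/Bar.php")[0])  # namespace Controllers\Foo;
```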

"Leading" feels awkward, and I have a hunch subsequent generalization (assuming that actually happens) will shine a light on a better solution. If the only cost of kicking this down the road is you have to think a bit harder about when to use a slash when you first create your templates, I'm happy to pay it.

tommcdo commented 10 years ago

Ah, there's the thing I was missing. I'm satisfied for now. Thanks!

athaeryn commented 10 years ago

Could {camelcase} remove hyphens along with underscores?

tpope commented 10 years ago

What's the use case?

athaeryn commented 10 years ago

The convention for filenames on a project I'm working on is something like views/edit-step.coffee. If I want to make a new view with :Eview edit-your-mom, I'd like the template to produce something like:

App.Views.EditYourMom = Backbone.View.extend
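The requested behaviour, sketched in Python under the assumption that camelcase only uppercases the character following each separator, so a separate capitalize step produces the leading capital (as in the {camelcase|capitalize} recipe mentioned earlier in the thread). Both functions are hypothetical stand-ins.

```python
import re


def camelcase(value):
    # Drop each "_" or "-" and uppercase the character after it.
    return re.sub(r"[_-](.)", lambda m: m.group(1).upper(), value)


def capitalize(value):
    # Uppercase only the first character.
    return value[:1].upper() + value[1:]


print(camelcase("edit-your-mom"))              # editYourMom
print(capitalize(camelcase("edit-your-mom")))  # EditYourMom
```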

tpope commented 10 years ago

Is it part of a framework or home grown? Adding either way but the latter carries significantly less weight if this conflicts with someone else's usage.

athaeryn commented 10 years ago

It's not a framework, but it's the convention on this project. Is there a way to define my own transformations? I don't see anything about it in the docs.

tommcdo commented 10 years ago

You can define your own like this:

if !exists('g:projectile_transformations')
    let g:projectile_transformations = {}
endif

function! g:projectile_transformations.foo(input, o)
    return some_manipulation_of(a:input)
endfunction

This makes foo available as a transformation function.

tpope commented 10 years ago

Unofficial, fwiw. In particular the idea of sharing a .projections.json that depends on custom transformations strikes me as broken.

The camelcase change has been added, as was promised.

athaeryn commented 10 years ago

Right, but if I want another developer to be able to use it, it's not as simple as having the .projections.json file in the project anymore. It would be nice if you could do:

{
  "transformations": [
    "function! g:projectile_transformations.foo(input, o)",
    "  return some_manipulation_of(a:input)",
    "endfunction"
  ]
}

Or maybe something simpler even?

{
  "transformations": {
    "foo": "some_manipulation_of(a:input)"
  }
}

tpope commented 10 years ago

Arbitrary code evaluation aside, I don't want to tie the format to VimL.

athaeryn commented 10 years ago

Ah, good point.

tpope / vim-projectionist

Support for case transformations in placeholders #3